Thread: Converting C++ string to C style string

  1. #16
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    A few years ago, I profiled the various methods of tokenizing a string and found that strtok was generally about 5-10% faster than the method that laserlight and I are suggesting, but at the expense of not being thread safe and of being less safe overall. In short, if you're going to learn C++, learn the C++ way of doing things. You'll find that the result is much more readable than a mix of C and C++.
    What can this strange device be?
    When I touch it, it gives forth a sound
    It's got wires that vibrate and give music
    What can this thing be that I found?

  2. #17
    Registered User Vespasian's Avatar
    Join Date
    Aug 2011
    Posts
    181
    Quote Originally Posted by Elkvis View Post
    A few years ago, I profiled the various methods of tokenizing a string and found that strtok was generally about 5-10% faster than the method that laserlight and I are suggesting, but at the expense of not being thread safe and of being less safe overall. In short, if you're going to learn C++, learn the C++ way of doing things. You'll find that the result is much more readable than a mix of C and C++.
    I agree with this. However, I come from a C background, so I am moving over to C++ only gradually, and my code tends to take a hybrid approach for now. I am aiming for more C++-oriented code in the future.

  3. #18
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    You know, drawing on Elkvis' example in post #7, you could do something like this: Look ma, no need for manual memory management!
    O_o

    That is fine if you only need a single delimiter; otherwise, a `std::getline' loop is not a replacement for `strtok'.
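
    To make the single-delimiter point concrete, here is a minimal sketch of a `std::getline' loop. It is not the code from post #7 (that isn't reproduced here); `split_on_char' is a made-up name for illustration only.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): split `text` on exactly one
// delimiter character using std::getline.
std::vector<std::string> split_on_char(const std::string & text, char delimiter)
{
    std::vector<std::string> tokens;
    std::istringstream stream(text);
    std::string token;
    // Each call reads up to (and discards) the next delimiter character.
    while (std::getline(stream, token, delimiter))
    {
        tokens.push_back(token);
    }
    return tokens;
}
```

    Calling split_on_char("This is$only", ' ') leaves "is$only" in one piece, because only the space is treated as a delimiter. Also note that two delimiters in a row produce an empty token here, whereas `strtok' skips over runs of delimiters.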

    You can loop over `find_first_not_of' and `find_first_of' to get something more of a replacement for `strtok'.

    A few years ago, I profiled the various methods of tokenizing a string and found that strtok was generally about 5-10% faster than the method that laserlight and I are suggesting, but at the expense of not being thread safe and of being less safe overall.
    I'm curious whether your `strtok' code was actually equivalent. I can easily see a simple `strtok' loop being faster--even more than 10%, but I have a difficult time believing a reliably measurable difference remains once the `strtok' version also copies the provided string, creates an array of pointers, allocates the individual strings, and copies the tokenized segments into the allocated strings.

    Soma

    [Edit]
    Code:
    #include <iterator>
    #include <string>
    
    struct token_iterator:
        public std::iterator<std::forward_iterator_tag, std::string>
    {
        token_iterator():
            mString()
          , mDelimiters()
          , mOrigin(std::string::npos)
          , mTerminus(std::string::npos)
        {
        }
        token_iterator
        (
            const std::string & fString
          , const std::string & fDelimiters
        ):
            mString(fString)
          , mDelimiters(fDelimiters)
          , mOrigin(fString.find_first_not_of(fDelimiters))
          , mTerminus(fString.find_first_of(fDelimiters, mOrigin))
        {
        }
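        // Return the current token as a copy; the iterator itself is unchanged.
        std::string operator * () const
        {
            return(mString.substr(mOrigin, mTerminus - mOrigin));
        }
        // Skip any run of delimiters, then find the end of the next token.
        // When no token is left, mOrigin becomes npos, matching the end iterator.
        token_iterator & operator ++ ()
        {
            mOrigin = mString.find_first_not_of(mDelimiters, mTerminus);
            mTerminus = mString.find_first_of(mDelimiters, mOrigin);
            return(*this);
        }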
        std::string mString;
        std::string mDelimiters;
        std::string::size_type mOrigin;
        std::string::size_type mTerminus;
    };
    
    bool operator !=
    (
        const token_iterator & fLHS
      , const token_iterator & fRHS
    )
    {
        return(fLHS.mOrigin != fRHS.mOrigin);
    }
    
    #include <algorithm> // std::copy
    #include <iostream>
    #include <vector>
    
    int main()
    {
        using namespace std;
        vector<string> s(token_iterator("This is$only!a@test.", " $!@."), token_iterator());
        copy(s.begin(), s.end(), ostream_iterator<string>(cout, "\n"));
    }
    [/Edit]
    “Salem Was Wrong!” -- Pedant Necromancer
    “Four isn't random!” -- Gibbering Mouther

  4. #19
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    Quote Originally Posted by phantomotap View Post
    I'm curious whether your `strtok' code was actually equivalent. I can easily see a simple `strtok' loop being faster--even more than 10%, but I have a difficult time believing a reliably measurable difference remains once the `strtok' version also copies the provided string, creates an array of pointers, allocates the individual strings, and copies the tokenized segments into the allocated strings.
    The tests were done on code that was equivalent relative to my specific needs at the time. Both examples used a single delimiter and added the resulting strings to a std::vector<std::string>. I can't find the code anymore, but if I recall correctly, I was also using const_cast on the return value of std::string::c_str().
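
    For anyone who still wants `strtok' on a std::string: writing through a const_cast of c_str() is undefined behavior, since the program is not allowed to modify that character array. A rough sketch of a safer variant follows. It is not the lost benchmark code, and `split_with_strtok' is a made-up name. It tokenizes a private, writable copy instead of the string's own buffer.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not the original benchmark code): tokenize with
// std::strtok on a writable copy, so the std::string itself is never modified.
std::vector<std::string> split_with_strtok(const std::string & text, const char * delimiters)
{
    // c_str() is null-terminated, so copy size() + 1 characters to keep the '\0'.
    std::vector<char> buffer(text.c_str(), text.c_str() + text.size() + 1);
    std::vector<std::string> tokens;
    // strtok overwrites delimiters in `buffer` with '\0' and keeps hidden
    // static state between calls, which is why it is not thread safe.
    for (char * token = std::strtok(&buffer[0], delimiters);
         token != NULL;
         token = std::strtok(NULL, delimiters))
    {
        tokens.push_back(token); // copies the token into a new std::string
    }
    return tokens;
}
```

    The extra copy is part of the cost of doing this safely, so it belongs in any timing comparison against the pure C++ versions.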
    What can this strange device be?
    When I touch it, it gives forth a sound
    It's got wires that vibrate and give music
    What can this thing be that I found?
