Thread: Strings in C++

  1. #1
    Registered User
    Join Date
    Jan 2003
    Posts
    42

    Strings in C++

    I come from a Java background. It's sort of my "native language", since I learned it first in college. In Java, strings are beautiful, flexible, graceful things that are ever so easy to manipulate.

    I really hate worrying about having an array of characters. I don't like working with them at all. I much prefer the thought of just having "a string". So I immediately hopped into the C++ string.

    ...and was disappointed. Are these really all the string functions there are? (I know that there are more functions that work with character arrays, but I'd really prefer to use simply "string".) A lot of the character array functions didn't even carry over, such as converting to lower case and getting tokens. Sure, one could convert to lower case in two simple lines (or one, if you want it ugly), but it gets annoying to write my own tokenizer function.

    Is this really all there is, or is the list incomplete?
    http://www.cppreference.com/cppstring.html

    Just wondering.

  2. #2
    You could look into the string header file, although I do not know anything about it beyond its being able to create a string variable.

  3. #3
    Registered User
    Join Date
    Sep 2002
    Posts
    272
    This is where the C++ multi-paradigm approach can get a bit messy. Functions such as toupper and tolower are said to belong not to a string object but to a locale (the C versions live in cctype, and locale has a set of template functions for handling characters in different char sets). Also, because the string class is part of the STL, there's a set of generic algorithms for applying a function to each element of an object. Using tolower would look something like this:

    Code:
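    #include <iostream>
    #include <string>
    #include <algorithm>
    #include <cctype>

    using namespace std;

    // Wrapper so transform gets one unambiguous function (a bare tolower can
    // clash with the locale overload). The unsigned char cast keeps negative
    // char values out of tolower, where its behaviour is undefined.
    char lowerChar(char c)
    {
        return static_cast<char>(tolower(static_cast<unsigned char>(c)));
    }

    int main()
    {
        string name = "JOE";
        transform(name.begin(), name.end(), name.begin(), lowerChar);
        cout << name;

        return 0;
    }
    joe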
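
    For the "part of a locale" side of this, here is a rough sketch of the two-argument std::tolower template from the locale header. The helper name to_lower_locale is made up for illustration, and defaulting to std::locale() (the current global locale) is just an assumption about what a caller would want.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: return a lower-cased copy of `text`, converting each
// character according to the supplied locale.
std::string to_lower_locale(std::string text, const std::locale &loc = std::locale()) {
    for (std::string::size_type i = 0; i < text.size(); ++i)
        text[i] = std::tolower(text[i], loc);  // two-argument overload from <locale>
    return text;
}
```

    Called as to_lower_locale(name), this gives the same result as the transform version for plain ASCII text in the default "C" locale.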

  4. #4
    Registered User
    Join Date
    Jan 2003
    Posts
    42
    Yikes. That can sure look confusing. What I've done is written a bunch of "utility" functions that I can call without having to think about it, because they work in the background. I'll stick them in a header so I can use them in other programs I write.

    Examples
    Code:
    //initializes the tokenizer
    string tokenize(string str1, const char *str2)
    {
        string answer = strtok((char*)str1.c_str(), str2);
        return answer;
    }
    
    //finds the next token in the tokenizer
    string tokenize(const char *str2)
    {
        string answer = strtok(NULL, str2);
        return answer;
    }
    
    //returns the number of tokens in a string
    int tokensIn(string str)
    {
        int count = 1;
        if(str.empty())
            return 0;
        for(int i = 0; i < str.size(); i++)
            if(str.at(i) == ' ')
                count++;
        return count;
    }
    
    //converts string to lower case
    void stringToLower(string &str)
    {
        for(int i = 0; i < str.size(); i++)
            str.at(i) = tolower(str.at(i));
    }

  5. #5
    Registered User
    Join Date
    Jan 2003
    Posts
    311
    Don't cast away const, particularly a const you did not write yourself. Your tokenizer can break in a very subtle manner: str1 is passed by value, so its destructor is called when tokenize returns, while strtok's static char* still points into its buffer. If I may suggest:

    Code:
    std::string extract_token(std::string &str, const std::string &sep=" \t") {
        typedef std::string::size_type pos_t;
        pos_t start = str.find_first_not_of(sep);
        if(start == std::string::npos) {
            str.clear();   
            return "";
        }
        pos_t end = str.find_first_of(sep,start); // returns npos on failure
        std::string token = str.substr(start,end-start); // if end==npos end-start is huge
        str.erase(0,end);
        return token; 
    }
    This takes a string and a string of separator characters, returns the first token, and strips that token off the string you passed in. This is a lot more useful than strtok(), as you can mix extract_token() calls on different strings independently, rather than having to be sure you have called strtok(NULL) as often as you are ever going to need to.
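
    To make the "mix calls on different strings" point concrete, here is a small sketch. extract_token is restated from the code above only so that the snippet compiles on its own; interleave_tokens is a hypothetical demo function, not something from the standard library.

```cpp
#include <string>
#include <vector>

// Restated from the post above so this snippet is self-contained.
std::string extract_token(std::string &str, const std::string &sep = " \t") {
    typedef std::string::size_type pos_t;
    pos_t start = str.find_first_not_of(sep);
    if (start == std::string::npos) {
        str.clear();
        return "";
    }
    pos_t end = str.find_first_of(sep, start);
    std::string token = str.substr(start, end - start);
    str.erase(0, end);
    return token;
}

// Hypothetical demo: pull tokens alternately from two unrelated strings.
// Each string carries its own parsing state, so the calls do not interfere.
std::vector<std::string> interleave_tokens(std::string a, std::string b) {
    std::vector<std::string> out;
    while (true) {
        std::string ta = extract_token(a);
        std::string tb = extract_token(b);
        if (ta.empty() && tb.empty())
            break;  // both inputs are exhausted
        if (!ta.empty()) out.push_back(ta);
        if (!tb.empty()) out.push_back(tb);
    }
    return out;
}
```

    With strtok() the same alternation would not work, because starting on the second string throws away the saved position in the first.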

    The other two handy parsing tricks to know about are stringstream, because many tokens are numbers, and the Boost libraries, in particular regex++, though they also have a fancy tokenizer.
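
    As a rough illustration of the stringstream trick, the sketch below splits a line on whitespace and keeps only the tokens that convert cleanly to int. The name parse_ints and the "reject anything with trailing characters" rule are assumptions made for the example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect the whitespace-separated tokens of `line`
// that parse completely as an int, skipping everything else.
std::vector<int> parse_ints(const std::string &line) {
    std::vector<int> values;
    std::istringstream words(line);
    std::string word;
    while (words >> word) {          // operator>> splits on whitespace
        std::istringstream conv(word);
        int n;
        char leftover;
        // Keep the token only if an int was read and nothing follows it.
        if ((conv >> n) && !(conv >> leftover))
            values.push_back(n);
    }
    return values;
}
```

    The outer stream does the tokenizing and the inner one does the conversion, so no manual index arithmetic is needed.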

  6. #6
    Registered User
    Join Date
    Jan 2003
    Posts
    311
    Two more nits, 'cause I just can't shut up. First, tokensIn does not count tokens, it counts spaces. Second, while I applaud paranoid programming, str.at(i) performs a pointless bounds check here. str[i] is mildly faster, and any time you use at() you really should also have a try/catch block in place. Although obviously it's much better to use .at() too often than not enough.
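
    For what it's worth, here is one way both nits could be addressed. It is only a sketch: count_tokens and char_or are hypothetical names, and treating space and tab as the separators is an assumption carried over from extract_token.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical replacement for tokensIn: counts runs of non-separator
// characters, so repeated, leading, or trailing separators add nothing.
std::size_t count_tokens(const std::string &str, const std::string &sep = " \t") {
    std::size_t count = 0;
    std::string::size_type pos = str.find_first_not_of(sep);
    while (pos != std::string::npos) {
        ++count;
        pos = str.find_first_of(sep, pos);      // end of the current token
        if (pos == std::string::npos)
            break;
        pos = str.find_first_not_of(sep, pos);  // start of the next token
    }
    return count;
}

// at() throws std::out_of_range on a bad index, so pair it with a handler;
// operator[] performs no such check.
char char_or(const std::string &str, std::string::size_type i, char fallback) {
    try {
        return str.at(i);
    } catch (const std::out_of_range &) {
        return fallback;
    }
}
```

    Inside a loop already bounded by str.size(), plain str[i] is enough; at() earns its keep when the index comes from somewhere you don't control.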

  7. #7
    Registered User
    Join Date
    Jan 2003
    Posts
    42
    Yeah, I noticed the problem with the token counter a while ago. When I slapped it together, I overlooked that I needed a bit more code. I'll have to make sure I rewrite those helpful little functions before they cause trouble.

    And thanks for pointing out that problem with my tokenizer. I hadn't noticed that one.

