Thread: String/Tokenizer problem.

  1. #1
    the Wizard
    Join Date
    Aug 2004
    Posts
    109

    String/Tokenizer problem.

    Hi,

    I'm trying to write a program that, to start with, splits a string into smaller parts wherever it contains spaces. My only problem is that I can make it split a string declared at the beginning of the program, but not a user-defined one. And afterwards, how do I get the parts into some kind of array, so that I can work with them?

    Here's my work so far:

    Code:
    //includes
    #include <iostream>
    #include <string>
    #include <algorithm>
    #include <iterator>
    #include <vector>
     
    //namespace
    using namespace std;
     
    //- Function name: Tokenize 
    //--------------------------
     
    void Tokenize(const string& str, vector<string>& tokens, const string& delimiters = " ") 
    {
        //Skip delimiters at beginning.
        string::size_type lastPos = str.find_first_not_of(delimiters, 0);
        //Find the first delimiter after that (the end of the first token).
        string::size_type pos = str.find_first_of(delimiters, lastPos);
        //Cut the string.
        while(string::npos != pos || string::npos != lastPos)
        {
            //Found a token, add it to the vector.
            tokens.push_back(str.substr(lastPos, pos-lastPos));
            //Skip delimiters. Note the "not_of".
            lastPos = str.find_first_not_of(delimiters, pos);
            //Find the next delimiter (the end of the next token).
            pos = str.find_first_of(delimiters, lastPos);
        }
    }
     
    int main() 
    {
        vector<string> tokens;
     
        string str("+x^2 +y^2 -4x +9y");
     
        Tokenize(str, tokens);
     
        //ostream_iterator needs <iterator>.
        copy(tokens.begin(), tokens.end(), ostream_iterator<string>(cout, "\n"));
    }
    -//Marc Poulsen -//MipZhaP

    He sat down, he programmed, he got an error...

  2. #2
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    1) Ask user for string
    2) Use getline to read string (and spaces)

    Code:
    int main() 
    {
        vector<string> tokens;
     
        string str;
    
        cout << "Enter your string: ";
        getline(cin,str);
     
        Tokenize(str, tokens);
     
        copy(tokens.begin(), tokens.end(), ostream_iterator<string>(cout, "\n"));
    }
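
    The two steps above can be sketched as one reusable function. This is only a minimal sketch: the helper name readTokens is hypothetical (it does not appear in the thread), and it takes the input stream as a parameter so that it works with cin as well as with any other istream. Tokenize is repeated from post #1 so the example stands on its own.

    ```cpp
    #include <istream>
    #include <string>
    #include <vector>

    // Same splitting logic as Tokenize() in post #1.
    void Tokenize(const std::string& str, std::vector<std::string>& tokens,
                  const std::string& delimiters = " ")
    {
        std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
        std::string::size_type pos = str.find_first_of(delimiters, lastPos);
        while (std::string::npos != pos || std::string::npos != lastPos)
        {
            tokens.push_back(str.substr(lastPos, pos - lastPos));
            lastPos = str.find_first_not_of(delimiters, pos);
            pos = str.find_first_of(delimiters, lastPos);
        }
    }

    // Hypothetical helper: read one whole line (spaces included) and split it.
    // Returns an empty vector if no line could be read.
    std::vector<std::string> readTokens(std::istream& in)
    {
        std::vector<std::string> tokens;
        std::string line;
        if (std::getline(in, line))
            Tokenize(line, tokens);
        return tokens;
    }
    ```

    In main you would call readTokens(cin) after printing the prompt. Using operator>> instead of getline would stop at the first space, which is why getline is the better fit here.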
    "Owners of dogs will have noticed that, if you provide them with food and water and shelter and affection, they will think you are god. Whereas owners of cats are compelled to realize that, if you provide them with food and water and shelter and affection, they draw the conclusion that they are gods."
    -Christopher Hitchens

  3. #3
    the Wizard
    Join Date
    Aug 2004
    Posts
    109
    Thanks. I've tried something similar, but I'll try it this way and see if it works.

  4. #4
    Not just a squid...
    Join Date
    Sep 2004
    Posts
    25
    I don't have any experience with vectors yet, so maybe this isn't an option, but since you are already including the string library, why can't you just use the strtok function to split up your string?

  5. #5
    the Wizard
    Join Date
    Aug 2004
    Posts
    109
    TheSquid: strtok() is only used in C. Not C++, as far as I remember. And therefore you have to make your own in C++.

  6. #6
    the Wizard
    Join Date
    Aug 2004
    Posts
    109
    But what if I want to put the split parts into an array? Can somebody give me a hint on that?

  7. #7
    Registered User
    Join Date
    Mar 2002
    Posts
    1,595
    >what if I want to put the split parts into an array? Can somebody give me a hint on that?

    Try this. Change

    vector<string> tokens;

    to

    const int MAX = 100;
    string tokens[MAX];

    and change

    void Tokenize(const string& str, vector<string>& tokens, const string& delimiters = " ")

    to

    void Tokenize(const string& str, string tokens[], const string& delimiters = " ")
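
    Changing the declarations alone is not enough, because a plain array has no push_back, so the function body has to store each token by index. The following is only a sketch under some assumptions that are not in the thread: the function is renamed TokenizeToArray, it takes an extra maxTokens parameter, it returns the number of tokens stored, and any tokens beyond the capacity are silently dropped.

    ```cpp
    #include <cstddef>
    #include <string>

    const std::size_t MAX = 100;  // array capacity, as in the suggestion above

    // Array variant of Tokenize(). Hypothetical signature: the capacity is
    // passed in and the number of tokens actually stored is returned.
    std::size_t TokenizeToArray(const std::string& str, std::string tokens[],
                                std::size_t maxTokens,
                                const std::string& delimiters = " ")
    {
        std::size_t count = 0;
        std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
        std::string::size_type pos = str.find_first_of(delimiters, lastPos);
        while (count < maxTokens &&
               (std::string::npos != pos || std::string::npos != lastPos))
        {
            // Store by index instead of push_back.
            tokens[count++] = str.substr(lastPos, pos - lastPos);
            lastPos = str.find_first_not_of(delimiters, pos);
            pos = str.find_first_of(delimiters, lastPos);
        }
        return count;
    }
    ```

    The caller declares string tokens[MAX]; and then loops from 0 up to the returned count. The vector version avoids the fixed limit, which is one reason to prefer it.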
    You're only born perfect.

  8. #8
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    I still recommend Boost's Tokenizer library.
    http://www.boost.org/
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  9. #9
    the Wizard
    Join Date
    Aug 2004
    Posts
    109
    What does that have to do with me wanting to put the splitted info into an array??
    As far as I can see, you get rid of the vector part for me. But how can that help me?

  10. #10
    UT2004 Addict Kleid-0's Avatar
    Join Date
    Dec 2004
    Posts
    656
    I don't see what's so wrong with strtok:
    Code:
    #include <iostream>
    #include <string>
    #include <algorithm>
    #include <vector>
    #include <string.h>
    
    using std::cout;
    using std::endl;
    using std::string;
    using std::vector;
    
    
    // Convert the C Style string into a C++ Style string
    string toString( const char *csz );
    
    // Tokenize a designated string with a certain delimiter
    vector<string> tokenize( string sz, const char *d );
    
    
    int main( )
    {
      // tdb = Token vector database
      // sz = The string that we want to rip apart
    
      vector<string> tdb;
      string sz( "+x^2 +y^2 -4x +9y" );
    
      // Tokenize the string based on spaces
    
      tdb = tokenize( sz, " " );
    
      // Output all of the tokens
    
      for( vector<string>::size_type i = 0; i < tdb.size( ); i++ )
        cout << tdb[i] << endl;
    
      // Exit successfully
    
      return 0;
    }
    
    
    // Convert the C Style string into a C++ Style string
    string toString(
        const char *csz )   // The C-Style string
    {
      // Make the C-String into a C++ string and return
    
      string sz( csz );
      return sz;
    }
    
    
    // Tokenize a designated string with a certain delimiter
    vector<string> tokenize(
        string sz,        // The string we want to tokenize
        const char *d )   // The delimiter(s) we use to tokenize
    {
      // cs = Writable, null-terminated copy of the string; strtok()
      //      modifies its input, and a vector has no fixed size limit
      // tdb = Token vector database
    
      vector<char> cs( sz.begin( ), sz.end( ) );
      cs.push_back( '\0' );
      vector<string> tdb;
    
      // Rip the tokens out of the string; t points at the current token.
      // The loop also copes with input that contains no tokens at all.
    
      for( char *t = strtok( &cs[0], d ); t != NULL; t = strtok( NULL, d ) )
        tdb.push_back( toString( t ) );
    
      // Return the token vector database
    
      return tdb;
    }

  11. #11
    the Wizard
    Join Date
    Aug 2004
    Posts
    109
    Well, it looks like it works, but when I try to make the string sz user-defined with this code:

    Code:
    cout << "Enter the equation: ";
    getline(cin,sz);
    I get these two errors:

    Code:
    d:\c++\circleequation\main.cpp(67) : error C2065: 'getline' : undeclared identifier
    d:\c++\circleequation\main.cpp(67) : error C2065: 'cin' : undeclared identifier
    What can be wrong? I've included what should be necessary.

  12. #12
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Are you referencing the std namespace?

  13. #13
    UT2004 Addict Kleid-0's Avatar
    Join Date
    Dec 2004
    Posts
    656
    Put this at the top, with all of the other std using-declarations:
    Code:
    using std::cin;
    Or you can do this, but I don't recommend importing the whole std namespace, because it can cause name clashes:
    Code:
    using namespace std;
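
    As a rough sketch of the first option, a using-declaration can also be placed inside a function, so the imported names stay local to it. The helper name readEquation and its stream parameters are hypothetical and only there for illustration. In a real program you would pass cin and cout.

    ```cpp
    #include <iostream>
    #include <string>

    // Hypothetical helper: prompt on 'out', then read a full line from 'in'.
    std::string readEquation(std::istream& in, std::ostream& out)
    {
        using std::getline;   // using-declaration: imports this one name only
        using std::string;    // visible only inside this function

        string sz;
        out << "Enter the equation: ";
        getline(in, sz);      // reads up to the newline, spaces included
        return sz;
    }
    ```

    Writing std::getline and std::cin in full everywhere works just as well and needs no using lines at all.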

  14. #14
    Carnivore ('-'v) Hunter2's Avatar
    Join Date
    May 2002
    Posts
    2,879
    >>strtok() is only used in C. Not C++
    It's included in C++, in <cstring> (not string.h!!!).

    >>What does that have to do with me wanting to put the splitted info into an array??
    I don't see what you mean. Each token goes into one element of the vector. If you replace the vector with an array, then each token goes into one element of the array. Therefore, the 'splitted info' has been put into an array. What's missing?
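
    To illustrate the point, here is a small sketch of working with the tokens once they are in the vector. A vector<string> can be indexed with [] exactly like an array, so no conversion is needed. The helper leadingSigns is hypothetical and only meant as an example of "working with" the tokens.

    ```cpp
    #include <string>
    #include <vector>

    // Hypothetical helper: collect the first character (the sign) of every token.
    std::string leadingSigns(const std::vector<std::string>& tokens)
    {
        std::string signs;
        for (std::vector<std::string>::size_type i = 0; i < tokens.size(); ++i)
        {
            const std::string& token = tokens[i];   // indexed like an array element
            if (!token.empty())
                signs += token[0];
        }
        return signs;
    }
    ```

    For the tokens produced from "+x^2 +y^2 -4x +9y" this yields the string "++-+".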
    Just Google It. √

    (\ /)
    ( . .)
    c(")(") This is bunny. Copy and paste bunny into your signature to help him gain world domination.

  15. #15
    UT2004 Addict Kleid-0's Avatar
    Join Date
    Dec 2004
    Posts
    656
    Quote Originally Posted by Hunter2
    >>strtok() is only used in C. Not C++
    It's included in C++, in <cstring> (not string.h!!!).
    I'm sorry, don't hurt me! lol

    Now everything's good:
    Code:
    #include <iostream>
    #include <string>
    #include <algorithm>
    #include <vector>
    #include <cstring>
    
    using std::cout;
    using std::endl;
    using std::string;
    using std::vector;
    using std::cin;
    using std::getline;
    using std::strtok;
    
    
    // Convert the C Style string into a C++ Style string
    string toString( const char *csz );
    
    // Tokenize a designated string with a certain delimiter
    vector<string> tokenize( string sz, const char *d );
    
    
    int main( )
    {
      // tdb = Token vector database
      // sz = The string that we want to rip apart
    
      vector<string> tdb;
      string sz;
    
      // Have the user enter an equation; getline reads the whole
      // line, whereas cin >> sz would stop at the first space
      cout << "Please enter a valid mathematical equation with spaces: ";
      getline( cin, sz );
    
      // Tokenize the string based on spaces
    
      tdb = tokenize( sz, " " );
    
      // Output all of the tokens
    
      for( vector<string>::size_type i = 0; i < tdb.size( ); i++ )
        cout << tdb[i] << endl;
    
      // Exit successfully
    
      return 0;
    }
    
    
    // Convert the C Style string into a C++ Style string
    string toString(
        const char *csz )   // The C-Style string
    {
      // Make the C-String into a C++ string and return
    
      string sz( csz );
      return sz;
    }
    
    
    // Tokenize a designated string with a certain delimiter
    vector<string> tokenize(
        string sz,        // The string we want to tokenize
        const char *d )   // The delimiter(s) we use to tokenize
    {
      // cs = Writable, null-terminated copy of the string; strtok()
      //      modifies its input, and a vector has no fixed size limit
      // tdb = Token vector database
    
      vector<char> cs( sz.begin( ), sz.end( ) );
      cs.push_back( '\0' );
      vector<string> tdb;
    
      // Rip the tokens out of the string; t points at the current token.
      // The loop also copes with input that contains no tokens at all.
    
      for( char *t = strtok( &cs[0], d ); t != NULL; t = strtok( NULL, d ) )
        tdb.push_back( toString( t ) );
    
      // Return the token vector database
    
      return tdb;
    }
