Thread: Tokenizing user input

  1. #1
    Registered User
    Join Date
    Sep 2008
    Posts
    33

    Tokenizing user input

    Here is what I am trying to do:

    The user inputs a set of objects (not in the programming sense) that could be, say, ints, strings, chars, etc.

    For this case, we will assume ints, so input would be entered as follows:
    (1,3,4,1,2,5,2,6)

    The program is written using templates, so it wouldn't require much to change the input type.

    What I can't figure out is how to strip the ( , , , , ) so that each number (without duplicates) can be inserted into my object, which holds a dynamic array.

    Hopefully that makes sense; if not, I will try to clarify.

  2. #2
    Registered User
    Join Date
    Oct 2008
    Posts
    55
    You could use strtok (look it up).
    It only works with C-style strings, though.
    Code:
    #include <cstring>   // strtok
    #include <iostream>
    using namespace std;
    
    int main()
    {
        char *substr;
        char str[] = "(1,3,4,1,2,5,2,6)";
        substr = strtok( str, "(),");
        while (substr)
        {
            cout << substr << endl;
            substr = strtok( 0, "(),");
        }
    }
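    If the input arrives as a std::string rather than a char array, it has to be copied into a writable buffer first, because strtok overwrites the delimiters it finds. A rough sketch of that, where the helper name tokenize_with_strtok is made up for illustration and not part of any library:

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): splits a std::string with strtok.
std::vector<std::string> tokenize_with_strtok(const std::string& input,
                                              const char* delims = "(),")
{
    // strtok writes '\0' bytes into its argument, so work on a mutable copy.
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(&buffer[0], delims); tok != 0;
         tok = std::strtok(0, delims))
    {
        tokens.push_back(tok);
    }
    return tokens;
}
```

    Calling it with "(1,3,4,1,2,5,2,6)" should give back the eight number strings, duplicates included.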

  3. #3
    Registered User
    Join Date
    Sep 2008
    Posts
    33
    Ok this works, but I have 2 questions.

    1. Why a 0 as the first argument for strtok in the while loop? I have used strtok in C, but it doesn't make sense to me with a 0.

    2. VS 2008 gives me the following warning on both lines with strtok applied:

    Warning 1 warning C4996: 'strtok': This function or variable may be unsafe. Consider using strtok_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.

  4. #4
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    C++ calls the null pointer 0, while C calls it NULL. (You can use NULL in C++ too, of course.)
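
    As for why a null pointer is passed at all: on the first call strtok is given the string to scan, and on every later call a null first argument (0 or NULL) tells it to carry on with that same string from where the previous token ended. A small sketch of the pattern, using a made-up helper name count_tokens:

```cpp
#include <cstddef>   // NULL
#include <cstring>   // std::strtok

// Hypothetical helper: counts the tokens in a writable C string.
// Note that strtok modifies the buffer it scans.
int count_tokens(char* text, const char* delims)
{
    int count = 0;
    // First call: pass the string itself to start tokenizing it.
    char* tok = std::strtok(text, delims);
    while (tok != NULL)
    {
        ++count;
        // Later calls: a null pointer means "continue with the same string".
        tok = std::strtok(NULL, delims);
    }
    return count;
}
```

    The C4996 warning is specific to Visual C++. strtok remembers its position in hidden internal state between calls, which is why the compiler points at strtok_s as an alternative.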

  5. #5
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    I was really, really bored and wrote this in like 2 minutes, so please don't jump on me too badly.
    Code:
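    #include <cstdlib>   // system
    #include <iostream>
    #include <string>
    #include <vector>

    // Splits str on delimit. The first character of str is skipped,
    // so the input needs a leading placeholder such as ':'.
    std::vector<std::string> GetTokens(const std::string& str, char delimit = ',')
    {
    	std::vector<std::string> tokens;
    	if ( str.empty() )
    		return tokens;
    	std::string::size_type pos = 0;
    	while ( pos != std::string::npos )
    	{
    		std::string::size_type next = str.find(delimit, pos + 1);
    		// When no delimiter is left, take everything up to the end of the string.
    		std::string::size_type count =
    			(next == std::string::npos) ? std::string::npos : next - pos - 1;
    		tokens.push_back(str.substr(pos + 1, count));
    		pos = next;
    	}
    	return tokens;
    }

    int main()
    {
    	std::vector<std::string> test = GetTokens(":123,2,3,4,5,6,7,8,9,10,11,12");
    	std::vector<std::string>::iterator iter = test.begin();
    	for (;iter != test.end(); ++iter )
    		std::cout << "Token: " << (*iter) << '\n';
    	system("pause");   // Windows-only: keeps the console window open
    	return 0;
    }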
    I realize this could be done much better.
    One thing to note is that it requires the string to begin with a starting character. I picked ':' just because it is common, but anything will work. The first character is ignored.

    Edit: changed to allow specifying delimit character
    Last edited by Raigne; 11-04-2008 at 10:36 PM.

  6. #6
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    This one does not need the starting character.
    Code:
    std::vector<std::string> GetTokens2(const std::string& str, char delimit = ',' )
    {
    	std::vector<std::string> tokens;
    	int pos = -1;
    	int pos2 = 0;
    	while ( true )
    	{
    		if ( pos2 == str.length() || str.at(pos2) == delimit  )
    		{
    			tokens.push_back(str.substr(pos+1,pos2-pos-1));
    			pos = pos2;
    		}
    		if ( pos2 == str.length() )
    			break;
    		++pos2;
    	}
    	return tokens;
    }
    Works the same way basically.
    Code:
    std::vector<std::string> test = GetTokens2("123,2,3,4,5321353543,6,7,8,932,10,1112,12");
    	std::vector<std::string>::iterator iter = test.begin();
    	for (;iter != test.end(); ++iter )
    		std::cout << "Token: " << (*iter) << '\n';

  7. #7
    Registered User
    Join Date
    Oct 2008
    Posts
    55
    It looks okay, but you could use simpler loop control (see below). Also, what happened to the parentheses in the input?

    Your earlier questions can be answered by actually looking up "strtok" and "strtok_s" to see how they work and what the difference is.

    Code:
    vector<string> GetTokens2( const string& str, char delimit = ',')
    {
        vector<string> tokens;
        string::size_type start = 0;   // index where the current token begins
        // Run pos up to and including str.length() so the final token is emitted too.
        for (string::size_type pos = 0; pos <= str.length(); ++pos)
        {
            if (pos == str.length() || str[pos] == delimit)
            {
                tokens.push_back( str.substr( start, pos - start));
                start = pos + 1;
            }
        }
        return tokens;
    }

  8. #8
    Registered User
    Join Date
    Sep 2008
    Posts
    33
    Is there a way for me to pass the numbers into a function as ints?

    For example, in the first solution posted the substring is a char*, but I need it to be an int.

    Cast? atoi? (Dunno if this works in C++ or with char*.)

  9. #9
    Registered User
    Join Date
    Nov 2005
    Posts
    673
    In C++ you can use a stringstream to convert the string to an int. Or make a different version of the function for whatever type you want to convert to.
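
    A minimal sketch of the stringstream route, written as a template since the program is already templated. The helper name from_string is made up for illustration:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: converts text to any type that supports operator>>.
// Returns false if the text could not be read as a T.
template <class T>
bool from_string(const std::string& text, T& out)
{
    std::istringstream in(text);
    return !(in >> out).fail();
}
```

    With an int target, from_string("42", n) would store 42 in n and return true, while a token such as "abc" would return false.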

    Nucleon: apparently lack of sleep and working in the same day == bad programming. Thank you, and sorry for the horrible examples.

    edit: I was assuming for simplicity's sake that the parentheses were parsed out, and only the contained data was passed to the tokenizer.
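
    One possible way to parse the parentheses out before tokenizing, as a rough sketch. The name strip_parens is hypothetical, and it assumes a single outer pair of parentheses as in the original input:

```cpp
#include <string>

// Hypothetical helper: returns the text between the first '(' and the last ')'.
// If either parenthesis is missing, the input is returned unchanged.
std::string strip_parens(const std::string& text)
{
    std::string::size_type open = text.find('(');
    std::string::size_type close = text.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return text;
    return text.substr(open + 1, close - open - 1);
}
```

    The result for "(1,3,4,1,2,5,2,6)" is "1,3,4,1,2,5,2,6", which can then be handed to GetTokens2.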
    Last edited by Raigne; 11-04-2008 at 11:40 PM.

  10. #10
    Registered User
    Join Date
    Oct 2008
    Posts
    55
    Whenever I see a while(true) (or the canonical for(;;)) I see if there's an easy way to get rid of it.
    And I thought you were the OP (forgot to check the username), so that's why I mentioned your "earlier questions".
    Last edited by nucleon; 11-05-2008 at 05:37 PM. Reason: get rid of spurious smiley

  11. #11
    Registered User
    Join Date
    Sep 2008
    Posts
    33
    I have a question on this code:

    Code:
    int main()
    {
        char *substr;
        char str[] = "(1,3,4,1,2,5,2,6)";
        substr = strtok( str, "(),");
        int x = atoi(substr);
        while (substr)
        {
                    cout << x;
    		substr = strtok(0, "(),");
    		x = atoi(substr);
    		
        }
    }
    When I run the program it prints out each of the numbers in str, but the program crashes and tells me the following:

    http://img383.imageshack.us/my.php?image=erroree7.jpg

  12. #12
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    atoi is very unhappy when you give it NULL instead of a string.
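
    The problem is on the last pass through the loop: once the tokens run out, strtok returns a null pointer, and the posted code hands it to atoi before the while condition gets a chance to test it. One way to reorder things so atoi only ever sees a real token, sketched with a made-up function parse_ints that collects the values instead of printing them:

```cpp
#include <cstdlib>   // std::atoi
#include <cstring>   // std::strtok
#include <vector>

// Hypothetical helper: converts each token of a writable C string to int.
std::vector<int> parse_ints(char* text, const char* delims = "(),")
{
    std::vector<int> values;
    char* substr = std::strtok(text, delims);
    while (substr)
    {
        // substr is known to be non-null here, so atoi is safe to call.
        values.push_back(std::atoi(substr));
        substr = std::strtok(0, delims);
    }
    return values;
}
```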

  13. #13
    Registered User
    Join Date
    Sep 2008
    Posts
    33
    Ok thanks, fixed it.

  14. #14
    Registered User
    Join Date
    Sep 2008
    Posts
    33
    Ok, this is on the same program, but not entirely related, just didn't want to make a new thread:

    This is my insert function:

    Code:
    template <class T>
    void Set<T>::insert(T val){
    	if(this->contains(val)) return;
    	else if(num==size){
    		size+=5;
    		T *arr2 = new T[size];
    		int i;
    		for(i=0; i<num; i++){
    			arr2[i]=arr[i];
    		}
    		arr = arr2;
    		arr[num] = val;
    		delete[] arr2;
    		num++;
    		this->sort();
    		return;
    	}
    	else{
    		arr[num] = val;
    		num++;
    		this->sort();
    		return;
    	}
    }
    And somewhere in the else if statement something isn't right.
    What it is supposed to do: if the array is out of space, it allocates more by making a new array 5 spaces bigger, copying the data from the original into the larger array, then setting the original equal to the new larger array and deleting the temp.

    I am sure this worked earlier, and would resize just fine, but I must have made some changes because now all the output I get is long negative numbers (memory addresses?)

    Couple details: int num is the number of items IN the array already, so if there were 4 then we would insert the fifth at num(4)
    int size is the total number of spaces available, used or not in the array.
    Last edited by chinesepirate; 11-05-2008 at 09:43 PM.

  15. #15
    Registered User
    Join Date
    Nov 2005
    Posts
    673
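    You need to change it around a bit. As written, arr is pointed at arr2 and then arr2 is deleted, so arr ends up pointing at freed memory (and the old array is never freed).
    It should be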
    Code:
    ...
    arr2[num] = val;
    delete [] arr;
    arr = arr2;
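
    Put together, the whole thing might look roughly like the sketch below. Only arr, num, size and insert() come from the post above; the constructor, destructor, contains(), count() and at() are guesses added so the example compiles on its own, and std::sort stands in for the poster's own sort().

```cpp
#include <algorithm>  // std::sort

// Minimal stand-in for the Set class in the thread. Only arr, num, size and
// insert() come from the post; everything else here is an assumption.
template <class T>
class Set
{
public:
    Set() : arr(0), num(0), size(0) {}
    ~Set() { delete[] arr; }

    bool contains(const T& val) const
    {
        for (int i = 0; i < num; i++)
            if (arr[i] == val) return true;
        return false;
    }

    void insert(T val)
    {
        if (contains(val)) return;
        if (num == size)
        {
            // Grow by 5 slots: copy into the new array first...
            T* arr2 = new T[size + 5];
            for (int i = 0; i < num; i++)
                arr2[i] = arr[i];
            // ...then free the OLD array and keep the new one.
            delete[] arr;
            arr = arr2;
            size += 5;
        }
        arr[num] = val;
        num++;
        std::sort(arr, arr + num);  // stands in for the post's this->sort()
    }

    int count() const { return num; }
    const T& at(int i) const { return arr[i]; }

private:
    Set(const Set&);             // copying disabled to keep the sketch short
    Set& operator=(const Set&);

    T* arr;    // dynamic array of elements
    int num;   // number of elements in use
    int size;  // allocated capacity
};
```

    Inserting 1,3,4,1,2,5,2,6 into a Set<int> built this way should leave six elements, with the sixth insert forcing one resize past the first five slots.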
