Thread: how to parse a string

  1. #1
    Registered User
    Join Date
    Jan 2008
    Posts
    569

    how to parse a string

    somefile : anotherfile

    and

    somefile:anotherfile

    I would like to have a function that works for both of these strings.

    I want to separate the string into

    somefile
    :
    anotherfile

    what should I use?

    If I use strtok(line, ":"), it wouldn't work, right?

  2. #2
    Registered User
    Join Date
    Nov 2007
    Posts
    39
    Why wouldn't that work? You can do something like this:
    Code:
    char *str = strtok(line, ":"); // <- first call: str points to "somefile"
    while (str != NULL) {
      /* use str here */
      str = strtok(NULL, ":"); // <- next call: str points to "anotherfile", then NULL
    }
    I haven't seen strtok used much but I'm pretty sure that's how you would use it.

  3. #3
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    but one gives me somefile with an extra space:

    somefile_

    and the other one just gives me

    somefile

    right?

  4. #4
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    Oh, I think you're right. Now another question: how do I check that there is a : after some string?

  5. #5
    Ex scientia vera
    Join Date
    Sep 2007
    Posts
    477
    scanf(" %[^: ] : %s", somevar, anothervar);

    would do that job for you just fine.

    If you're reading from files, use fscanf.
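
    A minimal sketch of the scanf-family approach, using sscanf on a line that is already in memory. A plain %s stops only at whitespace, so a format like "%s:%s" would swallow the colon on input such as somefile:anotherfile; a scanset that excludes ':' and blanks avoids that. The name split_pair and the 63-character field limit are assumptions made for illustration, not something from this thread.

```c
#include <stdio.h>

/* Hypothetical helper: splits "left : right" or "left:right" into two
 * fields. Both buffers must hold at least 64 bytes.
 * Returns 1 on success, 0 if the line does not match. */
int split_pair(const char *line, char *left, char *right)
{
    /* " %63[^: \t]" : skip leading blanks, then read up to 63 chars
     *                 that are not ':', space or tab
     * " :"          : skip optional blanks, then require a literal ':'
     * " %63s"       : skip optional blanks, then read one word */
    return sscanf(line, " %63[^: \t] : %63s", left, right) == 2;
}
```

    With two 64-byte buffers, both "somefile : anotherfile" and "somefile:anotherfile" should fill left with "somefile" and right with "anotherfile". A line such as "hehe :" with nothing after the colon is reported as a non-match.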

  6. #6
    and the Hat of Guessing tabstop
    Join Date
    Nov 2007
    Posts
    14,336
    Have you looked into strchr?
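
    For the "is there a : after some string" check, a rough sketch with strchr could look like the following. The helper name has_name_before_colon is made up for this example, and treating only spaces and tabs as blanks is an assumption.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if line contains a ':' with at least
 * one non-blank character before it, 0 otherwise. */
int has_name_before_colon(const char *line)
{
    const char *colon = strchr(line, ':');  /* first ':' or NULL */
    if (colon == NULL)
        return 0;

    /* Look for anything other than blanks between the start and the ':' */
    for (const char *p = line; p < colon; p++) {
        if (*p != ' ' && *p != '\t')
            return 1;
    }
    return 0;
}
```

    strchr returns a pointer to the first matching character, so colon - line is also the length of the text before the colon if you need to copy it out.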

  7. #7
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    with this code:

    Code:
    char *str = strtok(line, ":"); // <- first call: str points to "somefile"
    while (str != NULL) {
      /* use str here */
      str = strtok(NULL, ":"); // <- next call: str points to "anotherfile", then NULL
    }
    why do we pass line to strtok the first time, but NULL on the later calls?

  8. #8
    and the Hat of Guessing tabstop
    Join Date
    Nov 2007
    Posts
    14,336
    Easy answer: because that's the way strtok works. Slightly less easy answer: The idea behind the NULL pointer is that strtok saves where the last search ended, so that it can resume where it left off. If you pass in a string, it's going to assume that you want to start over.
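
    To make the "saves where the last search ended" part concrete, here is a simplified sketch of how a strtok-like function could keep its position between calls. my_strtok is a made-up name, and real library implementations may differ in detail.

```c
#include <string.h>

/* Simplified sketch of a strtok-like function (my_strtok is hypothetical). */
char *my_strtok(char *str, const char *delims)
{
    static char *saved = NULL;   /* where the previous call stopped */

    if (str == NULL)
        str = saved;             /* NULL means: resume from the saved spot */
    if (str == NULL)
        return NULL;             /* nothing left to scan */

    str += strspn(str, delims);  /* skip leading delimiters */
    if (*str == '\0') {
        saved = NULL;
        return NULL;             /* only delimiters remained */
    }

    char *end = str + strcspn(str, delims);  /* end of this token */
    if (*end != '\0') {
        *end = '\0';             /* terminate the token in place */
        saved = end + 1;         /* next call starts after it */
    } else {
        saved = NULL;            /* reached the end of the string */
    }
    return str;
}
```

    Passing a non-NULL string overwrites the saved position, which is why handing the same buffer to every call keeps returning the first token. The static variable is also why this style of function is not safe to use on two strings at once.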

  9. #9
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    Code:
    char* pch = strtok(line, ":");
    	  if (pch == NULL){
    	    ..............
    	  }
    	
          	  int h;
              for (h = 0; pch != NULL; h++){
    	     if (h == 0){
    	       if (strtok(pch, " ") != NULL)
    		 ..........
    	       else
    	        ...........
    		
    		
    	     }
    	     else{
    	      ............
    	     }
               printf("PCH BEFORE IS %s\n", pch);
    	    pch = strtok(line, " ");
    	    printf("PCH IS %s\n", pch);
              }
    	   
           }
    If that's so, what's wrong with this code when it tries to process this file?

    Code:
    haha : hehe
    	........
    
    hehe :
    	............

  10. #10
    Lurking whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    First of all, omitting lines for the sake of brevity is not very helpful. It raises the question: what are you actually doing? Instead, consider providing a reasonably sized, complete example of the problem you are attempting to solve. For example, these are all good questions. I hope this advice helps you the next time you need to ask something.

    strtok is an okay solution for simple patterns like this I suppose, though there might be issues, especially if you have blank fields (lines with just ":"). If that is the case, you might want to look toward sscanf or strcspn for a parser.
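
    As a rough illustration of the strcspn route, the sketch below walks a line field by field and reports empty fields as "" instead of skipping them the way strtok does. The helper name next_field and its cursor-style interface are assumptions for this example; it splits on ':' only and does not trim surrounding spaces.

```c
#include <string.h>

/* Hypothetical helper: copies the field that starts at *cursor into out
 * (at most outsize - 1 chars), then advances *cursor past the ':'.
 * Empty fields are reported as "" rather than skipped.
 * Returns 1 if a field was produced, 0 when the input is exhausted. */
int next_field(const char **cursor, char *out, size_t outsize)
{
    const char *s = *cursor;
    if (s == NULL || outsize == 0)
        return 0;

    size_t len = strcspn(s, ":");          /* chars before next ':' or end */
    size_t n = len < outsize - 1 ? len : outsize - 1;
    memcpy(out, s, n);
    out[n] = '\0';

    /* Step over the ':'; after the last field, mark the input as done */
    *cursor = (s[len] == ':') ? s + len + 1 : NULL;
    return 1;
}
```

    Starting with a cursor at "a::b", successive calls should yield "a", "" and "b" before returning 0, whereas strtok on the same input would only ever give "a" and "b".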

    Though, it appears you are passing strtok the token itself, instead of NULL (to keep searching from where the last call stopped) or a different delimiter. As in real life, if you keep looking in the same place for the same thing and it ain't there, you'll never find it.
    Code:
    char * pch = strtok( haystack, needle );  
    if ( pch != NULL )
      pch = strtok( pch, needle ); 
      /** woops, probably not going to find the same token again **/

  11. #11
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    well what I am trying to do here is this:

    say I have the following string

    cat : dog mouse mice hot

    I want to parse it into the tokens

    cat, dog, mouse, mice and hot

  12. #12
    Lurking whiteflags
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Fair enough. You were pretty close with your attempt. Seems like with some slight changes you'd be on your way. Notice that I added a space in the delimiter string.

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    
    int main ( void )
    {
        char parse[] = "cat : dog mouse mice hot";
        char * tok = NULL;
    
        tok = strtok( parse, ": " );
        if( tok != NULL ) {
            puts( tok );
            while ( ( tok = strtok( NULL, ": " ) ) != NULL ) {
                puts( tok );
            }
        }
        
        return 0;
    }
    Output:
    cat
    dog
    mouse
    mice
    hot

  13. #13
    Registered User
    Join Date
    Jan 2008
    Posts
    569
    Will this code also work for:

    cat:dog mouse mice hot

    because I want to get cat without the space

    cat only, not cat_

  14. #14
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote: Originally posted by -EquinoX-
    with this code:

    Code:
    char *str = strtok(line, ":"); // <- first call: str points to "somefile"
    while (str != NULL) {
      /* use str here */
      str = strtok(NULL, ":"); // <- next call: str points to "anotherfile", then NULL
    }
    why do we pass line to strtok the first time, but NULL on the later calls?
    This post contains an example of how strtok() would be implemented, and describes what NULL does.

    And yes, if you pass in multiple characters to strtok(), the first character that matches will be the one that "breaks" the string, so if you pass in ":.-+ ", it will break on the first of any of ':', '.', '-', etc.

    --
    Mats

  15. #15
    and the Hat of Guessing tabstop
    Join Date
    Nov 2007
    Posts
    14,336
    Easy answer: take your program, remove the space and see what happens.
    Other easy answer: yes, since strtok just looks for one of the characters in the delimiter string, not the whole string.
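
    A small sketch to check this for yourself: since the delimiter argument is a set of characters, ": " breaks on either a colon or a space, so the first token comes out as "cat" with or without spaces around the colon. The helper first_token is a hypothetical name; it copies the line first because strtok modifies the buffer it scans.

```c
#include <string.h>

/* Hypothetical helper: copies line into buf, then returns the first
 * token using ':' and ' ' as delimiters (NULL if there is none). */
char *first_token(const char *line, char *buf, size_t bufsize)
{
    if (bufsize == 0)
        return NULL;
    strncpy(buf, line, bufsize - 1);
    buf[bufsize - 1] = '\0';       /* strncpy may not terminate */
    return strtok(buf, ": ");      /* breaks at the first ':' OR ' ' */
}
```

    After this call, strtok(NULL, ": ") continues in the same buffer and should return "dog" for both "cat : dog mouse mice hot" and "cat:dog mouse mice hot".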
