Thread: Simple Line Parsing

  1. #1
    Registered User
    Join Date
    Dec 2005
    Posts
    141

    Simple Line Parsing

    Hi,

    I've been learning C for a month and have a simple question. I want to parse a space-delimited line, read each word, and execute some functions based on those words. For example:

    Say I have:

    I am 27 years old\n

    I need to get I, am, 27, years and old into separate variables. There is also a little twist: if a word is a number, I need it as a number and not as a string.

    Could anyone please write a few lines of code to help me do this?

    Appreciate your guidance.

    Thanks,
    Angkar

  2. #2
    Registered Luser cwr's Avatar
    Join Date
    Jul 2005
    Location
    Sydney, Australia
    Posts
    869
    The best way to learn C is by writing your own code. Some hints:

    You can use strchr to find a character in a string (space for example).

    You can use strtol to convert part of a string to a number, and to determine whether it successfully converted.
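
    As a rough illustration of the strtol hint (a sketch only; word_to_long is a made-up helper name, and it assumes the word has already been split out and null-terminated), the end pointer tells you whether the whole word was a number:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out when word is
   entirely a base-10 integer; returns 0 when it is empty, contains other
   characters, or does not fit in a long. */
int word_to_long(const char *word, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(word, &end, 10);

    /* end == word : no digits were converted at all
       *end != 0   : something other than a number follows the digits
       ERANGE      : the value overflowed a long */
    if (end == word || *end != '\0' || errno == ERANGE)
        return 0;

    *out = value;
    return 1;
}
```

    With the example line, word_to_long("27", &n) succeeds and sets n to 27, while word_to_long("years", &n) returns 0, so the word can be kept as a string.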

  3. #3
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    7. It is easier to write an incorrect program than understand a correct one.
    40. There are two ways to write error-free programs; only the third one works.*

  4. #4
    Registered User
    Join Date
    Dec 2005
    Posts
    141

    Thanks

    Thanks Dave. And could you please explain what the following line is doing from the "=" till the end?

    printf("token[%2d] = \"%*.*s\"\n", i, (int)len, (int)len, token);

    Thanks,
    Angkar

  5. #5
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Code:
    printf("token[%2d] = \"%*.*s\"\n", i, (int)len, (int)len, token);
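    \" is a double-quote (escaped within a quoted string). %s is a string to print. The first * is the minimum field width. The .* is the precision (the maximum number of characters to print for a string). Each * takes its value from the next int argument, here (int)len both times, so exactly len characters of token are printed.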
    Last edited by Dave_Sinkula; 12-26-2005 at 02:10 PM. Reason: Bah, colors.
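
    To see width and precision separately, here is a small sketch (format_slice is a hypothetical name, and the square brackets are only there to make the padding visible). It writes into a caller-supplied buffer with snprintf instead of printing:

```c
#include <stdio.h>

/* Hypothetical helper: formats s into out as "[...]" using a run-time
   minimum field width and precision, both passed through '*'. */
int format_slice(char *out, size_t outsize, int width, int precision,
                 const char *s)
{
    /* width     -> first '*' : pad on the left up to this many columns
       precision -> '.*'      : print at most this many characters of s */
    return snprintf(out, outsize, "[%*.*s]", width, precision, s);
}
```

    Calling format_slice(buf, sizeof buf, 2, 2, "am 27") gives "[am]", and format_slice(buf, sizeof buf, 5, 2, "years") gives "[   ye]": two characters printed, padded to five columns.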

  6. #6
    Registered User
    Join Date
    Dec 2005
    Posts
    141
    Also what's the significance of *token in the for loop condition?
    Lots of new things to learn.

    Thanks,
    angKar

  7. #7
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    It is checking for the terminating null character: *token is zero once token reaches the end of the string, so the loop stops there.
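
    A minimal sketch of that kind of loop (count_tokens is a made-up name, and this is one possible way to write it, not necessarily the code posted earlier). Note that it only steps over a delimiter when one is actually there; a plain token += len + 1 would walk past the terminator on a line that does not end in a space or newline:

```c
#include <string.h>

/* Hypothetical helper: counts the non-empty fields in line that are
   separated by spaces or newlines. */
size_t count_tokens(const char *line)
{
    const char *token = line;
    size_t count = 0;

    while (*token) {                      /* stop at the '\0' terminator */
        size_t len = strcspn(token, " \n"); /* length up to next delimiter */

        if (len > 0)
            count++;                      /* ignore empty fields */

        token += len;                     /* now at a delimiter or at '\0' */
        if (*token)
            token++;                      /* skip the delimiter only */
    }
    return count;
}
```

    For the line "I am 27 years old\n" this returns 5.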

  8. #8
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Why not just use strtok() to separate the tokens?
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  9. #9
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Perhaps one of these reasons.

    http://www.die.net/doc/linux/man/man3/strtok.3.html
    Never use these functions. If you do, note that:
    These functions modify their first argument.

    These functions cannot be used on constant strings.

    The identity of the delimiting character is lost.

    The strtok() function uses a static buffer while parsing, so it's not thread safe.
    Or maybe just to avoid getting too attached to a function you may want to avoid later on.

    [edit]Or perhaps to be able to detect empty fields.
    Last edited by Dave_Sinkula; 12-27-2005 at 02:20 PM.
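
    Two of those points, shown as a small sketch (both function names are made up for illustration, and a comma is used as the delimiter so the empty field is easy to see): strtok writes into the string it scans, and it silently merges adjacent delimiters, so an empty field disappears.

```c
#include <string.h>

/* Counts the tokens strtok() finds. line is modified in place: each
   delimiter that ends a token is overwritten with '\0'. */
size_t count_with_strtok(char *line, const char *delims)
{
    size_t count = 0;
    char *tok;

    for (tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;
    return count;
}

/* Counts fields with strcspn(), keeping empty ones and leaving line
   untouched. An empty line counts as one empty field. */
size_t count_fields(const char *line, const char *delims)
{
    size_t count = 0;

    for (;;) {
        size_t len = strcspn(line, delims);

        count++;
        if (line[len] == '\0')
            break;              /* last field reached */
        line += len + 1;        /* step over exactly one delimiter */
    }
    return count;
}
```

    On a writable copy of "1,,3", count_fields reports 3 fields while count_with_strtok reports 2, and afterwards the copy reads as just "1" because the first comma has been replaced by a null character.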

  10. #10
    Registered User
    Join Date
    Dec 2005
    Posts
    141
    But suppose I don't want to print the separated tokens and just want to store them in an array. How do I do that? (I mean, what function is an alternative to the %s manipulation you are doing in the printf?)

  11. #11
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Well, just store the variable somewhere instead of printing it.

    Like with sprintf().

  12. #12
    Registered User
    Join Date
    Dec 2005
    Posts
    141
    My printf is showing null! Maybe I'm not being clear.

    Say I have this input:

    1 2 3
    1 2 4
    2 3 4

    separated by spaces and newlines.

    Now I use Dave's code as:
    Code:
    /* s is a single input line, e.g. "1 2 3" */
    while (fgets(s, 1000, f) != NULL)
    {
        char *token = s;
        char *buffer;
        int i, n;

        printf("line %d:\n", ++k);
        for (i = 0; *token; i++)
        {
            size_t len = strcspn(token, " \n");
            n = sprintf(buffer, "token[%2d] = %*.*s\n", i, (int)len, (int)len, token);
            printf("[%s] is a %d chars string\n", buffer, n);
            token += len + 1;
        }
    }
    I'm getting buffer as null. Is my syntax correct? Also, what I want is that after the first iteration of the while loop, buffer holds buffer[0]=1, buffer[1]=2 and buffer[2]=3.

    Thanks,
    Angshu

  13. #13
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    buffer doesn't have any space allocated for it. It's just a pointer, which doesn't point anywhere. Actually, I should say, "It's just a pointer, which could be pointing anywhere." Since in fact you never initialize it, it is in fact pointing some place ... wherever.


    Quzah.
    Hope is the first step on the road to disappointment.

  14. #14
    Registered User
    Join Date
    Dec 2005
    Posts
    141
    Thanks. But how do I solve that?

  15. #15
    Registered Luser cwr's Avatar
    Join Date
    Jul 2005
    Location
    Sydney, Australia
    Posts
    869
    By allocating space for buffer to point to, using malloc, or by changing buffer to a char array instead of a pointer.
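
    A sketch of the char array option (split_line, MAX_TOKENS and MAX_LEN are names and limits invented for this example, not anything from the earlier posts). Each token gets its own fixed-size slot, so there is real storage behind every string:

```c
#include <stdio.h>
#include <string.h>

#define MAX_TOKENS 16   /* assumed upper bound on tokens per line */
#define MAX_LEN    32   /* assumed upper bound on token size, with '\0' */

/* Hypothetical helper: copies each space- or newline-delimited token of
   line into tokens[] and returns how many were stored. */
size_t split_line(const char *line, char tokens[][MAX_LEN], size_t max_tokens)
{
    size_t n = 0;

    while (*line && n < max_tokens) {
        size_t len = strcspn(line, " \n");

        if (len > 0) {
            /* "%.*s" copies at most len characters; snprintf also caps the
               copy at MAX_LEN - 1 and always null-terminates. */
            snprintf(tokens[n], MAX_LEN, "%.*s", (int)len, line);
            n++;
        }
        line += len;
        if (*line)
            line++;             /* skip one delimiter, never the '\0' */
    }
    return n;
}
```

    Declaring char tokens[MAX_TOKENS][MAX_LEN]; and calling split_line("1 2 3\n", tokens, MAX_TOKENS) returns 3 with tokens[0], tokens[1] and tokens[2] holding "1", "2" and "3". Each one can then be passed to strtol if it is needed as a number.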
