Thread: A little more parsing help..

  1. #1
    Registered User
    Join Date
    Oct 2004
    Posts
    32

    A little more parsing help..

    Hello all,
    Having a little trouble with this one. I don't really know C, so I'm not too sure what's going on. I'm trying to edit an existing program to do what is needed.


    Have a file with:

    hello
    num123456
    num1232
    num09329823
    goodbye

    need to output to another file

    123456
    1232
    09329823


    Pretty easy, I'd imagine, but not for me. Here's what I'm working with:


    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    void main(int arcg, char *argv[])
    {
     char file1[20] = " ";
     char file2[20] = " ";
     char instring[85];
     char volser[30] = " ";
     char *posch;
     FILE *infile, *outfile;
     int pos = 0;
    
        if (arcg != 3) {printf("\cut [file to scan] [file to write to]");
       return;
       }
    
     strcpy(file1, argv[1]);
     strcpy(file2, argv[2]);
    
     if ((infile = fopen(file1,"r")) == NULL) { printf("The file that you want\n");
     										printf("scanned is not found.  Try again.");
                                  return;}
      if ((outfile  = fopen(file2,"w")) == NULL){ fclose(infile); return;}
    
    
      		while (fscanf(infile,"%s", &instring) !=EOF)
          {
    
                if ( (posch = strstr( instring, "goodbye")) == NULL)
                     {
    
                    if ((posch = strstr( instring, "num")) != NULL)
                     {
                      strncpy(volser, &instring[15], (strlen(instring) - 0));
                      fprintf(outfile,"%s\n", volser);
    						strcpy(volser, "                             ");
    						strcpy(instring, "                                                                                        ");
                      }
    
                      }
          }
    
    
    
        fclose(outfile);
        fclose(infile);
    }

    As you can tell, I'm having a lot of trouble. Like I said, this was an existing piece of code that did something similar. I don't know how to ignore the "hello" line, and I don't understand how this would know to skip the "num" prefix before the numbers.


    I'd have to imagine there must be a better way of doing this.

    Any help would be appreciated. (I wouldn't mind an explanation of what's going on above either.)


    Thanks

  2. #2
    Gawking at stupidity
    Join Date
    Jul 2004
    Location
    Oregon, USA
    Posts
    3,218
    Rather than trying to figure out what to skip, why not figure out what to print? This seems easier:
    Code:
    itsme@itsme:~/C$ cat onlynums.c
    #include <stdio.h>
    #include <ctype.h>
    
    int main(void)
    {
      int c;
      int nl = 0;
    
      while((c = getchar()) != EOF)
      {
        if(isdigit(c))
        {
          putchar(c);
          nl = 1;
        }
        else if(c == '\n')
        {
          if(nl)
          {
            putchar('\n');
            nl = 0;
          }
        }
      }
    
      return 0;
    }
    Code:
    itsme@itsme:~/C$ cat inputfile
    hello
    num123456
    num1232
    num09329823
    goodbye
    Code:
    itsme@itsme:~/C$ ./onlynums < inputfile > outputfile
    itsme@itsme:~/C$ cat outputfile
    123456
    1232
    09329823
    itsme@itsme:~/C$
    Then it can handle whatever you throw at it:
    Code:
    itsme@itsme:~/C$ ./onlynums > outputfile
    h3110 th3r3 j00 l337 h4xx0r
    itsme@itsme:~/C$ cat outputfile
    3110330033740
    itsme@itsme:~/C$
    EDIT: Making the program handle file I/O is an exercise left to you
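    A minimal sketch of that exercise, assuming the two streams have already been opened with fopen() the way the original program does. The helper name copy_digits is made up for illustration; it is the same loop as above with getc()/putc() in place of getchar()/putchar(), plus a closing newline for a last line that lacks one.

```c
#include <ctype.h>
#include <stdio.h>

/* Copy only the digits of each input line to `out`, writing one output
 * line per input line that contained at least one digit.
 * `copy_digits` is a hypothetical helper name, not from the thread. */
void copy_digits(FILE *in, FILE *out)
{
    int c;
    int pending = 0;   /* nonzero once the current line has produced a digit */

    while ((c = getc(in)) != EOF) {
        if (isdigit(c)) {
            putc(c, out);
            pending = 1;
        } else if (c == '\n' && pending) {
            putc('\n', out);
            pending = 0;
        }
    }
    if (pending)       /* last line had digits but no trailing newline */
        putc('\n', out);
}
```

    Calling copy_digits(infile, outfile) from main() after the two fopen() checks would take the place of the shell redirection.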
    Last edited by itsme86; 09-19-2005 at 12:11 PM.
    If you understand what you're doing, you're not learning anything.

  3. #3
    Registered User
    Join Date
    Oct 2004
    Posts
    32
    itsme, I very much appreciate the help. I guess I should've been a little more descriptive, though: the "num" prefix is standard, but what follows isn't always numeric.

    For example:

    hello
    num12ab456
    num1232
    num093r9823
    goodbye

    is valid and should output:

    12ab456
    1232
    093r9823


    I assumed it would be read that way, my apologies. I figure if someone could help me with that portion I could figure out the rest (because in reality this is what the input files look like):

    hello
    num12b3
    non123902
    bcr129302
    num48203
    num12309d3
    hyr093023
    goodbye

    and I need:

    12b3
    48203
    12309d3



    Any other advice? Thanks again.

  4. #4
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Code:
    if ( (posch = strstr( instring, "goodbye")) == NULL)
    You don't need to save the return value of strstr() if you don't use it. Just put
    Code:
    if(strstr(instring, "goodbye") == NULL)
    Actually, why are you using "goodbye"? You can just check for EOF . . . unless you have multiple entries.
    Any other advice?
    Post your code again.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  5. #5
    Gawking at stupidity
    Join Date
    Jul 2004
    Location
    Oregon, USA
    Posts
    3,218
    That's actually even easier:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main(void)
    {
      char buf[BUFSIZ];
    
      while(fgets(buf, sizeof(buf), stdin))
        if(!strncmp(buf, "num", 3))
          fputs(buf + 3, stdout);
    
      return 0;
    }
    Code:
    itsme@itsme:~/C$ cat inputfile
    hello
    num12b3
    non123902
    bcr129302
    num48203
    num12309d3
    hyr093023
    goodbye
    itsme@itsme:~/C$ ./afternum < inputfile > outputfile
    itsme@itsme:~/C$ cat outputfile
    12b3
    48203
    12309d3
    itsme@itsme:~/C$

  6. #6
    Registered User
    Join Date
    Oct 2004
    Posts
    32
    nevermind!

    i think this is it:

    Code:
        while(fgets(buf, sizeof(buf), infile))
         if(!strncmp(buf, "num", 3))
           fprintf(outfile, "%s", buf + 3);
    
        return;
    }
    correct?
    Last edited by chops11; 09-19-2005 at 04:27 PM.

  7. #7
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    main is int main().

    'unsigned int to constant char' . . . where? Highlight the line.

    Code:
    fscanf(infile, sizeof(buf), buf)
    Uhh . . . I know where now. Perhaps you mean fgets, because fscanf takes a const char * (the format string) as its second parameter, and you pass it sizeof(buf) . . . a size_t, which is an unsigned integer.
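    To make the argument order concrete, here is a small sketch. read_line and read_word are hypothetical wrapper names, and the 85-byte buffer is borrowed from instring in the first post. fgets() takes the buffer, then its size, then the stream; fscanf() takes the stream, then a format string, then the destinations.

```c
#include <stdio.h>

/* Read one line, newline included, into buf (at most size - 1 characters).
 * Returns 1 on success, 0 on end of file or error. */
int read_line(FILE *fp, char *buf, size_t size)
{
    return fgets(buf, (int)size, fp) != NULL;
}

/* Read one whitespace-delimited word into an 85-byte buffer.
 * The width 84 leaves room for the terminating '\0'.
 * Returns 1 on success, 0 otherwise. */
int read_word(FILE *fp, char word[85])
{
    return fscanf(fp, "%84s", word) == 1;
}
```

    Note that fscanf() with %s stops at whitespace, so it reads words rather than lines; for line-oriented input like this file, fgets() is the closer fit.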

  8. #8
    Registered User
    Join Date
    Oct 2004
    Posts
    32
    Quote Originally Posted by dwks
    int main().

    'unsigned int to constant char' . . . where? Highlight the line.

    Code:
    fscanf(infile, sizeof(buf), buf)
    Uhh . . . I know where now. Perhaps you mean fgets, because fscanf takes a const char * (the format string) as its second parameter, and you pass it sizeof(buf) . . . a size_t, which is an unsigned integer.

    Thanks for your help dwks, I believe I figured it out above, does that look correct?

  9. #9
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Hey! no fair . . . .

    [edit] You beat me again. Yes, that's better. Does it work? [/edit]

  10. #10
    Registered User
    Join Date
    Oct 2004
    Posts
    32
    Ran into another little snag with this one. I realized that two different streams may come through. For example, I may have the original:

    hello
    num12b3
    non123902
    bcr129302
    num48203
    num12309d3
    hyr093023
    goodbye


    or just (without the hello and goodbye)
    num12b3
    non123902
    bcr129302
    num48203
    num12309d3
    hyr093023



    for the file with hello it should parse:
    12b3
    48203
    12309d3


    for the one without it, it should parse:
    num12b3
    non123902
    bcr129302
    num48203
    num12309d3
    hyr093023


    So basically, "hello" is the kicker. If the program sees "hello" on the first line, it should only pull out what follows "num". If it doesn't see "hello", it should just pass the entire file through to the new file, whether or not there are any "num" lines.




    My thought would be something like this, but I could use some help:

    Code:
     
       while(fgets(buf, sizeof(buf), infile))
       //if (first line?) == hello 
        { 
        if(!strncmp(buf, "num", 3))
           fprintf(outfile, "%s", buf + 3);
    }
    
       else
          fprintf(outfile, "%s", buf);
        return;
    }
    something like that?


    thanks again guys and gals.

  11. #11
    Registered User
    Join Date
    Oct 2004
    Posts
    32
    dwks don't you want to help me?

  12. #12
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Sorry, I had signed off.

    Just read the first line in. If it isn't hello, return from your function or whatever.

    Something like:
    Code:
    if(!fgets(s, sizeof(s), filepointer)) return;
    if(strncmp(s, "hello", 5) != 0) return;  /* not "hello": bail out; be aware that "hellothere" passes this test too */
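    If matching "hellothere" is a concern, one option is to compare the whole line while ignoring the line ending that fgets() leaves in the buffer. This is only a sketch, not something from the posts above, and line_is is a hypothetical helper name.

```c
#include <string.h>

/* Return 1 if `line` is exactly `word`, ignoring a trailing "\n" or "\r\n".
 * `line_is` is a hypothetical helper name. */
int line_is(const char *line, const char *word)
{
    size_t len = strcspn(line, "\r\n");   /* length up to the line ending */

    return len == strlen(word) && strncmp(line, word, len) == 0;
}
```

    With it, the header test becomes line_is(s, "hello"), which accepts "hello" with or without a newline and rejects "hellothere".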

  13. #13
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Beginning from this, I might have made such changes as this:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    void foo(const char *filename)
    {
       FILE *file = fopen(filename, "r");
       if ( file != NULL )
       {
          char line[80], text[12];
          int ready = 0;
          while ( fgets(line, sizeof line, file) != NULL )
          {
             if ( strncmp(line, "goodbye", 7) == 0 )
             {
                ready = 0;
             }
             if ( ready && sscanf(line, "num%11s%*[^\n]%*c", text) == 1 )
             {
                puts(text);
             }
             else if ( strncmp(line, "hello", 5) == 0 )
             {
                ready = 1;
             }
          }
          fclose(file);  /* close the file once every line has been read */
       }
       else
       {
          perror(filename);
       }
    }
    
    int main(void)
    {
       foo("file.txt");
       return 0;
    }
    
    /* file.txt
    hello
    num12b3
    non123902
    bcr129302
    num48203
    num12309d3
    hyr093023
    goodbye
    */
    
    /* my output
    12b3
    48203
    12309d3
    */
    But seeing more of what you are doing (and with some of the other good suggestions in this thread), I think I'd go with changing text to a pointer and using the following instead.
    Code:
             if ( ready && (text = strstr(line, "num"))  )
             {
                fputs(text + 3, stdout);
             }
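    One thing to keep in mind with strstr(): it finds "num" anywhere in the line, while strncmp(line, "num", 3) only accepts it at the start. The sample data never has "num" in the middle of a line, so both give the same output there. The sketch below, with hypothetical function names, shows an input where they differ.

```c
#include <stddef.h>
#include <string.h>

/* Anchored test: the text after "num" only if the line starts with it. */
const char *num_at_start(const char *line)
{
    return strncmp(line, "num", 3) == 0 ? line + 3 : NULL;
}

/* Unanchored test: the text after the first "num" found anywhere. */
const char *num_anywhere(const char *line)
{
    const char *p = strstr(line, "num");

    return p != NULL ? p + 3 : NULL;
}
```

    For a line such as "bcrnum42", num_at_start() returns NULL and num_anywhere() returns a pointer to "42", so the choice depends on whether such lines should count.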
    7. It is easier to write an incorrect program than understand a correct one.
    40. There are two ways to write error-free programs; only the third one works.*

  14. #14
    Registered User
    Join Date
    Oct 2004
    Posts
    32
    So what am I doing wrong here?

    Code:
      if(!fgets(buf, sizeof(buf), infile));  //first line of file right?
         if(!strncmp(buf, "hello", 5))    //if it's hello, move on
               {
           while(fgets(buf, sizeof(buf), infile))  //for the rest of the file
              if(!strncmp(buf, "num", 3))  //when it see's "num"
                   fprintf(outfile, "%s", buf + 3);  //write to a file what's after "num"
                }
      else //if it's not hello (this part seems to be working right
      {
         fprintf(outfile, "%s", buf);  //print the first line
       while(fgets(buf, sizeof(buf), infile))  //print every other line
         fprintf(outfile, "%s", buf);
    
        return;
        }
    }
    The else branch seems to be fine, but not the first if. Am I missing something?

    Thanks.

  15. #15
    Registered User
    Join Date
    Oct 2004
    Posts
    32
    I think I may've just gotten it going:

    Code:
     
    
     while(!fgets(buf, sizeof(buf), infile));  //first line of file right?
         if(!strncmp(buf, "hello", 5))    //if it's hello, move on
               {
           while(fgets(buf, sizeof(buf), infile))  //for the rest of the file
              if(!strncmp(buf, "num", 3))  //when it see's "num"
                   fprintf(outfile, "%s", buf + 3);  //write to a file what's after "num"
                }
      else //if it's not hello (this part seems to be working right
      {
         fprintf(outfile, "%s", buf);  //print the first line
       while(fgets(buf, sizeof(buf), infile))  //print every other line
         fprintf(outfile, "%s", buf);
    
        return;
        }
    Does this make sense to those of you who know more than me?
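    For comparison, here is a sketch of the whole decision in one function, assuming infile and outfile are the already-opened streams from the earlier posts (filter_stream is a hypothetical name). Note that a semicolon right after if(...) or while(...) ends the statement there, so in the code above the fgets() line only reads the first line; with while(!fgets(...)); an empty input file would loop forever, because fgets() keeps returning NULL.

```c
#include <stdio.h>
#include <string.h>

/* If the first line starts with "hello", write only the text that follows
 * a leading "num" on later lines; otherwise copy the input unchanged.
 * `filter_stream` is a hypothetical name. */
void filter_stream(FILE *in, FILE *out)
{
    char buf[BUFSIZ];

    if (fgets(buf, sizeof buf, in) == NULL)
        return;                          /* empty input: nothing to write */

    if (strncmp(buf, "hello", 5) == 0) {
        /* Header present: keep only what follows a leading "num". */
        while (fgets(buf, sizeof buf, in) != NULL)
            if (strncmp(buf, "num", 3) == 0)
                fputs(buf + 3, out);
    } else {
        /* No header: pass the first line and everything after it through. */
        do {
            fputs(buf, out);
        } while (fgets(buf, sizeof buf, in) != NULL);
    }
}
```

    The "goodbye" line needs no special case in this version, since it does not start with "num" and is dropped along with the other non-matching lines.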
