Thread: Line matching

  1. #1
    Registered User
    Join Date
    Jun 2009
    Posts
    486

    Line matching

    Hey,

    I have a large text file that contains data that changes over time. Each 'timestep' starts a new section of the file, beginning with a line that contains a single integer value and nothing else. These lines are irregularly spaced, since each timestep contains a different amount of information as the system evolves. I need a function that will count the number of timesteps (i.e., count the number of lines that contain a single integer value and nothing else). How can I do this?


    I thought about using fgets to read in each line, but I have never worked with strings like that, so I wouldn't know how to test the result properly. Is there a library function that can tell you if everything in a given string is a number with no decimal place?

    Thanks

  2. #2
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    Look into the scanf (sscanf, fscanf, etc.) family of functions - they read formatted input. So what you would do is ask it to scan a decimal integer. If it's a match, it'll return 1 and place the number in the variable you specified. If not, it returns 0 (or EOF if the input runs out before anything could be matched), and you'll know it's not that line.
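
    A minimal sketch of that return-value check, assuming the line is already in a string. The helper name starts_with_int is made up for illustration and is not from this thread.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if text begins with a decimal integer
   (leading whitespace is skipped by %d), 0 otherwise. */
int starts_with_int(const char *text)
{
    int value;

    /* sscanf returns the number of items it converted. It returns EOF
       when the input ends before the first conversion is attempted,
       e.g. for an empty string. */
    return sscanf(text, "%d", &value) == 1;
}
```

    Note that this only says the text starts with an integer: "42 apples" also passes, because sscanf stops as soon as %d has been satisfied.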

  3. #3
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    An important element of this is: how much of a role can you play in the structure of the file? It sounds as if the answer is "none" because it is output by some other program not written by you, but I just want to check.

    Quote Originally Posted by KBriggs View Post
    I thought about using fgets to read in each line, but I have never worked with strings like that, so I wouldn't know how to test the result properly. Is there a library function that can tell you if everything in a given string is a number with no decimal place?
    Very likely, using fgets or fread is the way to go. I would not bother looking for some special library to do this -- learning to use one won't be any easier than just learning to parse a string, and if you have any desire to use the C language for programming anything, you should really get comfortable with strings.

    One thing I have done for tasks like this is to write a custom "get line" type function using fread:
    Code:
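    #include <stdio.h>

    /* Reads one line from fp into line (at most size-1 bytes; size must be >= 1). */
    int fileline_base (FILE *fp, char *line, size_t size) {
          size_t c = 0;
          char byte;
          /* one byte at a time, until the buffer is full or the file ends */
          while (c + 1 < size && fread(&byte, 1, 1, fp) == 1) {
               if (byte == '\n') break;   /* end of this line */
               /* if (byte == ' ')   etc [ character by character processing ] */
               line[c] = byte; c++;
          }
          line[c] = '\0';  /* must */
          return (int)c;   /* or some more informative value, eg, the type of line */
    }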
    I called this "fileline_base" because it is possible you will want to have different functions to process different kinds of lines *if* you can predict what kind of line is coming next based on the file structure. Otherwise, you could use fileline_base to get the line, but use the character by character processing to identify the "kind" of line, and pass it on to (eg) fileline_timestamp, which might break it up and put appropriate details into a struct.

    There are a lot of useful things that working byte by byte makes possible. You could use fgets and then process the line the same way in a loop, but doing it on the initial read saves a second pass over the line.

    You could also use fgets() with sscanf(). That would allow you to distinguish one kind of line from another -- if the first sscanf fails, move on to the next one. That will be much less prone to error (and more flexible) than if you just fscanf the file.
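
    A rough sketch of that "if the first sscanf fails, move on to the next one" idea, applied to a line that fgets has already read. The two line formats here (a pair of numbers and a lone integer) and the names line_kind and classify_line are assumptions for illustration; the real file layout is not given in this thread.

```c
#include <stdio.h>

/* Hypothetical line categories, not taken from the actual data file. */
enum line_kind { LINE_UNKNOWN, LINE_PAIR, LINE_INT };

/* Classify one line by trying sscanf patterns in turn. */
enum line_kind classify_line(const char *line)
{
    double a, b;
    int n;

    /* Try the more specific pattern first: two numbers on the line. */
    if (sscanf(line, "%lf %lf", &a, &b) == 2)
        return LINE_PAIR;

    /* Otherwise, does the line at least start with an integer?
       This does not check what follows the integer. */
    if (sscanf(line, "%d", &n) == 1)
        return LINE_INT;

    return LINE_UNKNOWN;
}
```

    The order of the attempts matters: if the integer test ran first, a line such as "1 2" would be taken for an integer line before the pair pattern was ever tried.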
    Last edited by MK27; 06-09-2009 at 08:02 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  4. #4
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    Is there something I have to link at compile time to include sscanf? I #included <stdio.h> but I get an error saying:

    /tmp/ccuKnQDb.o: In function `countTimeSteps':
    chargestate.c:(.text+0x33): undefined reference to `sccanf'
    collect2: ld returned 1 exit status

    EDIT: LOL, a sad face in an error message


    Also, will this work? I am unsure of my error-checking:

    Code:
    int countTimeSteps(FILE *input)
    {
      char *line;
      int check,x,N_timeSteps;
      N_timeSteps = 0;
      
      while (fgets(line, MAXNUM, input) != NULL)
      {
        check = sccanf(input,"%d",&x);
        if (check != 0 && check != EOF)
        {
          N_timeSteps++;
        }
      }
      return N_timeSteps;
    }
    Last edited by KBriggs; 06-09-2009 at 08:29 AM.

  5. #5
    Deathray Engineer MacGyver's Avatar
    Join Date
    Mar 2007
    Posts
    3,210
    Spell it correctly: sscanf.

    Remember to check spelling if you get an odd linker error like this.

  6. #6
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    Heh, silly me

    No, it doesn't work - it gives me 0, and I get a warning:

    chargestate.c: In function ‘countTimeSteps’:
    chargestate.c:15: warning: passing argument 1 of ‘sscanf’ from incompatible pointer type


    Which I don't know how to interpret

    EDIT: it is reading in the lines, because adding a puts(line) prints the output file just fine...
    Last edited by KBriggs; 06-09-2009 at 08:33 AM.

  7. #7
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    You are probably after fscanf().

    sscanf() reads from a string (char pointer), and you are passing it a FILE pointer. That's why it says incompatible pointer type.

  8. #8
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    Or... you should be passing it line instead of input. line is the string that was read, input is the file it came from. You've already read the line, so you wouldn't want fscanf reading more lines. sscanf is there to validate the contents of the line you already have.

  9. #9
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Haha yes of course. Sorry about that.

    Didn't read the whole code snippet.

  10. #10
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    oo, perfect. Wrong pointer haha. Works now

  11. #11
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    Actually, I have another question about this. Here is my working code:

    Code:
    int countTimeSteps(FILE *input)
    {
      char line [MAXNUM];
      int check,x,N_timeSteps;
      N_timeSteps = 1; //since there is one for time == 0, start at 1
      fgets(line, MAXNUM, input);
      
      while (fgets(line, MAXNUM, input) != NULL) //read in file 1 line at a time
      {
        //puts(line);
        check = sscanf(line,"%d",&x); //x is useless, only there to make sscanf stop whining about arguments
        if (check != 0 && check != EOF) //if the line contains an integer value at the start
        {
          ++N_timeSteps;
        }
      }
      return N_timeSteps;
    }
    My question is this:

    Right now, the code will count any line that starts with an integer. Is there a way to specify that it count only lines that contain ONLY a single integer, rather than lines that simply begin with one? For the current case it doesn't matter, but it would help make the code a little more robust.

  12. #12
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by KBriggs View Post
    Right now, the code will count any line that starts with an integer. Is there a way to specify that it count only lines that contain ONLY a single integer, rather than lines that simply begin with one? For the current case it doesn't matter, but it would help make the code a little more robust.
    By "integer" you mean one number (eg, 234123) and not a single digit? Because a single digit will be simple.

    Otherwise, depending on what else could be in the string, you could add a second parameter to sscanf and check the number of items returned.
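
    One way to read the "second parameter" suggestion, as a sketch: ask sscanf for an integer followed by one more non-whitespace character, and accept the line only if exactly one item was converted. The name is_only_int is made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if line holds one integer and nothing
   else apart from whitespace (including the '\n' left by fgets). */
int is_only_int(const char *line)
{
    int value;
    char extra;

    /* The space before %c skips any whitespace after the number.
       A result of 1 means the integer was read and nothing followed;
       2 means some other character came after it. */
    return sscanf(line, "%d %c", &value, &extra) == 1;
}
```

    With this test "42\n" is accepted, while "42 abc\n", "3.5\n" and "12 34\n" are rejected because %c finds something to read.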


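    BTW, you can use
    Code:
    sscanf(line, "%*d");
    The * means ignore, so you don't need x. Be aware that a suppressed conversion is not counted in sscanf's return value, so the check != 0 test will no longer tell you whether a number was found. Since you are not keeping the number, you could also test this mo' better:
    Code:
    if ((line[0] >= '0') && (line[0] <= '9'))
    The character codes of the digits '0' through '9' are guaranteed to be sequential. This only looks at the first character, so it will miss a number with a leading minus sign or leading spaces.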

  13. #13
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    If you want to make sure there's nothing else, you could, instead of sscanf, use strtol -- strtol gives you a pointer to "what's left over".

  14. #14
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    Looking at strtol in the documentation, I really don't understand how it would be used

    Got a one-liner that encapsulates the necessary test for me? :P

  15. #15
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    Code:
    long strtol(const char* s, char** endp, int base);
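    You pass your line in as s, pass the address of a char * in as endp, and the base of the number in as base (most likely 10).

    The function will return the value of that number, and *endp will point at the first character after the number. If that character is the end of the string (a '\0', or the '\n' that fgets leaves in), there was no data in your string IN ADDITION to the number. If *endp is equal to s, there was no number at all. Make sense?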
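
    As a hedged sketch of how the strtol test might look in practice: strtol stores in its second argument a pointer to the first character it did not convert, so the line is a lone integer if only whitespace remains from there. The function name line_is_integer and the choice to reject out-of-range values are assumptions, not something settled in this thread.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if line contains exactly one base-10
   integer, optionally surrounded by whitespace, and nothing else. */
int line_is_integer(const char *line)
{
    char *end;

    errno = 0;
    (void)strtol(line, &end, 10);   /* the value itself is not needed */

    if (end == line)                /* nothing was converted */
        return 0;
    if (errno == ERANGE)            /* number does not fit in a long */
        return 0;

    /* Skip trailing whitespace, such as the '\n' kept by fgets. */
    while (isspace((unsigned char)*end))
        end++;

    return *end == '\0';            /* anything left means extra data */
}
```

    In the counting loop, the sscanf check could then be swapped for if (line_is_integer(line)) ++N_timeSteps;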
