Thread: Last line of a data file not wanted to be read in and the while statement

  1. #1
    Registered User
    Join Date
    Feb 2010
    Posts
    84

    Last line of a data file not wanted to be read in and the while statement

    OK, I am using a while statement to read a data file. Caveat: the last line of the data file is useless; it contains just an eccentric time stamp and the file name, to signify that it's the last record.

    My initial approach was to use this before the while loop:
    Code:
    flag = fseek(ifp, fpi, SEEK_END);
    
    elfi = ftell(ifp);
    Then I read in the last line as a string and save it with strcpy. I can then compare each line against it in the while loop, so that when the loop reaches this line the strings match and it stops, without reading the line in as data. The problem is that when I set the file position relative to the end, the position is offset by some bytes from where I expect, and I don't think it's quite what I want. The loop therefore still reads the last line, which breaks my statement that reads in the data.

    Any efficient ideas for solving this problem?

  2. #2
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Just read in the last line, and see what it always has that marks it. Maybe a time stamp has a colon, which your good data never has, or an AM/a.m. or PM/p.m. group of chars.
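    A minimal sketch of that marker check, assuming the colon really never shows up in good data. The helper names looks_like_trailer and copy_until_trailer and the 256-byte line buffer are made up for illustration, not taken from the original program:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the line looks like the trailer record.
   Assumption: a ':' (from the time stamp) never appears in good data. */
static int looks_like_trailer(const char *line)
{
    return strchr(line, ':') != NULL;
}

/* Copies lines from in to out until the trailer (or EOF) is reached.
   Returns the number of data lines copied. */
static int copy_until_trailer(FILE *in, FILE *out)
{
    char line[256];   /* assumed to be long enough for one line */
    int copied = 0;

    while (fgets(line, sizeof line, in)) {
        if (looks_like_trailer(line))
            break;    /* stop without processing the trailer */
        fputs(line, out);
        copied++;
    }
    return copied;
}
```

    Replace the fputs call with your own parsing. If a colon can appear in real data, test for something more specific to the trailer, such as the file name.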

    You could also keep a "look ahead" buffer of, say, 3 lines of data. When you reach EOF, just count down and use only the first 2 buffered lines. So maybe:

    Read 3 lines of data (the number can vary, nothing is set in concrete about that). You'll need a 2D array such as char tenLines[3][85], of course.

    Then do the normal processing of your data, including getting a new line to keep 3 lines ahead of the data you're processing. After every line you process, you memcpy or strcpy line 1 to line 0 and line 2 to line 1, then get a new line for line 2.

    It's almost harder to explain it than to do it. Have fun!
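    The look-ahead idea above could be sketched roughly like this. The function name copy_all_but_last and the ROWS and LINE_LEN constants are illustrative assumptions, and a line longer than LINE_LEN - 1 characters would be split into several pieces by fgets, so size the rows for your data:

```c
#include <stdio.h>
#include <string.h>

#define ROWS 3
#define LINE_LEN 85

/* Writes every line of in except the final one to out.
   Returns the number of lines written. */
static int copy_all_but_last(FILE *in, FILE *out)
{
    char lines[ROWS][LINE_LEN];
    int filled = 0;
    int written = 0;
    int i;

    /* prime the look-ahead window with up to ROWS lines */
    while (filled < ROWS && fgets(lines[filled], LINE_LEN, in))
        filled++;

    /* a full window proves that row 0 is not the last line */
    while (filled == ROWS) {
        fputs(lines[0], out);
        written++;
        strcpy(lines[0], lines[1]);          /* shift the queue forward */
        strcpy(lines[1], lines[2]);
        if (!fgets(lines[2], LINE_LEN, in))  /* refill the back row */
            filled--;                        /* EOF: the window shrinks */
    }

    /* rows 0..filled-1 hold the tail of the file; skip the very last one */
    for (i = 0; i + 1 < filled; i++) {
        fputs(lines[i], out);
        written++;
    }
    return written;
}
```

    With only one useless trailer line, a window of 2 rows would be enough; 3 rows just follows the description above.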

  3. #3
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Quote Originally Posted by Adak View Post
    Just read in the last line, and see what it always has that marks it. Maybe a time stamp has a colon, which your good data never has, or an AM/a.m. or PM/p.m. group of chars.

    You could also keep a "look ahead" buffer of, say, 3 lines of data. When you reach EOF, just count down and use only the first 2 buffered lines. So maybe:

    Read 3 lines of data (the number can vary, nothing is set in concrete about that). You'll need a 2D array such as char tenLines[3][85], of course.

    Then do the normal processing of your data, including getting a new line to keep 3 lines ahead of the data you're processing. After every line you process, you memcpy or strcpy line 1 to line 0 and line 2 to line 1, then get a new line for line 2.

    It's almost harder to explain it than to do it. Have fun!
    Why 3 lines?

    2 buffers
    Code:
    read to buffer 0
    current = 0;
    while(fgets(to buffer[1-current]))
    {
       process buffer[current]
    
       current = 1-current;
    }
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  4. #4
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    I'm cutting it down, Vart!!

    My first look ahead buffer was 10 lines! No, I'm not kidding.

  5. #5
    Registered User
    Join Date
    Feb 2010
    Posts
    84
    Quote Originally Posted by vart View Post
    Why 3 lines?

    2 buffers
    Code:
    read to buffer 0
    current = 0;
    while(fgets(to buffer[1-current]))
    {
       process buffer[current]
    
       current = 1-current;
    }
    Just trying to understand your code...

    What is "read to buffer 0"?

    I understand that current is a line counter and "process buffer[current]" is where I read in that line and parse the data (correct?). Will current ever be negative here? If current starts at 0, then in the while loop it becomes current = 1 - 0?

  6. #6
    Ultraviolence Connoisseur
    Join Date
    Mar 2004
    Posts
    555
    Code:
    flag = fseek(ifp, fpi, SEEK_END);
    
    elfi = ftell(ifp);
    Here is an example that illustrates what you mentioned in the first post; it is indeed possible and quite easy:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    int main(int argc, char **argv)
    {
        FILE *fp = NULL;
        char buf[BUFSIZ];
        long pos = 0;
    
        if (argc < 2)
            return EXIT_FAILURE;
    
        /* process each file named on the command line, last argument first */
        for (; argc > 1; --argc) {
            if (!(fp = fopen(argv[argc-1], "r")))
                continue;
            /* find end position; skip the file if it cannot be measured */
            if (fseek(fp, 0L, SEEK_END) != 0 || (pos = ftell(fp)) < 0) {
                fclose(fp);
                continue;
            }
            /* rewind for reading */
            rewind(fp);
            while (fgets(buf, sizeof buf, fp)) {
                /* once the last line has been read, the position equals the end position */
                if (ftell(fp) == pos) {
                    /* here you could put a continue for example, to skip outputting the last line */
                    printf("############## JUST BEFORE LAST LINE #################\n");
                }
                printf("%s", buf);
            }
            fclose(fp);
        }
    
        return 0;
    }
    I honestly don't know how fast/slow it is to seek to end and then rewind, so perhaps the double buffer idea is smarter. Perhaps an exercise would be to implement both methods and do some benchmarking.
    Last edited by nonpuz; 03-01-2010 at 07:56 PM. Reason: added comment to code

  7. #7
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote Originally Posted by towed View Post
    Just trying to understand your code...

    What is "read to buffer 0"?

    I understand that current is a line counter and "process buffer[current]" is where I read in that line and parse the data (correct?). Will current ever be negative here? If current starts at 0, then in the while loop it becomes current = 1 - 0?
    char buffer[2][96] is a two-dimensional array of type char, with rows and columns.

    So "read to buffer 0" means reading one line of data into the first row: fgets(buffer[0], sizeof buffer[0], filePointer);

    current is not a counter, but an index into the array. It won't be negative, since arrays have no negative indices. It only ever flips between 0 and 1: 1 - 0 is 1, and 1 - 1 is 0.

    It's a simple idea. You're queuing up the data in a line, like the line for a movie ticket. First a new line of data goes into the "back of the line" buffer row. If that read succeeds, the row at the front of the line is obviously not the end of the file, so you process it. After it's processed, the line moves forward one position, and you get a new line of data to go into the back row.
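    Here is a rough sketch of that queue with two rows and the flipping index, in case it helps. The function name process_all_but_last and the LINE_LEN constant are made up for illustration, and fputs stands in for whatever parsing you do:

```c
#include <stdio.h>

#define LINE_LEN 96

/* Writes every line of in except the final one to out.
   Returns the number of lines processed. */
static int process_all_but_last(FILE *in, FILE *out)
{
    char buffer[2][LINE_LEN];
    int current = 0;      /* index of the row at the front of the queue */
    int processed = 0;

    /* "read to buffer 0": prime the front of the queue */
    if (!fgets(buffer[0], LINE_LEN, in))
        return 0;         /* empty file */

    /* a successful read into the other row proves that
       buffer[current] is not the last line of the file */
    while (fgets(buffer[1 - current], LINE_LEN, in)) {
        fputs(buffer[current], out);   /* process the front row */
        processed++;
        current = 1 - current;         /* flips 0 -> 1 -> 0 -> ... */
    }

    /* buffer[current] now holds the final line, which is dropped on purpose */
    return processed;
}
```

    Flipping the index avoids copying rows around; the trailer line is the one left in buffer[current] when fgets finally fails.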

    I can't say which will be faster. Ideally, the buffer length will suit the architecture of your hardware, OS, and compiler. Multiples of 32 or 64 are generally a reasonable choice.

    If you want to do your program this way, just post some code and we'll sort it out. It is a simple idea, and worth learning (as is the ftell() method, btw).
