Thread: EOF help

  1. #1
    Registered User
    Join Date
    Apr 2010
    Posts
    20

    EOF help

    Hi all.
    I am trying to read some data from a data file. However, I need a way to count the number of lines read from the data file before EOF is reached, as I then need to make the two arrays in my programme that size.
    Would someone be able to tell me how I calculate the number of lines read before EOF is reached please? Many thanks for reading this!

  2. #2
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    If you are looking for a magic way that does not involve reading the file (or the data from it) twice: not really.

    However, you could get the file size after you open it (but before you read it) with fstat(), then allocate a single char array to hold all of it, read it into that buffer, count the newlines in the string, allocate your array, and copy into that from the buffer.

    In fact, you don't even need to do the last part: you could instead make your array an array of pointers, set them to point into the buffer at the first char after each newline, then change all the newlines to '\0' (if you don't care about losing the newlines).
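A rough sketch of the pointers-into-the-buffer idea, assuming the whole file has already been read into a writable buffer of len + 1 bytes (the fstat() and fread() part is left out). The name split_lines is made up for this illustration, and the sketch assumes the data contains no embedded '\0' bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split buf (len bytes of data, plus one spare writable byte at buf[len])
 * into lines in place. Returns a malloc'd array of pointers into buf and
 * stores the number of lines in *count. Returns NULL if malloc fails. */
char **split_lines(char *buf, size_t len, size_t *count)
{
    assert(buf != NULL && count != NULL);
    buf[len] = '\0';                      /* terminate the last line */

    /* Count newlines, plus one for a last line without a newline. */
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\n')
            n++;
    if (len > 0 && buf[len - 1] != '\n')
        n++;

    char **lines = malloc((n ? n : 1) * sizeof *lines);
    if (lines == NULL)
        return NULL;

    /* Record the start of each line and overwrite each newline with '\0'. */
    size_t k = 0;
    char *start = buf;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            buf[i] = '\0';
            lines[k++] = start;
            start = buf + i + 1;
        }
    }
    if (start < buf + len)                /* unterminated last line */
        lines[k++] = start;

    *count = k;
    return lines;
}
```

Only the pointer array is allocated separately; the line text stays in the original buffer, so that buffer has to outlive the array.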
    Last edited by MK27; 04-13-2010 at 02:16 PM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  3. #3
    Registered User
    Join Date
    Apr 2010
    Posts
    20
    Yea I was planning on reading the file once to determine the number of lines before the EOF was reached, then allocating this value for the number of lines as the size of my 2 arrays, then reading the file again to do all the calculations on the data. Would this work then?

  4. #4
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote: Originally Posted by Fixxxer
    Yea I was planning on reading the file once to determine the number of lines before the EOF was reached, then allocating this value for the number of lines as the size of my 2 arrays, then reading the file again to do all the calculations on the data. Would this work then?
    Yeah. Watch for the "fencepost error" here. Example: How many fenceposts do you need for a fence 100' long with a post every ten feet?

    Eleven, of course. So make sure you account for the last line if it does not have a newline on the end (which is possible).
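One way to handle that fencepost case, as a sketch only: count newline characters, then add one if the last character read was not a newline. The function name count_lines is just for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Count the lines in an open stream, including a final line that has
 * no trailing newline. */
long count_lines(FILE *fp)
{
    assert(fp != NULL);
    long lines = 0;
    int c;
    int last = '\n';   /* so that an empty file counts as zero lines */

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')  /* the extra fencepost: unterminated last line */
        lines++;
    return lines;
}
```

Note that c is an int, not a char, so that EOF can be told apart from every valid character value.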

  5. #5
    Registered User
    Join Date
    Apr 2010
    Posts
    20
    This is the bit I struggle with, would this code cycle through the data until the eof was reached?
    Code:
    for(i;i!=EOF;i++)

    and if so would you be able to tell me how to calculate the final value of i?
    Many thanks

  6. #6
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    for is a "counting" loop -- if you knew how many to count, you wouldn't be in this fix in the first place. It would be more natural to use "while" in this situation, IMO.
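To illustrate the "while" suggestion: EOF is something a read function reports through its return value, so the loop has to test that return value rather than the counter. A minimal sketch, assuming the file holds whitespace-separated "x y" number pairs (the name count_pairs is made up here):

```c
#include <assert.h>
#include <stdio.h>

/* Count how many "x y" pairs can be read from the stream. */
int count_pairs(FILE *fp)
{
    assert(fp != NULL);
    double x, y;
    int n = 0;

    /* fscanf returns the number of items it converted. It returns 2 for
     * each complete pair, and EOF (or a smaller count) once the data
     * runs out or is malformed, which ends the loop. */
    while (fscanf(fp, "%lf %lf", &x, &y) == 2)
        n++;
    return n;
}
```

When the loop finishes, n is the "final value of i" being asked about.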

  7. #7
    Registered User
    Join Date
    Apr 2010
    Posts
    20
    ok so
    Code:
    while(i;i!=EOF;i++)

    but I still need to know a way to find out what the value of i is when eof is reached. Anyone able to help me please?

  8. #8
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    If you use fgets(), which is a good choice since it reads up to and including a newline by default, then you don't have to worry about EOF in your code:
    Code:
    char linebuffer[4096];
    FILE *fp = fopen(...
    int lines = 0;
    while(fgets(linebuffer,4096,fp)) lines++;
    After that loop, lines tells you how big your array needs to be. fgets() is also the best way to read into it; that could be a for loop, or the same while loop (nb! you need to rewind(fp), or close and reopen the file, first!!!):
    Code:
    char array[lines][4096];
    int i = 0;
    /* stop at the end of the array or at EOF, whichever comes first */
    while(i < lines && fgets(array[i],4096,fp)) i++;
    fgets() returns a NULL pointer at EOF. while(condition) tests the "truth" of the condition; a NULL return value counts as false, so the loop ends. I actually would use a for loop here, it's more certain:
    Code:
    int i;
    for (i = 0; i < lines; i++) fgets(array[i],4096,fp);
    Last edited by MK27; 04-13-2010 at 02:41 PM.

  9. #9
    Registered User
    Join Date
    Apr 2010
    Posts
    20
    Cheers mate
    Code:
    char linebuffer[4096];
    what is this linebuffer thing though? And also what is the significance of the 4096 in the brackets?

  10. #10
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote: Originally Posted by Fixxxer
    what is this linebuffer thing though? And also what is the significance of the 4096 in the brackets?
    Because you do need to read into something, even if you are just throwing the data away without using it as we are here. Another purpose for a line buffer of this sort would be if you want to malloc each line in the array, but obviously you don't know how long each line is. So you could use something like:
    Code:
    char buffer[4096];
    char *array[lines];
    int i = 0;
    while(fgets(buffer,4096,fp)) {
        array[i] = malloc(strlen(buffer)+1);
        strcpy(array[i++], buffer);
    }
    If you aren't using malloc yet don't worry.

    4096 is arbitrary (it is a power of 2 and a common memory page size, which is why you see it so often). 4096 bytes is a couple of typed pages of plain Latin-alphabet text, so it is enough for a single line in most contexts; hopefully you will be able to recognize when and where it may not be.

    Hmmm...more fundamentally, here "X" is a positive integer value indicating the size of an array:
    Code:
    char array[X];
    Last edited by MK27; 04-13-2010 at 02:54 PM.

  11. #11
    Registered User
    Join Date
    Apr 2010
    Posts
    20
    ah yea I've used malloc before, as previously the first bit of data in the text file was an integer giving the number of data points in the file, and I was reading this and then creating arrays of this size. However, now I have been told not to do this and instead allocate array size based on the number of lines read from the file before the eof is reached. I'll try your advice and see where I get, cheers!

  12. #12
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Quote: Originally Posted by MK27
    If you use fgets(), which is a good choice since it reads up to and including a newline by default, then you don't have to worry about EOF in your code:
    Code:
    char linebuffer[4096];
    FILE *fp = fopen(...
    int lines = 0;
    while(fgets(linebuffer,4096,fp)) lines++;
    I would add
    Code:
    /* needs #include <string.h>; linebuffer is the array passed to fgets() */
    if (!strchr(linebuffer, '\n')) {
        printf("No end-of-line detected. Too long for buffer.\n");
        ...; }
    ... just in case the 4096 length was not enough to contain a single line of text.
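Building on that check, here is a sketch of a counter that stays correct when a line does not fit in the buffer: fgets() then delivers the line in several chunks, and only the chunk that contains the newline (or the last chunk before EOF) finishes a line. The function name and the bufsize parameter are made up for this illustration; bufsize is only there so the chunking can be tried out with a tiny buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count lines with fgets(), reading at most bufsize - 1 characters per
 * call. A chunk without '\n' is an unfinished line, not a whole one. */
long count_lines_chunked(FILE *fp, size_t bufsize)
{
    char buf[4096];
    assert(fp != NULL && bufsize >= 2 && bufsize <= sizeof buf);
    long lines = 0;
    int partial = 0;              /* nonzero while in the middle of a line */

    while (fgets(buf, (int)bufsize, fp) != NULL) {
        if (strchr(buf, '\n') != NULL) {
            lines++;              /* this chunk completed a line */
            partial = 0;
        } else {
            partial = 1;          /* line continues in the next chunk */
        }
    }
    if (partial)                  /* last line had no trailing newline */
        lines++;
    return lines;
}
```

With a plain lines++ per fgets() call, an over-long line would be counted once per chunk and the arrays would end up too big.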

  13. #13
    Registered User
    Join Date
    Apr 2010
    Posts
    20
    Cheers for all your help guys, I've now managed to get my programme to successfully count the number of lines in my data file.
    However, I am not sure whether I am using malloc incorrectly or something, as when I come to do the math on the data, which was previously faultless, I now get answers of 0 for everything.
    Code:
    /* Project B */
    #include <stdio.h>
    #include <math.h>
    #include <stdlib.h>
    
    
    int main() {
    	FILE * pinfile;
      int i,n;
      double *x,*y, SUMx, SUMy, SUMxy, SUMxx;
      float a,b,s,p,errora,errorb;
    
      pinfile = fopen("xydata.txt", "r");
      if (pinfile==0) {
    	  printf("error opening file\n");
      return 0;
      }
      char linebuffer[4096];
      int lines = 0;
      while(fgets(linebuffer,4096,pinfile)) lines++;
      n=(lines - 2);
      x = (double *) malloc ((lines - 2)*sizeof(double));
      y = (double *) malloc ((lines - 2)*sizeof(double));
      if(x==NULL){
    	  printf("\n Error on malloc\n");
    	  return 1;
    	}
    	if(y==NULL){
    	  printf("\n Error on malloc\n");
    	  return 1;
    	}
    
    
        SUMx = 0; SUMy = 0; SUMxy = 0; SUMxx = 0;
        for (i=0; i<n; i++) {
        fscanf (pinfile, "%lf %lf", &x[i], &y[i]);
        SUMx = SUMx + (x[i]/exp(2));
        SUMy = SUMy + (y[i]/exp(2));
        SUMxy = SUMxy + ((x[i]*y[i])/exp(2));
        SUMxx = SUMxx + ((x[i]*x[i])/exp(2));
        p=(lines - 2)*(1/exp(2));
    	a=(((SUMy*SUMxx)-(SUMx*SUMxy))/((p*SUMxx)-(SUMx*SUMx)));
    	b=(((p*SUMxy)-(SUMx*SUMy))/((p*SUMxx)-(SUMx*SUMx)));
    	s=(((a+(b*x[i])-y[i])*(a+(b*x[i])-y[i])/exp(1)));
    	errora=sqrt(s/((p*SUMxx)-(SUMx*SUMx)));
    	errorb=sqrt(p/((p*SUMxx)-(SUMx*SUMx)));
    	}
        printf("n has the value %d \n",n);
    	printf("S has the value %f \n",s);
    	printf("The standard deviation of the estimate made of parameter 'A' is %f \n",errora);
    	printf("Additionally the standard deviation of the estimate made of parameter 'B' is %f \n",errorb);
      return 0;
    }
    I can't see where it is going wrong, and it compiles ok, but when executed it returns...
    n has the value of 7
    S has the value 0
    The standard deviation of the estimate made of parameter 'A' is 0
    Additionally the standard deviation of the estimate made of parameter 'B' is nan
    Clearly, as it gives a value of n as 7, it is counting the lines correctly, so I don't understand why I no longer get correct numerical values for the other parameters. I have changed my data in case this was some problem with that but alas no improvement!

  14. #14
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote: Originally Posted by MK27
    Yeah. Watch for the "fencepost error" here. Example: How many fenceposts do you need for a fence 100' long with a post every ten feet?

    Eleven, of course. So make sure you account for the last line if it does not have a newline on the end (which is possible).
    Does it have to be a straight line?
    Code:
    o-o-o-o-o
    |        |
    o-o-o-o-o

    Quzah.
    Hope is the first step on the road to disappointment.

  15. #15
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    You need to rewind() the file so that you can go back to the beginning and read the data -- right now, every single one of your reads fails.
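A sketch of the count, rewind, read sequence, with the fscanf() return value checked so that a failed read cannot go unnoticed. It assumes the file contains nothing but "x y" pairs, so the lines - 2 adjustment from the posted programme is not reproduced; read_pairs is a made-up name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read all "x y" pairs from fp into two newly allocated arrays.
 * Returns the number of pairs stored, or -1 if malloc fails.
 * The caller frees *xs and *ys. */
int read_pairs(FILE *fp, double **xs, double **ys)
{
    assert(fp != NULL && xs != NULL && ys != NULL);
    double x, y;
    int n = 0;

    /* Pass 1: count the pairs. This leaves the stream at end-of-file. */
    while (fscanf(fp, "%lf %lf", &x, &y) == 2)
        n++;

    size_t slots = (n > 0) ? (size_t)n : 1;
    *xs = malloc(slots * sizeof **xs);
    *ys = malloc(slots * sizeof **ys);
    if (*xs == NULL || *ys == NULL) {
        free(*xs);
        free(*ys);
        return -1;
    }

    /* Without this, every read in pass 2 fails immediately at EOF. */
    rewind(fp);

    /* Pass 2: read the data, never writing past the end of the arrays. */
    int i = 0;
    while (i < n && fscanf(fp, "%lf %lf", &(*xs)[i], &(*ys)[i]) == 2)
        i++;
    return i;
}
```

In the posted programme the fscanf() return value is ignored, so the failed reads pass silently and the sums are computed from whatever happens to be in the uninitialised malloc'd arrays.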
