Thread: Reading a file line by line

  1. #1
    Registered User
    Join Date
    Feb 2009
    Posts
    16

    I'm having some trouble with reading a file line by line.

    Code:
    char *error[100];
    int num[256], x = 0;
    FILE *openme;
    
    openme = fopen("test", "r");
    
    while(1)
    {
        error[x] = malloc(sizeof(char)*59);
        if ((fgets(error[x], 58, openme)) == NULL)
            break;

        x++;
    }
    All this does is take in 58 characters from the file, and throw it into a char array. Now, I'm faced with a problem here. How do I get the contents of the file one line at a time? (Where a line is any new "\n").
    Consider this example string,

    hi i am a string of text la la la la la la la la la la la la la la\n

    The code I provided will only get the first 58 characters. How can I capture the entire line without exactly knowing how long or short it is?

  2. #2
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    How can I capture the entire line without exactly knowing how long or short it is?
    You can't; at least, not directly. You've got to do some work yourself.

    The basic idea is to read a lot, and see if there's a newline or EOF. If so, great, you're done. If not, resize your buffer, then repeat this process. There are a number of implementations of functions that do this. GNU has getline(), and I would recommend looking at Richard Heathfield's page on reading lines. He has his own implementation as well as links to others, in addition to a detailed description of the whole problem of line reading in C.
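
    For reference, a minimal sketch of the getline() route mentioned above. It assumes a POSIX.1-2008 system (getline() is not part of ISO C), and the helper name longest_line is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* expose getline() in <stdio.h> */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper: returns the length of the longest line in fp,
   not counting the trailing newline. */
size_t longest_line(FILE *fp)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;      /* current capacity of that buffer */
    ssize_t len;
    size_t longest = 0;

    while ((len = getline(&line, &cap, fp)) != -1) {
        if (len > 0 && line[len - 1] == '\n')
            len--;                       /* ignore the newline itself */
        if ((size_t)len > longest)
            longest = (size_t)len;
    }
    free(line);          /* one free() covers every reallocation */
    return longest;
}
```

    The point of the sketch is that the caller never picks a buffer size: getline() resizes the buffer whenever a line does not fit, and the same buffer is reused for every line.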

  3. #3
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Raskalnikov
    How can I capture the entire line without exactly knowing how long or short it is?
    Use a large buffer and then check the string that fgets() stored in it. If it ends with a newline, you have a whole line (the last line of a file may lack one). If not, there is more to read, and the next fgets() will pick up where the last one left off.

    How you should acquire and process the line really depends on what you want to do with it. If you want every pointer in error[100] to be a complete line, you could do something like this:
    Code:
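    int i = 0;
    char buffer[4096], *error[100];
    /* fgets() takes three arguments: buffer, buffer size, stream */
    while (i < 100 && fgets(buffer, sizeof buffer, openme) != NULL) {
         error[i] = malloc(strlen(buffer) + 1);   /* +1 for the '\0' */
         if (error[i] == NULL)
              break;
         strcpy(error[i], buffer);
         i++;
    }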
    This doesn't account for cases where the line could be longer than that; if that is a possibility you will have to write a slightly more complicated loop possibly with realloc().
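    One possible shape for that more complicated loop, as a sketch rather than a drop-in: keep calling fgets() into a fixed chunk and append each chunk to a growing buffer until a newline shows up or the file ends. The function name read_line and the chunk size of 64 are assumptions for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns one complete line (including its '\n',
   if the file had one) in malloc'd memory, or NULL at end of file or
   on allocation failure. The caller frees the result. */
char *read_line(FILE *fp)
{
    char chunk[64];          /* arbitrary chunk size */
    char *line = NULL;
    size_t len = 0;

    while (fgets(chunk, sizeof chunk, fp) != NULL) {
        size_t n = strlen(chunk);
        char *tmp = realloc(line, len + n + 1);
        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, n + 1);   /* copies the '\0' too */
        len += n;
        if (len > 0 && line[len - 1] == '\n')
            break;                          /* reached the end of the line */
    }
    return line;             /* still NULL if nothing was read */
}
```

    With something like this, filling the array could look like: while (i < 100 && (error[i] = read_line(openme)) != NULL) i++;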

  4. #4
    Registered User
    Join Date
    Jul 2005
    Posts
    21
    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main( void ) {
        char string[30];
        
        FILE *fileptr = fopen("example.txt","r");
        
        while( feof(fileptr) != EOF ) {
            if( fgets(string, 30, fileptr) == NULL ) {
                return 0;
            }
            printf("%s", string);
        }
        
        return 0;
    }
    Example file:
    Code:
    hi there my name is sally and i'm super duper! I
    really enjoy macaroni.
    Output:
    Code:
    hi there my name is sally and i'm super duper! I
    really enjoy macaroni.
    Sorry for the edit; if you'd only like to detect the end of the line, you could do something like this within the loop.
    Code:
    if( string[strlen(string)] == '\0' ) {
    	printf(" Newline\n");
    }
    Last edited by saeculum; 03-18-2009 at 06:32 PM.

  5. #5
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    Code:
    while( feof(fileptr) != EOF ) {
        if( fgets(string, 30, fileptr) == NULL ) {
            return 0;
        }
        printf("%s", string);
    }
    feof() does not return EOF; or, at least, it is specified as returning zero or non-zero, so it makes no sense to compare against EOF. In addition, using feof() to control a loop is usually wrong. The following loop is not a general drop-in replacement for the one above, since the original's "return 0" leaves the whole function. That is a fairly special case, though, and would usually be a "break", so as an improved loop:
    Code:
    while(fgets(string, sizeof string, fileptr) != NULL)
    {
      printf("%s", string);
    }
    Next:
    Code:
    if( string[strlen(string)] == '\0' ) {
            printf(" Newline\n");
    }
    string[strlen(string)] is always 0 (which is the same as '\0'), or undefined. The above test is therefore not useful. strlen() gives the length of a string, minus its terminating null character (which all strings must have). So:
    Code:
    char s[] = "hi";
    size_t n = strlen(s); /* This is 2 */
    /* s[0] is 'h', s[1] is 'i', and s[2] is 0 */
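
    If the goal is to tell whether fgets() delivered the end of a line, the character to look at is the one just before the terminator. A small sketch, with ends_with_newline as a made-up helper name:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if s ends in '\n', i.e. fgets() reached
   the end of the line rather than running out of buffer space. */
bool ends_with_newline(const char *s)
{
    size_t n = strlen(s);
    return n > 0 && s[n - 1] == '\n';   /* n > 0 guards the empty string */
}
```

    A false result means either that the line was longer than the buffer or that the file ended without a final newline.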

  6. #6
    Registered User
    Join Date
    Jul 2005
    Posts
    21
    Quote Originally Posted by cas
    feof() does not return EOF; or, at least, it is specified as returning zero or non-zero, so it makes no sense to compare against EOF
    EOF is defined as -1 (in stdio.h), which is nonzero and aids in the reading of the code.

    Quote Originally Posted by cas
    string[strlen(string)] is always 0 (which is the same as '\0'), or undefined.
    That's true; apologies, it was added in a rush. The use of feof is fine there, though removing it and following your example is more efficient. The loop now looks like this:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main( void ) {
    	short maxSize = 30;
    	char string[maxSize];
    	
    	FILE *fileptr = fopen("example.txt","r");
    	
    	while( fgets(string, maxSize, fileptr) != NULL ) {
    		printf("%s", string);
    		if( strlen(string) != maxSize - 1 ) {
    			printf(" Newline\n");
    		}
    		
    	}
    	
    	return 0;
    }
    Note that the size of the string array is set in a variable, and that the printf when the end of a line is detected isn't required. Blank lines and the end of the file also meet the condition of the newline if statement.

    Apologies to Raskalnikov for the mess.
    Last edited by saeculum; 03-18-2009 at 07:43 PM.

  7. #7
    Registered User
    Join Date
    Sep 2007
    Posts
    1,012
    EOF is defined as -1 (in stdio.h), which is nonzero and aids in the reading of the code.
    Well, EOF can actually be any negative int. But that's neither here nor there. The problem is that these two expressions are not the same:
    Code:
    !feof(fp);
    feof(fp) != EOF;
    The first yields 1 (i.e. true) if fp's end-of-file indicator is not set, and 0 (false) if it is.
    The second compares the return value of feof() to the macro EOF, which, for argument's sake, we'll call -1 (which is a common value for it).

    Here's the problem: feof() is defined to return non-zero on end-of-file. That is, any non-zero value is fine. So feof() might be implemented like:
    Code:
    #define __sfeof(p)      (((p)->_flags & __SEOF) != 0)
    int
    feof(FILE *fp)
    {
            int     ret;
    
            FLOCKFILE(fp);
            ret= __sfeof(fp);
            FUNLOCKFILE(fp);
            return (ret);
    }
    This is how FreeBSD implements feof(). Notice how it returns a value that was calculated with the != operator. The != operator yields either 1 or 0, so as you can see, FreeBSD's feof() returns either 1 or 0, which is perfectly conforming.

    Now see the problem with comparing against EOF? feof() will never equal EOF! A loop that tests feof(fp) != EOF will never terminate with this implementation of feof(). feof() returns a boolean value and should be treated as such.
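
    To make the boolean usage concrete, here is a sketch that lets fgets() control the loop and consults feof() only afterwards, to find out why reading stopped. The function name count_chunks is hypothetical, and the 30-byte buffer is borrowed from the earlier posts.

```c
#include <stdio.h>

/* Hypothetical helper: counts how many fgets() calls succeed on fp.
   Returns -1 if reading stopped because of an error rather than
   end-of-file. */
long count_chunks(FILE *fp)
{
    char buf[30];
    long chunks = 0;

    while (fgets(buf, sizeof buf, fp) != NULL)
        chunks++;

    /* feof() is only meaningful after a read has failed. Treat the
       result as true/false; never compare it with EOF. */
    if (!feof(fp))
        return -1;       /* fgets() failed for some other reason */
    return chunks;
}
```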

  8. #8
    Hurry Slowly vart
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Moreover, feof() should not be used to control a loop at all; the FAQ describes why.

  9. #9
    Resu Deretsiger Nightowl
    Join Date
    Nov 2008
    Location
    /dev/null
    Posts
    186
    To expand on that eloquent post . . . here and here.
