Thread: Help with while loops, EOF. (Have read FAQ)

  1. #1
    Registered User
    Join Date
    Apr 2010
    Posts
    88

    Help with while loops, EOF. (Have read FAQ)

    Hi there. I've been programming in C for a few months, and I've hit a snag with EOF and fgetc. I read Hammer's FAQ article about why it's bad to use feof() in a while loop. I tried it, and like he said the last line of output was repeated. Now that I've adjusted my while loop to use a value determined by fgetc, I'm still having the same problem.

    My program is meant to print the first word of each line of a given .txt document. I program in Ubuntu, and I'm using the redirection operator "<" to redirect my text file into the stdin stream of my program.

    My code is short:

    Code:
    int main (int argc, char * argv[] ){
    
      char temp[20];
      char holder, holder2;
    
      fscanf(stdin, "%s", temp);
      printf("%s\n", temp);
      holder = fgetc(stdin);
    
      while(holder != EOF){
        if(holder == '\n') {
          fscanf(stdin, "%s", temp);
          printf("%s\n", temp);
        }
        holder = fgetc(stdin);  
      }
    
      return 1;
    }
    My text file is simply:


    Here is some text
    arranged in a strange way
    on
    each line
    END.


    The output I receive is:

    Here
    arranged
    on
    each
    END.
    END.

    I understand this may be trivial to many of you. I'm not so much looking for code, but more so a clarification of how the EOF is handled; I'd like to figure out which subtlety has been driving me bonkers for the last two hours.

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Presumably you have \n after "END." in your file, so that after that word is scanned and printed, your fgetc gets that \n, the loop continues, and the fscanf fails since that \n was the last bit in the file.

  3. #3
    Registered User
    Join Date
    Apr 2010
    Posts
    88
    I have tried making sure that there is no extra '\n' after the "END." to no avail.

    Is it possible that a newline is inserted at the end of every file in Linux? If so, what can I do to ignore the final \n?

  4. #4
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Originally posted by Ocifer:
    Is it possible that a newline is inserted at the end of every file in linux?
    No. Try inserting this to see what that character is:
    Code:
      while(holder != EOF){
        if(holder == '\n') {
          fscanf(stdin, "%s", temp);
          printf("%s\n", temp);
        }
        holder = fgetc(stdin);  
        printf("->%d"<-",holder);
      }
    This will print the ASCII value of holder after each read. I'm guessing it will be 0.

    See the ASCII table if you are unfamiliar with it:
    ASCII Table / Extended ASCII Codes
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  5. #5
    Registered User
    Join Date
    Apr 2010
    Posts
    88
    I'm sorry but the code you provided did not compile, I received a number of error messages. What do the -><- do?

  6. #6
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    He's got an extra quote mark in there, which you should remove.

    (What it will do is give you output like so:
    Code:
    $ ./ociter < txtfile
    Here
    ->105<-->115<-->32<-->115<-->111<-->109<-->101<-->32<-->116<-->101<-->120<-->116<-->10<-arranged
    ->32<-->105<-->110<-->32<-->97<-->32<-->115<-->116<-->114<-->97<-->110<-->103<-->101<-->32<-->119<-->97<-->121<-->10<-on
    ->10<-each
    ->32<-->108<-->105<-->110<-->101<-->10<-END.
    ->10<-END.
    ->-1<-
    So that 10 (==\n) is still there, somehow.)

    EDIT: The workaround is obviously to check the return value of fscanf.
    Last edited by tabstop; 04-11-2010 at 03:50 PM.

  7. #7
    Registered User
    Join Date
    Apr 2010
    Posts
    88
    Upon removing the extra quotation mark:

    The last two printed values for the variable "holder" were ->10<- , corresponding to line feed.

  8. #8
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Originally posted by Ocifer:
    I'm sorry but the code you provided did not compile, I received a number of error messages. What do the -><- do?
    Sorry, whoops, there was an extra " in there (too much glue sniffing again today). The line should read:
    Code:
    printf("->%d<-",holder);
    You may want to combine this with tabstop's point:
    Code:
      int r = 0;  /* initialized so the printf below never reads an indeterminate value */
      while(holder != EOF){
        if(holder == '\n') {
          r = fscanf(stdin, "%s", temp);
          printf("%s\n", temp);
        }
        holder = fgetc(stdin);  
        printf("->r: %d h: %d<-", r, holder);
      }
    Last edited by MK27; 04-11-2010 at 03:58 PM.

  9. #9
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Originally posted by Ocifer:
    Upon removing the extra quotation mark:
    The last two printed values for the variable "holder" were ->10<- , corresponding to line feed.
    Hmmm. Could be the shell does this with redirection but I doubt it, will check...

  10. #10
    Registered User
    Join Date
    Apr 2010
    Posts
    88
    Okay, I've managed to write a workaround, using an integer variable "check" to hold the return value of fscanf. I've noticed that the last time fscanf tries to scan information from the text file, the value of check is -1. I'm assuming this denotes an error, or EOF, or something similar.

    My revised code looks like this and it works as desired:

    Code:
    int main (int argc, char * argv[] ){
    
      int check;
      char temp[20];
      char holder;
    
      fscanf(stdin, "%s", temp);
      printf("%s\n", temp);
      holder = fgetc(stdin);
    
      while(holder != EOF){
        if(holder == '\n') {
          check = fscanf(stdin, "%s", temp);
          if(check <= 0)
            break;
          printf("%s\n", temp);
        }
        holder = fgetc(stdin);  
      }
    
      return 1;
    }
    Now that my program works, I'm happy. However, I'd still like to figure out exactly what goes on at the end of a file. I can imagine that as I come to write longer programs, which may use much longer files, workarounds like this will become trickier.

    Can anyone explain why an LF character appears at the end? I have not transferred the file between operating systems like Windows and Linux. The .txt file was created and used solely in Ubuntu.

  11. #11
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Originally posted by Ocifer:
    However, I'd still like to figure out exactly what goes on with the end of a file.
    No, the problem is you did not get rid of the final newline. How do you know it isn't there? Create a one-character text file in the editor you are using, with no line return, and check the file size. Bet you it's 2 bytes, not 1.

    The easiest way to demonstrate is to create a 1 byte test file:
    Code:
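    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>   /* write(), close() */
    
    int main(void) {
    	/* O_CREAT needs a third argument giving the new file's permissions */
    	int fd = open("test.file",O_CREAT|O_WRONLY|O_TRUNC,0644);
    	char b[] = "A";
    	if (fd == -1) return 1;   /* could not create the file */
    	write(fd,b,1);
    	close(fd);
    
    	return 0;
    }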
    Check the file size. It should be 1. Now compile this:

    Code:
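    #include <stdio.h>
    #include <unistd.h>   /* read() */
    
    int main(void) {
    	char byte;
    	/* read() returns 0 at end of file and -1 on error, so loop only while it is positive */
    	while (read(0,&byte,1) > 0) printf("%d ",byte);
    
    	return 0;
    }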
    and try:
    ./a.out < test.file

    I get "65" and that's it. No newline.

  12. #12
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Try the following code:
    Code:
    #include <stdio.h>
    
    int main(void) {
        char string[] = "Here is some text\narranged in a strange way\non\neach line\nEND.";
        FILE *out;
        out = fopen("txtfile", "w");
        if (out == NULL) {   /* stop if the file cannot be created */
            perror("txtfile");
            return 1;
        }
        fprintf(out, "%s", string);
        fclose(out);
        return 0;
    }
    This will definitely write out a file with no \n at the end. Try that with your original program and good things might happen. (gedit is apparently known to add \n at the end of every file; it appears that we can add vim to that list, at least with default settings, since that's what I was using.) (Apparently if I had typed :help eol in vim I would have seen that.)
    Last edited by tabstop; 04-11-2010 at 04:29 PM.

  13. #13
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Originally posted by tabstop:
    it appears that we can add vim to that list, at least with default settings, since that's what I was using.)
    Yeah! I was certain that was not the case but just noticed the same thing, vim adds a \n.

  14. #14
    Registered User
    Join Date
    Apr 2010
    Posts
    88
    I was using gedit, so that makes sense.

  15. #15
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    fgetc returns an int, not a char.
