Thread: Never find EOF after improperly formatted input

  1. #1
    Registered User
    Join Date
    Sep 2008
    Posts
    3

    Here's a super-basic C question (this seemed like the most apt place to ask, sorry if there's a better spot). I've got this code, which is derived largely from the fscanf reference at cplusplus.com. All it does is read a file called coords.txt, which should contain a listing of coordinate pairs. Then it just says how many are there.

    Code:
    /* fscanf example */
    #include <stdio.h>
    
    int main ()
    {
      float xtemp, ytemp;
      FILE *pFile;
      int n;
    
      // Open the file with the coordinates and check that it opened OK.
      pFile = fopen("coords.txt","r");
      if(pFile == NULL)
      {
        printf("The file coords.txt does not exist.\n");
        printf("Please create this file before running the program again.\n");
    
        return 1;
      }
    
      // Now read formatted pairs until you hit the end of file
      n = 0;
      while(fscanf(pFile, "%f %f", &xtemp, &ytemp) != EOF)
      {
        n++;
        printf("Line %i \n", n);
      }
    
      printf("There are %i coordinate pairs.\n", n);
    
      return 0;
    }
    So, if you give it a coords.txt file that looks like this:
    Code:
    4.234 -23.43
    0.111 0.222
    -3 4
    Everything's fine. The output is as expected, and the program exits normally:

    Code:
    Line 1 
    Line 2 
    Line 3 
    There are 3 coordinate pairs.
    Now, if I remember correctly, in Fortran, you can read from a file in a similar way, and it doesn't care if the values are separated by a newline, a space, multiple spaces, a tab, or a comma. So, if you add a comma to coords.txt so it reads

    Code:
    4.234 -23.43
    0.111 0.222
    -3, 4
    Now the program freaks out when you run it. It never stops, it just keeps outputting:

    Code:
    Line 1 
    Line 2 
    Line 3 
    Line 4
    Line 5
    and so on until you kill it.

    So obviously this doesn't work like Fortran. I understand that, but I have a couple questions. First, my only explanation is that if fscanf fails to read properly formatted data, it never proceeds further through the file, and thus never hits the EOF. Is that correct? If not, why else would this happen? Beyond that, is there a simple way to make it tolerate comma-separation in the data file? It's not critical to what I'm doing, but I like to know these things.

    I'm trying to keep the code as simple as possible, so I didn't want to use filestreams or anything. I'd like to stick to the most basic functions here... This isn't meant to be great code, it's intended for learning this stuff.

    Thanks!

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    The last "f" in "fscanf" stands for formatted -- if you don't obey the format (or, I suppose, the format doesn't obey the file) then you should expect general bad things. And yes, fscanf is designed to get stuck -- if something doesn't match the format, it keeps the bad input for next time.

  3. #3
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > while(fscanf(pFile, "%f %f", &xtemp, &ytemp) != EOF)
    Check for the one successful result rather than for one of several possible failures (EOF, 0 and 1 are all non-answers for this call).

    E.g.
    while(fscanf(pFile, "%f %f", &xtemp, &ytemp) == 2)

    Better yet, decouple input from conversion with
    Code:
    char buff[BUFSIZ];
    while ( fgets( buff, sizeof buff, pFile ) != NULL ) {
        if ( sscanf(buff, "%f %f", &xtemp, &ytemp) == 2 ) {
        } else
        if ( sscanf(buff, "%f, %f", &xtemp, &ytemp) == 2 ) { /* try the comma variant */
        } else {
            // maybe it's just rubbish
        }
    }
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    Registered User
    Join Date
    Sep 2008
    Posts
    3
    OK, so then it never hits the EOF because it never moves beyond the line that doesn't obey the format. It just stays on that line, never returns EOF, and loops forever.

    Now if I want to handle that error, a new problem crops up. Here, I save the return value of fscanf, and on the successful runs where the coords.txt file is valid, for some reason fscanf STILL only returns a 1 (rather than a 2), even if it reads both values successfully. So the program works as expected, but I can't tell the user there's a problem with the input file because I can't use "if(scanout!=2)". For example, the new code:

    Code:
    #include <stdio.h>
    
    int main ()
    {
      float xtemp, ytemp;
      FILE *pFile;
      int scanout;
      int n;
    
      // Open the file with the coordinates and check that it opened OK.
      pFile = fopen("coords.txt","r");
      if(pFile == NULL)
      {
        printf("The file coords.txt does not exist.\n");
        printf("Please create this file before running the program again.\n");
    
        return 1;
      }
    
      // Now read formatted pairs until you hit the end of file
      n = 0;
      while(scanout=fscanf(pFile, "%f %f", &xtemp, &ytemp) != EOF)
      {
    
    // Handle a bad file
    //    if(scanout != 2)
    //    {
    //      printf("Invalid coordinate file. Please check format.\n");
    //      printf("No commas, please...\n");
    //      return 1;
    //    }
    
        n++;   
        printf("Line %i: (%f,%f)...%i\n", n, xtemp, ytemp, scanout);
      }
    
      printf("There are %i coordinate pairs.\n", n);
      return 0;
    }
    This executes as follows (for the valid non-comma input file):
    Code:
    Line 1: (4.234000,-23.430000)...1
    Line 2: (0.111000,0.222000)...1
    Line 3: (-3.000000,4.000000)...1
    There are 3 coordinate pairs.
    Now it reads all the values correctly, but fscanf still only returns a 1. If I uncomment the error-handling code, it will print that error message every time, even for the same coords.txt file. Why would this happen?

    Thanks again!

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Because != binds more tightly than =, you are not assigning fscanf(whatever) to scanout, but the value of fscanf(whatever) != EOF. Since that comparison is true, scanout is assigned true (1). You may want to surround the assignment with parentheses.

  6. #6
    Registered User
    Join Date
    Sep 2008
    Posts
    3
    Ugh. Unbelievable. I should've seen that one.

    Alright, everything makes sense now. Thanks everyone!
