Thread: scanf(), whitespace, and EOF

  1. #1
    Registered User
    Join Date
    May 2008
    Posts
    87

    scanf(), whitespace, and EOF

    Hi,

    I'm trying to parse whitespace-delimited doubles out of an input stream. I'm having trouble properly detecting the end of the file. Here is a small program which reproduces this:

    Code:
    #include <stdio.h>
    
    int main(int argc, char *argv[]) {
        double d;
        double sum;
        int n;
    
        sum = 0;
        do {
            n = scanf(" %lf", &d);
            if (n == 1) sum += d;
        } while (n != EOF);
    
        if (!feof(stdin))
            perror(argv[0]);
    
        printf("%lf\n", sum);
        return 0;
    }
    This program has trouble detecting the end of the file when there is a trailing space after the last number. If I enter numbers from the terminal, I have to press ctrl-d twice to send EOF. I have another program that shells out to this one (not shown), and even after the parent process closes its end of the stream, the scanf() above seems to block forever after reading the last number from the input. Interestingly, the program has no trouble with trailing whitespace if the input is redirected or piped in from the shell.

    So, this fails (or requires two ctrl-d's)
    $ ./test
    1 2 3SPACE

    But this does not:
    $ echo "1 2 3 " | ./test

    Nor does this:
    $ echo "1 2 3 " > numbers
    $ ./test < numbers

    I think if I can fix it for the first case, the problem when calling the program from another program and sending the input over a pipe will resolve itself.

    How can I better detect EOF?

    I'm on an Ubuntu 10.10 machine with gcc 4.4.5.

    Thanks,
    Jason

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    There is nothing wrong with your program or the handling of EOF

    > I have to press ctrl-d to send EOF twice
    Well, the first ctrl-d acts like \n (except that no character is put into the input stream). Its effect is that the currently buffered characters (1 2 3 and the trailing space) are sent to the program.

    Having sent all the characters, there are no characters in the driver buffer when the second ctrl-d is pressed, so it sends EOF through to your program.

    Your echo/redirection tests show it is working normally.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    May 2008
    Posts
    87
    I see. As my first post alluded to, ultimately I would like to call this program from another, send it input, and read its output. Tonight after work, I'll write up and post a little test calling program to see if that is where I am going wrong.

  4. #4
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by jason_m View Post
    I see. As my first post alluded to, ultimately I would like to call this program from another, send it input, and read its output. Tonight after work, I'll write up and post a little test calling program to see if that is where I am going wrong.
    If that's your goal and you are not sending more than 256 character command lines why not use the argc/argv construct in main to pass the data right on your command line... no more scanf headaches...

    If you continue with redirected input, try using ...
    Code:
        double d;
        double sum = 0;
        while ( scanf(" %lf", &d) > 0 )
            sum += d;
    Last edited by CommonTater; 09-12-2011 at 10:00 AM.

  5. #5
    Registered User
    Join Date
    May 2008
    Posts
    87
    Well, putting together a stripped down example of a parent process passing data to my program led me to the problem/solution.

    For completeness, here's the other program:
    Code:
    #include <unistd.h>
    #include <stdlib.h>
    #include <stdio.h>
    
    int main() {
        int readParent[2];
        int writeParent[2];
    
        pipe(readParent);
        pipe(writeParent);
    
        pid_t pid = fork();
    
        if (pid == 0) {
            // This is the child
            close(writeParent[1]);  // <-- Without this, the program blocks for more input
            dup2(writeParent[0],0);
            close(readParent[0]);
            dup2(readParent[1],1);
            execl("test", "test", (char *)NULL);
            perror("execl");  // only reached if execl() fails
            exit(EXIT_FAILURE);
        } else {
            // This is the parent
    
            FILE *in = fdopen(readParent[0], "r");
            FILE *out = fdopen(writeParent[1], "w");
    
            double nums[] = {1, 2, 3};
    
            int i;
            for (i = 0; i < 3; i++)
                fprintf(out, "%lf ", nums[i]);
    
            fclose(out);
    
            double d;
            int n = fscanf(in, "%lf", &d);
    
            if (n < 1) {
                printf("Something is broken\n");
                exit(EXIT_FAILURE);
            }
    
            printf("Parent received: %lf\n", d);
        }
    
        exit(EXIT_SUCCESS);
    }
    This works as intended. What I didn't have before were the lines closing the other end of the pipes in the child. In particular, without the statement close(writeParent[1]), the first program will block, waiting for more input after the last item has been sent.

    Here's my understanding - let me know if I got this right. The pipe writeParent[] has a read end (0) and a write end (1). When I call fork(), both the parent and child process have their own handle (file descriptor) to both ends of the pipe. While the parent closed its write end, the child still had its copy open. Thus, EOF was not hit, since a reader only sees EOF once every descriptor for the write end has been closed. Is that about right?

    So the problem was in my calling program, and I misdiagnosed it as being related to how I was having to send EOF twice from the command line.
    Last edited by jason_m; 09-12-2011 at 06:03 PM.

  6. #6
    Registered User
    Join Date
    May 2008
    Posts
    87
    Quote Originally Posted by CommonTater View Post
    If that's your goal and you are not sending more than 256 character command lines why not use the argc/argv construct in main to pass the data right on your command line... no more scanf headaches...

    If you continue with redirected input, try using ...
    Code:
        double d;
        double sum = 0;
        while ( scanf(" %lf", &d) > 0 )
            sum += d;
    Your code is more compact, but there is a reason my loop is laid out how it is. The input may come in multiple formats. The input items are actually cash flows. Optionally, a time may also be specified with a cash flow. Optionally, a spot rate for calculating present values may be associated with a cash flow. So the program may have up to 3 reads for each cash flow, depending on which options are set. The actual loop in the full program looks more like this:
    Code:
    do {
      if (time_flag == 1)
        n = scanf(" %lf", &time);
      else
        ++time;
    
      n = scanf(" %lf", &cashflow);
    
      if (spot_flag == 1)
        n = scanf(" %lf", &rate);
    
      ...
    } while (n == 1);

  7. #7
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by jason_m View Post
    Your code is more compact, but there is a reason my loop is laid out how it is. The input may come in multiple formats. The input items are actually cash flows. Optionally, a time may also be specified with a cash flow. Optionally, a spot rate for calculating present values may be associated with a cash flow. So the program may have up to 3 reads for each cash flow, depending on which options are set. The actual loop in the full program looks more like this:
    Code:
    do {
      if (time_flag == 1)
        n = scanf(" %lf", &time);
      else
        ++time;
    
      n = scanf(" %lf", &cashflow);
    
      if (spot_flag == 1)
        n = scanf(" %lf", &rate);
    
      ...
    } while (n == 1);
    Another possibility is that you could take your inputs and stuff them into a common buffer with a single scanf and use the flag to assign them out to their respective variables once the input processing is finished... It might be a little faster but it will certainly simplify your input function.

  8. #8
    Registered User
    Join Date
    May 2008
    Posts
    87
    Quote Originally Posted by CommonTater View Post
    Another possibility is that you could take your inputs and stuff them into a common buffer with a single scanf and use the flag to assign them out to their respective variables once the input processing is finished... It might be a little faster but it will certainly simplify your input function.
    I'm not sure I picture how this would work. To provide a little more information on the input, the number of cash flows is variable. On average, there will probably be around 100. Everything comes on the same input stream. Command line switches specify if the data should be interpreted as having either of the optional pieces of information associated with each cash flow.

  9. #9
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Ok... I see what you're doing now... so yes, yours is probably the best way. (Just trying to be helpful)

  10. #10
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > close(writeParent[1]); // <-- Without this, the program blocks for more input
    Regardless of anything else you're doing, you've got to make sure your pipe handling is correct, otherwise it will behave as you observe.

    Use a simple echo program as the child to test with
    Code:
    #include <stdio.h>

    int main ( void ) {
      int ch;
      /* copy stdin to stdout until EOF */
      while ( (ch=getchar()) != EOF ) putchar(ch);
      return 0;
    }
    > Here's my understanding - let me know if I got this right...
    Yes, that is correct. Both the parent and child need to clean up the ends of the pipes they're not interested in.
