Thread: EOF help

  1. #1
    Registered User
    Join Date
    Oct 2012
    Posts
    99

    EOF help

    Why can't I hit ctrl-Z just once to exit from my while loop?

    The first time I hit ctrl-Z, my debugger says it has a value of ASCII 32, or a blank space.

    If I hit ctrl-Z twice in a row, then it will read it as EOF.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>
    
    int main()
    {
        char ch;
        char prev_ch='a';
        char first_ch;
        float word_counter=0;
        float letter_counter=0;
        float wpl_counter;
    
        printf("Enter words and then type ctrl Z when done\n");
    
        do
        {
        ch=getchar();
        first_ch=ch;  //checks to see if first ch is a blank space for first and last blank space = a word overcount
    
            if(isspace(ch))
            {
                if(isspace(prev_ch)&&isspace(ch))
                {
                    continue;  // doesn't count two spaces in a row as a new word
                }
                word_counter++;  //counts total # of words
                prev_ch=ch;  // stores previous ch to check for two blank spaces in a row
                continue;         //skips counting whitespace as number of letters calculated below
            }
            else if(ispunct(ch))
            {
                continue;  //skips counting punctuation as number of letters calculated below
            }
            else  //ch must be a letter
            {
            letter_counter++;  //adds up total # of letters
            }
    
        }while(ch!=EOF);
    
        if(isspace(first_ch)&&isspace(ch))
        {
            word_counter--;  //corrects for extra word count if the first and last ch are both blank spaces
        }
        else if(!(isspace(first_ch)&&isspace(ch)))
        {
            word_counter++;  //corrects for lack of counting a word if no space at start or end of input
        }
    
    letter_counter--;  //corrects for cntrl-z entered
    wpl_counter=letter_counter/word_counter;
    
    printf("Total number of words is %.0f\n", word_counter);
    printf("Total number of letters is %.0f\n", letter_counter);
    printf("Total number of words per letter is %.2f\n", wpl_counter);
    
    return EXIT_SUCCESS;
    }
    Thanks in advance.

  2. #2
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    It's a feature of your operating system and device drivers, and how they interact with functions in the standard library supplied with your compiler. No way to change it, short of rewriting relevant code in your operating system, rewriting your standard library and parts of the compiler, or using other techniques that go outside constraints of the C standard.
    Right 98% of the time, and don't care about the other 3%.

    If I seem grumpy or unhelpful in reply to you, or tell you you need to demonstrate more effort before you can expect help, it is likely you deserve it. Suck it up, Buttercup, and read this, this, and this before posting again.

  3. #3
    Ultraviolence Connoisseur
    Join Date
    Mar 2004
    Posts
    555
    Other things to consider: EOF is actually bigger than 'char' and needs to be in an unsigned int (which is the type you should use with getchar()).

  4. #4
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote (originally posted by nonpuz):
    Other things to consider: EOF is actually bigger than 'char' and needs to be in an unsigned int (which is the type you should use with getchar()).
    getchar() returns int, not unsigned int. EOF is outside the range of values that can be represented in a char, but able to be represented by an int.
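
    A minimal sketch of the idiom this implies: keep the return value of getchar() or getc() in an int, and compare it with EOF before treating it as a character. The function name count_case and its FILE * parameter are hypothetical, chosen only so the loop can be tried on any stream rather than just stdin.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: counts lowercase and uppercase letters on a stream.
   The key point is that ch is an int, so EOF stays distinct from every
   valid character value. */
void count_case(FILE *in, int *lower, int *upper)
{
    int ch;  /* int, not char: getc() returns an unsigned char value or EOF */

    assert(in != NULL && lower != NULL && upper != NULL);
    *lower = 0;
    *upper = 0;

    while ((ch = getc(in)) != EOF) {
        if (islower(ch))
            (*lower)++;
        else if (isupper(ch))
            (*upper)++;
    }
}
```

    Calling count_case(stdin, ...) with two int counters from main() would stand in for a do/while loop around getchar(). It does not change when the console actually delivers end-of-file.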

  5. #5
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    If I hit Ctrl-Z, the job gets stopped. Perhaps you should have asked this in the Windows programming board?

    I know that in Linux the equivalent, Ctrl-D, must be repeated unless it is typed at the start of a line. The first Ctrl-D makes the terminal flush the current line without a trailing newline (the Ctrl-D itself is discarded), and only the second one ends the input.


    Your code logic seems overly complicated to me. Why not use a very simple finite-state machine instead? Pseudocode:
    Code:
    words = 0
    chars = 0
    curr = next character from input
    
    loop:
    
        # First state: Between words. Consume white-space.
        while curr is white-space:
            chars = chars + 1
            curr = next character from input
        end while
    
        # No more input?
        if curr indicates end-of-file or error,
            break out of the loop.
    
        # Transition to second state: we have a new word.
        words = words + 1
    
        # Second state: Consume word.
        while curr is not white-space and not end-of-file:
            chars = chars + 1
            curr = next character from input
        end while
    
    end loop
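
    For comparison, the pseudocode above might be written in C roughly as follows. This is only a sketch: the function name count_words and its parameters are hypothetical, and the second inner loop also checks for EOF so it cannot spin forever when the input ends in the middle of a word.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Sketch of the two-state loop: words are runs of non-white-space,
   chars counts every character read (white-space included). */
void count_words(FILE *in, long *words, long *chars)
{
    int curr;

    assert(in != NULL && words != NULL && chars != NULL);
    *words = 0;
    *chars = 0;
    curr = getc(in);

    for (;;) {
        /* First state: between words. Consume white-space. */
        while (curr != EOF && isspace(curr)) {
            (*chars)++;
            curr = getc(in);
        }

        /* No more input (end-of-file or read error)? */
        if (curr == EOF)
            break;

        /* Transition to second state: we have a new word. */
        (*words)++;

        /* Second state: consume the word. */
        while (curr != EOF && !isspace(curr)) {
            (*chars)++;
            curr = getc(in);
        }
    }
}
```
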
    Note that if the input is line-buffered, the loop ends up being one full line behind the actual input.

    When the code has consumed the line, it encounters the newline (white-space). Because it must see the first non-white-space character (or end of input) to get out of the inner while loop, it will wait until either the next line is input, or an end of input.

    This also happens if you use, say, fscanf() with a trailing space in the format, which consumes all trailing white-space. Although the trailing white-space is not required, the function must internally see what the following character is before it can return; after all, it could be another white-space character. If the input is line-buffered, this means the next full line (or an end of input) must be entered before the fscanf() call can finish parsing the previous line.
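
    A small sketch of that fscanf() behaviour, using a hypothetical helper named next_after_number. It reads one integer with a trailing space in the format and then returns whatever character is left on the stream, which shows that every newline and space after the number has already been consumed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one integer with a trailing space in the
   format, then returns the next character left on the stream (or EOF). */
int next_after_number(FILE *in, int *value)
{
    assert(in != NULL && value != NULL);

    /* The trailing space matches any amount of white-space, including
       newlines, so fscanf() keeps reading until it sees something else. */
    if (fscanf(in, "%d ", value) != 1)
        return EOF;

    return getc(in);
}
```

    On an interactive, line-buffered stdin this is why the call appears to hang after the first line: it cannot return until a non-white-space character or end of input arrives.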
    Last edited by Nominal Animal; 12-08-2012 at 06:57 PM. Reason: Proper initialization for the 'chars' variable.

  6. #6
    Registered User
    Join Date
    Oct 2012
    Posts
    99
    I appreciate the help guys, but one strange thing is that this similar loop (that I also wrote for another problem) works okay with the EOF as far as I can tell.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>
    
    int main(void)
    {
        char ch;
        int upr_ct=0;
        int lwr_ct=0;
    
        printf("Enter characters/numbers.\n  To quit, Control-Z!\n");
        do
        {
            ch=getchar();
            if(islower(ch))
            {
                lwr_ct++;
            }
            else if(isupper(ch))
            {
                upr_ct++;
            }
        }while(ch!=EOF);
    
        printf("There are %d lower case and %d upper case letters", lwr_ct, upr_ct);
    
        return EXIT_SUCCESS;
    }
    What's the difference with my new code?

  7. #7
    Registered User
    Join Date
    Oct 2012
    Posts
    99
    Uh, actually it doesn't. I have to hit enter and THEN control-z.

  8. #8
    Registered User
    Join Date
    Oct 2012
    Posts
    99
    Both of my programs work okay as long as I hit enter first, and then hit ctrl-Z.

  9. #9
    Registered User
    Join Date
    Oct 2012
    Posts
    99
    My final code for your programming delight:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>
    
    int main()
    {
        char ch;
        char prev_ch='a';
        char first_ch;
        int counter=0;
        float word_counter=0;
        float letter_counter=0;
        float wpl_counter;
    
        printf("Enter words, hit enter\n and then type ctrl Z\n");
    
        ch=getchar();
        first_ch=ch;  //checks to see if first ch is a blank space for first and last blank space = a word overcount
    
        while(ch!=EOF)
        {
            if(counter!=0)
            {
                 ch=getchar();
            }
    
            counter++;
    
            if(isspace(ch))
            {
                if(isspace(prev_ch)&&isspace(ch))
                {
                    continue;  // doesn't count two spaces in a row as a new word
                }
                else if(ch=='\n')
                {
                    continue;
                }
                word_counter++;  //counts total # of words
                continue;         //skips counting whitespace as number of letters calculated below
            }
            else if(ispunct(ch))
            {
                continue;  //skips counting punctuation as number of letters calculated below
            }
            else  //ch must be a letter
            {
            letter_counter++;  //adds up total # of letters
            }
            prev_ch=ch;  // stores previous ch to check for two blank spaces in a row
        }
    
        if(isspace(first_ch)&&isspace(ch))
        {
            word_counter--;  //corrects for extra word count if the first and last ch are both blank spaces
        }
        else if(!(isspace(first_ch)&&isspace(ch)))
        {
            word_counter++;  //corrects for lack of counting a word if no space at start or end of input
        }
    
    letter_counter--;  //corrects for "\n" (return) entered
    wpl_counter=letter_counter/word_counter;
    
    printf("Total number of words is %.0f\n", word_counter);
    printf("Total number of letters is %.0f\n", letter_counter);
    printf("Total number of words per letter is %.2f\n", wpl_counter);
    
    return EXIT_SUCCESS;
    }

  10. #10
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    I did some debugging on the code in #6.

    I was concerned that char ch was doing something to affect the results. EOF is not actually a char, getchar does not return char, but int, so it was possible that the comparison was just munged. I ignored it the first time and entered "ab^Z". The code correctly counted the 2 lowercase characters. The inline ^Z was being munged to character 0x1a (substitute). Whereas when I entered the ^Z again after the enter key I got EOF proper.

    The results from the code appeared to be the same even after I correctly put "int ch" in.

    This could have something to do with how the console in windows was programmed. But in any case, getchar is line buffered so it doesn't really surprise me that a special key code like ^Z in the middle of a line wouldn't work the way you expected, just like how placing an integer -1 in the file, intending to truncate it, wouldn't work like you expected. Unbuffered input is the only way to properly read certain keys and it is the only way to react to every single key pressed when it is pressed.
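
    If staying with buffered input, one possible workaround is to treat an inline 0x1A as the end of input as well. The sketch below assumes the console hands the program character 0x1A for a mid-line ^Z, as observed above; the names CTRL_Z and getc_or_ctrl_z are hypothetical, and real data containing that byte would be cut short.

```c
#include <assert.h>
#include <stdio.h>

#define CTRL_Z 0x1A  /* ASCII SUB, what the inline ^Z showed up as */

/* Hypothetical helper: like getc(), but also reports EOF when it reads
   an inline Ctrl-Z character. This is a workaround sketch, not standard
   behaviour. */
int getc_or_ctrl_z(FILE *in)
{
    int ch;

    assert(in != NULL);
    ch = getc(in);
    return (ch == CTRL_Z) ? EOF : ch;
}
```

    A loop such as while ((ch = getc_or_ctrl_z(stdin)) != EOF), with ch declared as int, would then stop at either kind of ^Z, although the line still has to be submitted with enter before the program sees it.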
    Last edited by whiteflags; 12-08-2012 at 08:00 PM.

  11. #11
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote (originally posted by whiteflags):
    I ignored it the first time and entered "ab^Z". The code correctly counted the 2 lowercase characters. The inline ^Z was being munged to character 0x1a (substitute). Whereas when I entered the ^Z again after the enter key I got EOF proper.
    This actually makes sense. Ctrl plus a letter is used to produce ASCII control characters, where the code is 64 less than the code of the equivalent uppercase letter. 'Z' - 64 == 26 == 0x1A.

    Perhaps the original engineers thought that since you're supplying input interactively, each input line should end with a newline. (It is a common assumption.) Therefore, end-of-file can only occur (interactively) at the start of a line. Elsewhere, Ctrl and a letter simply produces the corresponding ASCII control character.

    It should be easy to verify. If you are not at the start of a line, then Ctrl-A should produce ASCII code 1, Ctrl-J (or Ctrl-M then Ctrl-J) a newline, and Ctrl-_ (if you can produce that on your keyboard) ASCII code 0x1F.
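
    The arithmetic can be sketched in C as follows. The helper name ctrl_code is hypothetical, and it assumes an ASCII character set and a key in the range '@' to '_'.

```c
#include <assert.h>

/* Hypothetical helper: the ASCII control code that Ctrl plus the given
   key would produce, assuming an ASCII character set and a key in the
   range '@' (64) to '_' (95). */
int ctrl_code(int key)
{
    assert(key >= '@' && key <= '_');
    return key - 64;
}
```

    Under those assumptions ctrl_code('Z') is 0x1A, ctrl_code('A') is 1, ctrl_code('J') is the newline character and ctrl_code('_') is 0x1F, matching the values listed above.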
