Thread: Counting words, lines, and characters in a text file

  1. #1
    Registered User
    Join Date
    Apr 2016
    Posts
    2

    Counting words, lines, and characters in a text file

    Hello. I am new to programming and I am kind of stuck on this code I have written. I am trying to count the lines, words, and characters in a file. I am having trouble counting the words and have been trying to figure it out for a couple of hours now. I know it may be easy for some.

    Here is the code.
    Code:
    int main (int argc, char *argv[]){
        FILE *input;
        int character, newword,newline, state;
        int c;
    
        state = OUT;
        character  = newword = newline= 0;
        input = fopen(argv[1], "r");
    
        if ( input == NULL){
            printf("Error! Can not read from file\n");
            exit(-1);
        }
        if (argc !=2){
            printf("Not enough arguments provided\n");
            exit(-1);
        }
    
        if (argc>2){
            printf("Too many arguments were provided\n");
            exit(-1);
        }
    
    
        while ((c = fgetc(input)) != EOF){
    
            if ( c == '\n'){
                state= OUT;
                newline++;
            }
    
            if ( c >='a' && c<= 'z'){
                state = IN;
                character++;
            }
    
            if (c >='A' && c<= 'Z'){
                state = IN;
                character++;
            }
    
            if ( c >= '0' && c<='9'){
                state = IN;
                character++;
            }
            else {;}
    
                if (c == ' ' ||c == '\n'|| c == '\t'){
                    state = OUT;
                    newword++;
                }
    
            }
        printf("The number of lines: %d\n",newline);
        printf("The number of words: %d\n", newword);
        printf("The number of characters: %d\n", character);
        fclose(input);
    }
    The text file contains the following text:

    J.K. Rowling's Harry Potter

    It is supposed to print out:

    The number of lines: 1
    The number of words: 6
    The number of characters: 21

    However, my code prints:

    The number of lines: 1
    The number of words: 4
    The number of characters: 21

    Any suggestions?
    Also, another bug I found: when the text file contains just

    !

    my code counts the ! as a word.

    Any suggestions to fix this as well?
    Everything else is fine; only the number of words is the problem.

    Thank you!

  2. #2
    Salem (and the hat of int overfl)
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    Some points.

    > input = fopen(argv[1], "r");
    You should complete ALL your argc checks before doing this.
    If argc is wrong, then the code is already broken.

    You do a lot of
    state = IN
    state = OUT


    but nowhere do you do
    if ( state == IN )
    if ( state == OUT )

    why are you bothering with assigning state, if you never use the result?


    Your empty else statement looks very suspect.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Mar 2016
    Posts
    203
    Since newline = 0, I'm surprised your printout shows #lines = 1; shouldn't it be 0 for your .txt file?
    In a similar vein, #words should be newword + 1, since two words are separated by one ' ', etc.
    int state = OUT? Since you're using state as a flag, better to have int state = 0
    also, char c smells better (at least to me)

    Code:
    int main ()
    
    
    {
        FILE *input;
        int character, newword, newline, state;
        char c;
    
    
        state = 0;
        character  = newword = newline = 0;
        input = fopen("F:\\test_file.txt", "r");
    
    
        if ( input == NULL){
            printf("Error! Can not read from file\n");
            exit(1);
        }
        
    
    
        while ((c = fgetc(input)) != EOF){
    
    
            if ( c == '\n'){
                state= 0;
                newline++;
            }
    
    
            if ( c >='a' && c<= 'z'){
                state = 1;
                character++;
            }
    
    
            if (c >='A' && c<= 'Z'){
                state = 1;
                character++;
            }
    
    
            if ( c >= '0' && c<='9'){
                state = 1;
                character++;
            }
            else {;}
    
    
                if (c == ' ' ||c == '\n'|| c == '\t'){
                    state = 0;
                    newword++;
                }
    
    
            }
        printf("The number of lines: %d\n",newline+1);
        printf("The number of words: %d\n", newword+1);
        printf("The number of characters: %d\n", character);
        fclose(input);
    }

  4. #4
    Salem (and the hat of int overfl)
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    > also, char c smells better (at least to me)
    Which results in code which is wrong.

    fgetc(), like getchar(), returns 257 different values (for 8-bit chars): the 256 char values, and also EOF.

    The result can't be stored correctly in a char, which is why the return type is int.

  5. #5
    Registered User
    Join Date
    Apr 2016
    Posts
    2
    Quote Originally Posted by Salem
    Some points.

    > input = fopen(argv[1], "r");
    You should complete ALL your argc checks before doing this.
    If argc is wrong, then the code is already broken.

    You do a lot of
    state = IN
    state = OUT


    but nowhere do you do
    if ( state == IN )
    if ( state == OUT )

    why are you bothering with assigning state, if you never use the result?


    Your empty else statement looks very suspect.
    Hey. I used state = IN and OUT because I remember seeing it in a book that I read, and I used that as a template, but adapted it to read from a file.

  6. #6
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    Quote Originally Posted by codexer77
    Hey. I used state = IN and OUT because I remember seeing it in a book that I read, and I used that as a template, but adapted it to read from a file.
    The point is that you set state to IN or OUT at various points, but you never test it in an if statement, so it has no effect. Turning up your compiler warning level may get you a warning about a variable that is "set but not used". On gcc you can get a broad set of warnings with the flags -Wall -Wextra -pedantic (-W is the older spelling of -Wextra).
