Thread: number of words in a string

  1. #1
    Registered User
    Join Date
    Feb 2009
    Posts
    33

    number of words in a string

    Hi everyone, I wrote this code but I'm not sure if it's 100% correct.

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>  /* needed for isdigit() */
    
       int count(char x[]){
    
              int sum=0,j,cw=0;
    
              for (j=cw;j<=strlen(x)-1;j++){
    
              while(x[j]!=' '){
              cw++;
              j++;
                }
    
             if (x[j+1]!=' ' && !isdigit(x[j+1])){
                              sum++;
                      }
            }
    
             if (x[0] == ' ' || isdigit(x[0])){
                             sum--;
                   }
    
      return sum;
    
    }
    
    int main(void){
    
        char x[1000] = " a b c d ef g 90123012031203 hi jjj 0210312093091230123";
    
        printf("%d\n",count(x)); //the result is 8
    
        getchar();
    
       return 0;
    
    }
    It prints the correct results for different inputs, but is there a better/smarter way to do this?

    thanks

  2. #2
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    In most cases, your correct answer would be 10 with your input. Not that 'c' is a word, but unless you're checking with a dictionary, you have to count it as one.

    The indentation is very poor, but ++j, being done separately inside a for loop with j as the counter, is usually an error, right there. Why would you do that?

    So "90210" isn't going to count as a word? "Fahrenheit 451" is only going to be one word? Better re-think that. Adjacent numbers have to be counted as a word if they have a space before them, and either punctuation (including newlines), or a space, after them.

    You've Rube Goldberg'd this. All you need is a single for or while loop, with a few if statements inside.

    That's it.

  3. #3
    Registered User
    Join Date
    Feb 2009
    Posts
    33
    Quote Originally Posted by Adak
    In most cases, your correct answer would be 10 with your input. Not that 'c' is a word, but unless you're checking with a dictionary, you have to count it as one.

    The indentation is very poor, but ++j, being done separately inside a for loop with j as the counter, is usually an error, right there. Why would you do that?

    So "90210" isn't going to count as a word? "Fahrenheit 451" is only going to be one word? Better re-think that. Adjacent numbers have to be counted as a word if they have a space before them, and either punctuation (including newlines), or a space, after them.

    You've Rube Goldberg'd this. All you need is a single for or while loop, with a few if statements inside.

    That's it.
    Thanks for the reply. It was my fault for not saying that we don't count numbers as words.

    Newline characters, commas, and other punctuation will not be included in the string.

    Also, why do you say that j++ is an error? I use it in order to make fewer checks on the string.

    Can you tell me exactly what you mean? Maybe some extra code would help me understand.
    Last edited by jackhasf; 12-26-2009 at 01:03 PM.

  4. #4
    Registered User
    Join Date
    Feb 2009
    Posts
    33
    Also, for different inputs I get:

    " 1234 Hello world 2340923042 how are you " = 5


    " 1234 Hello world asdasdasdasd asdas das da sd asd 2340923042 how are you " = 11

  5. #5
    Registered User
    Join Date
    Dec 2009
    Location
    Rome
    Posts
    7
    Maybe you should consider the strtok function.
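
    A minimal sketch of what a strtok-based count might look like, assuming words are separated only by spaces and that all-digit tokens are not counted (the rule the original poster states). The names count_words_strtok and is_number are hypothetical and not from the thread. Since strtok writes into the buffer it scans, the sketch works on a copy.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if tok is non-empty and all digits. */
static int is_number(const char *tok) {
    if (*tok == '\0') return 0;
    for (; *tok != '\0'; tok++) {
        if (!isdigit((unsigned char)*tok)) return 0;
    }
    return 1;
}

/* Counts space-separated tokens that are not pure numbers.
   Returns -1 if the working copy cannot be allocated. */
int count_words_strtok(const char *text) {
    size_t len = strlen(text);
    char *copy = malloc(len + 1);
    char *tok;
    int count = 0;

    if (copy == NULL) return -1;
    memcpy(copy, text, len + 1);   /* strtok modifies its argument */

    for (tok = strtok(copy, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (!is_number(tok)) count++;
    }

    free(copy);
    return count;
}
```

    With the input " 1234 Hello world 2340923042 how are you " from post #4, this should return 5.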

  6. #6
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote Originally Posted by jackhasf
    Thanks for the reply. It was my fault for not saying that we don't count numbers as words.

    Newline characters, commas, and other punctuation will not be included in the string.

    Also, why do you say that j++ is an error? I use it in order to make fewer checks on the string.

    Can you tell me exactly what you mean? Maybe some extra code would help me understand.
    I can see that you don't count numbers as words. My point is that the book title "Fahrenheit 451" has more than one word in its title.

    Obviously, newlines, etc., are not included in the string, but they mark the end of a word.

    "...word." <<== for example

    j++ is the counter in the for loop. Your logic is nearly always goofed if you are messing with the counter, inside a for loop. And since your word count is wrong...

    Your char counter has two states: at any instant, it's either inside a word or outside of a word.

    When array[i] is a word character (anything other than a delimiter) and your state is currently outside, then you know you need to switch the state to inside, and add a word to the word count.

    When array[i] is a space, a punctuation mark, or a newline, then you need to switch the state from inside to outside again.

    These tests can be done with just a few if statements, right inside your ONE for loop.

    STOP **THINKING**, Dammit! Just listen and **DO** it. The answer is so clear, it will hit you right in the chops!

    This is a simple simon exercise. You don't need strtok() or any other library. IT'S TOO SIMPLE, for gawd's sake!

    Remember: one for loop, and a few if statements. That's ALL!!

    I'm waiting for the day when a guy posts his pet chicken pecking out this program, on YouTube.
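
    The inside/outside idea described in this post can be sketched roughly as follows. This is only one possible reading: the function name count_words_state is hypothetical, and using isspace() and ispunct() as the delimiter test is an assumption. Numbers count as words here, which is the behaviour this post argues for.

```c
#include <ctype.h>
#include <stddef.h>

/* Two-state word counter: at any instant we are either inside a word
   or outside one. Whitespace and punctuation are treated as delimiters
   (an assumption made for this sketch). */
int count_words_state(const char *s) {
    int in_word = 0;   /* 0 = outside a word, 1 = inside a word */
    int count = 0;
    size_t i;

    for (i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (isspace(c) || ispunct(c)) {
            in_word = 0;        /* delimiter: switch to outside */
        } else if (!in_word) {
            in_word = 1;        /* first character of a new word */
            count++;
        }
    }
    return count;
}
```

    For the string in post #1 this should give 10, and "Fahrenheit 451" should give 2. One caveat of treating all punctuation as a delimiter is that a contraction such as "don't" would be counted as two words.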

  7. #7
    Registered User
    Join Date
    Feb 2009
    Posts
    33
    thanks again for the reply

    Quote Originally Posted by Adak
    I can see that you don't count numbers as words. My point is that the book title "Fahrenheit 451" has more than one word in its title.
    True, but I don't want to count numbers. That's why I did what I did: if you give this program the input "Fahrenheit 451", you get 1 as the result, which is the correct result.
    Quote Originally Posted by Adak
    Obviously, newlines, etc., are not included in the string, but they mark the end of a word.
    As far as I know, '\0' marks the end of a string, and I don't really need to know where it is because I use the strlen() function.
    Quote Originally Posted by Adak
    "...word." <<== for example
    As I said before, the string can't and won't have punctuation; it's what the problem states.
    Quote Originally Posted by Adak
    j++ is the counter in the for loop. Your logic is nearly always goofed if you are messing with the counter, inside a for loop. And since your word count is wrong...
    Why is it wrong? I get the correct results. I can't really understand what's wrong in my code, since I get the correct answer for every kind of input. I had a sheet of paper with me while doing this, and I can say I'm 99% sure that my logic is not goofed.
    Last edited by jackhasf; 12-26-2009 at 02:50 PM.

  8. #8
    Registered User
    Join Date
    Dec 2009
    Location
    Rome
    Posts
    7
    Talking about the errors in your code:
    in
    Code:
              while(x[j]!=' '){
              cw++;
              j++;
                }
    you should check that j does not go past the length of the string

    otherwise you could get a segmentation fault or erroneous output

    the same check is needed when you write

    Code:
    if (x[j+1]!=' ' && !isdigit(x[j+1])){
    ----------
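
    As a rough illustration of those bounds checks, here is a sketch that keeps the nested-loop style of post #1 but guards every index against the string length. The name count_guarded is hypothetical, and the rule "a token that starts with a digit is not a word" is an assumption based on the isdigit() test in the original code.

```c
#include <ctype.h>
#include <string.h>

/* Nested-loop word count where every access to x[j] is guarded by
   j < len, so the loops can never run past the terminating '\0'. */
int count_guarded(const char *x) {
    size_t len = strlen(x);
    size_t j = 0;
    int sum = 0;

    while (j < len) {
        /* skip the spaces before the next token */
        while (j < len && x[j] == ' ') j++;
        if (j >= len) break;

        /* x[j] is the first character of a token;
           tokens starting with a digit are not counted */
        if (!isdigit((unsigned char)x[j])) sum++;

        /* skip the rest of the token */
        while (j < len && x[j] != ' ') j++;
    }
    return sum;
}
```

    For the three inputs quoted in posts #1 and #4 this should return 8, 5 and 11, without needing the final sum-- adjustment.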

    Adak already gave a better solution: use a single loop and, at every step:
    1 - check whether the current character is a letter from 'a' to 'z' or from 'A' to 'Z'; if not, continue the loop
    2 - increment words_count every time you enter a word (a series of letters of the alphabet)
    3 - increment the loop index and check that it is not out of bounds
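
    The three steps above might be sketched like this. It is a minimal sketch, not tested against the full problem statement: the name count_letter_runs is hypothetical, and isalpha() stands in for the 'a' to 'z' / 'A' to 'Z' test. Stopping at the terminating '\0' is what keeps the index in bounds.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts runs of alphabetic characters. Digits and spaces both end a
   run, so pure numbers are never counted as words. */
int count_letter_runs(const char *s) {
    int words_count = 0;
    int in_letters = 0;   /* 1 while inside a run of letters */
    size_t i;

    for (i = 0; s[i] != '\0'; i++) {      /* '\0' test is the bounds check */
        if (isalpha((unsigned char)s[i])) {
            if (!in_letters) words_count++;   /* entering a new word */
            in_letters = 1;
        } else {
            in_letters = 0;                   /* not a letter: keep looping */
        }
    }
    return words_count;
}
```

    For the string in post #1 this should give 8. Note that a mixed token such as "abc123def" would be counted as two words under this rule.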

  9. #9
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    One really, really easy way to think of a word is just as non-whitespace characters surrounded by spaces (or the ends of the string). With that in mind, I think something as simple as this might work for you . . .
    Code:
    int word_count(const char *string) {
        int wc = 0;
        int x = 0;
        for(;;) {
            /* skip whitespace, if any */
            while(string[x] == ' ') x ++;
    
            /* stop if we've reached the end of the string */
            if(!string[x]) break;
    
            /* skip over the word */
            while(string[x] != ' ' && string[x]) x ++;
    
            /* skip whitespace, if any */
            while(string[x] == ' ') x ++;
            
            /* increment word count */
            wc ++;
        }
        return wc;
    }

    Of course, there are other ways (perhaps even simpler ones). But hopefully that will give you some ideas. Note that I haven't tested it, so no guarantees.

    [edit] As others have suggested, it's probably better not to mess with the loop variable inside the loop, as I did (trying to make the code a little similar to yours). Another easy way to do this is to remember the character you saw during the last iteration of the loop (specifically, whether it was a space or not). Then just take note when you change from space to non-space, and from non-space to space. In the former case, you have the beginning of a word, so count it. For example:
    Code:
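    #include <string.h>  /* strlen */
    int word_count_by_transitions(const char *string) {
        size_t x, length = strlen(string);
        int wc = 0;
        char last_char = ' ';  /* treat the start of the string as following a space */
        for(x = 0; x < length; x ++) {
            /* space -> non-space transition: a new word begins here */
            if(last_char == ' ' && string[x] != ' ') {
                wc ++;
            }
            last_char = string[x];
        }
        return wc;
    }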
    [/edit]
    Last edited by dwks; 12-26-2009 at 02:48 PM.

  10. #10
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    If you don't want to learn a simpler, slightly more efficient, and more robust way of doing this, then I don't understand why you posted here.

    You've got a program you're happy with. It's giving you the right answers if you're not counting numbers. You don't want to change anything.

    OK.

    As the saying goes: You can't taste my tea, without first emptying your cup.

  11. #11
    Registered User
    Join Date
    Feb 2009
    Posts
    33
    Quote Originally Posted by Adak
    If you don't want to learn a simpler, slightly more efficient, and more robust way of doing this, then I don't understand why you posted here.

    You've got a program you're happy with. It's giving you the right answers if you're not counting numbers. You don't want to change anything.

    OK.

    As the saying goes: You can't taste my tea, without first emptying your cup.
    Of course I wanted to change things, but you kept saying that my code and thinking were wrong without telling me why.

  12. #12
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote Originally Posted by jackhasf
    Of course I wanted to change things, but you kept saying that my code and thinking were wrong without telling me why.
    I wanted you to *DISCOVER* what I was showing you - not be told, not by analyzing a function I wrote for you. Just take a few pointed suggestions, and with what you already know, you'd run right into it.

    And be a bit amazed, perhaps.

    If I had said your code and algorithm was great, why would you want to change it? What would be my argument to support that change?

    That makes no sense to me.

    I believe you should have taken my word for something, and given it a shot.

    Which is all OK, by the way. You don't know me, and there's no reason you should trust me to give you good advice on your program.

    Be well.

    Last Post: 05-27-2002, 06:49 PM