Thread: what would I use to count number of words

  1. #1
    Registered User
    Join Date
    Apr 2007
    Posts
    45

    what would I use to count number of words

    quick question, I have a string

    Code:
    int main ( )
    {
    char string [ 25 ] = "Hello My  Name  is   Ben";
    
    }
    How would I count the number of words in the string? I tried counting the number of spaces, but if there is more than one space between words it throws the count off.

    Thanks in advance.

  2. #2
    Deathray Engineer MacGyver's Avatar
    Join Date
    Mar 2007
    Posts
    3,210
    How would you do it yourself?

    Count spaces, but ignore all spaces preceded by a space.

  3. #3
    Registered User
    Join Date
    Apr 2007
    Posts
    45
    how in the heck would you ignore spaces after a space?

  4. #4
    Deathray Engineer MacGyver's Avatar
    Join Date
    Mar 2007
    Posts
    3,210
    Here's some pseudo-pseudocode:

    Code:
    lastchar = '\0'
    Loop for length of string
    	if currentchar = ' ' AND lastchar != ' '
    		count++;
    	lastchar = currentchar
    Edit: Oh, and you probably want to add one to count when you're done. But even then, this is making some assumptions about the string it'll be analyzing.
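
    A minimal C sketch of the pseudocode above, for illustration only. The function name count_words_by_spaces is hypothetical, and the sketch keeps the assumptions mentioned in the edit: the string is non-empty and has no leading or trailing spaces. If those do not hold, the count comes out too high.

```c
#include <stddef.h>

/* Hypothetical helper: counts spaces that are not preceded by another
   space, then adds one. Only correct for non-empty strings without
   leading or trailing spaces (an assumption, not a general solution). */
static int count_words_by_spaces(const char *s)
{
    int count = 0;
    char lastchar = '\0';
    size_t i;

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == ' ' && lastchar != ' ')
            count++;            /* first space of a run of spaces */
        lastchar = s[i];
    }
    return count + 1;           /* n separators means n + 1 words */
}
```

    For the string in post #1 this returns 5. For a string holding only two spaces it returns 2, which is where the assumptions bite.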
    Last edited by MacGyver; 04-27-2007 at 12:48 PM.

  5. #5
    Registered User
    Join Date
    Apr 2007
    Posts
    45
    Isn't there a way to break up a string into tokens?

  6. #6
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Yes, with strtok().
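
    A rough sketch of how strtok() could be used to count words. The helper name count_words_strtok and the delimiter set (space, tab, newline) are assumptions. Note that strtok() writes into the buffer it scans, so the argument must be a modifiable array, not a string literal.

```c
#include <string.h>

/* Hypothetical helper: counts tokens separated by spaces, tabs or
   newlines. strtok() overwrites delimiters with '\0', so buf is
   modified by this call. */
static int count_words_strtok(char *buf)
{
    int count = 0;
    char *tok;

    for (tok = strtok(buf, " \t\n"); tok != NULL; tok = strtok(NULL, " \t\n"))
        count++;                /* runs of delimiters are skipped as one */
    return count;
}
```

    Called on a char array holding the string from post #1, this returns 5, because strtok() treats consecutive delimiters as a single separator.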
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  7. #7
    Registered User
    Join Date
    Apr 2007
    Posts
    45
    and you use strtok how?

  8. #8
    Registered User
    Join Date
    Apr 2007
    Posts
    141
    Think of the problem in terms of state machines. For simplicity I'll describe a Moore type state machine that would do the trick. (A Mealy type would be more efficient and I recommend you convert this to Mealy.) In each state there will be an action that is performed (e.g. increment the word count) and a set of conditions that dictate transitions to new states.

    Here are some proposed states:

    StartWord: (Occurs when we first detect a word)
    Action: increment word counter
    whitespace: go to White state
    non-whitespace: go to Word state

    White: (Occurs when we are processing whitespace characters, e.g. space, tab, newline)
    Action: none
    whitespace: go to White state
    non-whitespace: go to StartWord

    Word: You can figure this out yourself.


    C implementation. The standard and inefficient approach is to implement this using the switch statement, with the state values defined via #define statements.

    switch(state) {
    case StartWord:
        wordcount++ ;
        if ((c == ' ') || (c == '\t') || (c == '\n')) {
            state = White ;
        } else {
            state = Word ;
        }
        break ;
    case White:
        ....
        break ;
    case Word:
        ...
        break ;
    }
    The superior approach is to use the dreaded goto statement to handle the state transitions. c would contain the current character and the whole mess would be wrapped in a while loop, reading one character at a time from the input stream (e.g. using getchar() or scanf). Although goto is strongly discouraged in structured programming, it makes perfect sense for a state machine and really is not any harder to read than a switch statement for this purpose. It's when gotos jump out of loops that things get messy.
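
    A rough sketch of what a goto-driven version might look like. It scans a string rather than an input stream to stay self-contained. The function name count_words_goto is hypothetical, and the transitions out of the Word state (left as an exercise above) are an assumption. StartWord is folded into the fall-through between the white and word labels.

```c
#include <ctype.h>

/* Hypothetical helper: each label is one state of the machine and
   each goto is a state transition. */
static int count_words_goto(const char *s)
{
    int count = 0;

white:                                  /* White: skipping whitespace */
    if (*s == '\0') goto done;
    if (isspace((unsigned char)*s)) { s++; goto white; }

    count++;                            /* StartWord: first char of a word */
    s++;

word:                                   /* Word: inside a word */
    if (*s == '\0') goto done;
    if (isspace((unsigned char)*s)) { s++; goto white; }
    s++;
    goto word;

done:
    return count;
}
```

    Whether this reads better than a switch is a matter of taste. The sketch only shows that the transitions map one-to-one onto labels.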

  9. #9
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    hmm... have you not tried to search?

  10. #10
    Registered User Noir's Avatar
    Join Date
    Mar 2007
    Posts
    218
    Quote Originally Posted by SevenThunders
    Although goto is strongly discouraged in structured programming, it makes perfect sense for a state machine and really is not any harder to read than a switch statement for this purpose.
    I don't believe that. Can you prove it?

  11. #11
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by SevenThunders View Post
    The superior approach is to use the dreaded goto statement to handle the state transitions. c would contain the current character and the whole mess would be wrapped in a while loop, whilst reading one character at a time from the input stream (e.g. using getchar() or scanf etc.). Although goto is strongly discouraged in structured programming, it makes perfect sense for a state machine and really is not any harder to read than a switch statement for this purpose. It's when go tos jump out of loops that things get messy.
    You don't need goto...

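    Code:
    #include <ctype.h>
    
    #define IN_WORD 0
    #define IN_SPACE 1
    
    /* Count whitespace-separated words in a NUL-terminated string. */
    int count_words(const char *str)
    {
        int nwords = 0;
        int state;
    
        /* Empty string: no words. */
        if(!*str) return 0;
        /* The first character decides the initial state. The cast matters:
           passing a negative char value to isspace() is undefined behavior. */
        if(isspace((unsigned char)*str)) state = IN_SPACE;
        else
        {
            state = IN_WORD;
            nwords = 1;
        }
        while(*++str)
        {
            switch(state)
            {
            case IN_SPACE:
                /* Whitespace followed by non-whitespace starts a new word. */
                if(!isspace((unsigned char)*str))
                {
                    state = IN_WORD;
                    nwords++;
                }
                break;
            case IN_WORD:
                if(isspace((unsigned char)*str)) state = IN_SPACE;
                break;
            }
        }
        return nwords;
    }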

  12. #12
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by MacGyver View Post
    How would you do it yourself?

    Count spaces, but ignore all spaces preceded by a space.
    Wouldn't work. Take the simple rule of counting the spaces and then adding one. There is one space in "Two Words," 1+1 = 2, so there are 2 words. It's simple to add the rule "except when a space is preceded by another space." But if the input string is two spaces and nothing else, there is one space which is not preceded by a space, so there is 1 counted space. Add 1 to this according to the aforementioned rule, and you get the result that a string consisting only of 2 or more spaces contains 2 words.

  13. #13
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Here's a recursive example. I don't actually recommend using it (it's a silly waste of stack space for this kind of problem), but it shows how recursion can make things much easier at least in principle:

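    Code:
    #include <ctype.h>
    
    /* Declared up front so count_words can call it before its definition */
    int count_words_r(const char *str, char last);
    
    int count_words(const char *str)
    {
        /* Pretend a space precedes the string so a leading word is counted */
        return count_words_r(str, ' ');
    }
    
    /* Recursive worker function */
    int count_words_r(const char *str, char last)
    {
       if(!*str) return 0;
       /* A word starts wherever a non-space character follows a space */
       return count_words_r(str + 1, *str)
            + ((!isspace((unsigned char)*str) && isspace((unsigned char)last)) ? 1 : 0);
    }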
    Last edited by brewbuck; 04-27-2007 at 01:52 PM.

  14. #14
    Registered User
    Join Date
    Apr 2007
    Posts
    45
    My program should, at the moment, read from a file called input, create two new files called east and west, and then print Roster number x for each player. Right now I am working on how to break up each word in the input file into tokens and print them in this order:

    Roster# Lastname, Firstname, Hometeam, Scoring average

    my input file is

    Code:
    Joseph Stevens Boys 11 20 5
    Jamie Stevens Girls 22 18 6
    Tony Waters Boys 33 22 7
    Sally Smith Girls 44 33 8
    My program is:

    Code:
    int main ( void )
    {
    char str [ 80 ];
    char first;
    char last;
    char *token;
    int points;
    int games;
    int number;
    int x;
    int roster;
    int y;
    int a = 0;
    FILE * input, * east, * west;
    char array [ 81 ];
    char string [ 80 ] = "Hello My Name is Ben";
    
    if ( ( input = fopen ( "input", "r" ) ) == NULL )
    {
            printf ( "Cannot open file\n" );
            exit ( 1 );
    }
    
    if ( ( east = fopen ( "east", "w" ) ) == NULL )
    {
            printf ( "Cannot open file\n" );
            exit ( 1 );
    }
    
    if ( ( west = fopen ( "west", "w" ) ) == NULL )
    {
            printf ( "Cannot open file\n" );
            exit ( 1 );
    }
    
    x = -1;
    while ( ! feof ( input ) )
            {
            fgets (str, 80, input );
            x ++;
            }
    y = x + 1;
    for (roster = 1; roster < y; roster ++)
            {
            fprintf (east, "Roster# is %d\n", roster);
            fprintf (west, "Roster# is %d\n", roster);
            }
    fgets (str, 80, input );
    
    stringtoken ( str );
    
    }
    
    
    int stringtoken ( char string )
    
    {
    char *token;
    
       printf( "Tokens:\n" );
       /* Establish string and get the first token: */
       token = strtok( string, " ,\n\t" );
       while( token != NULL )
       {
          /* While there are tokens in "string" */
          printf( " %s\n", token );
          /* Get next token: */
          token = strtok( NULL, " ,\n\t" );
       }
    }
    And my error when I try to compile is

    mylab8.c:60: warning: type mismatch with previous implicit declaration
    mylab8.c:53: warning: previous implicit declaration of `stringtoken'
    mylab8.c: In function `stringtoken':
    mylab8.c:65: warning: assignment makes pointer from integer without a cast
    mylab8.c:71: warning: assignment makes pointer from integer without a cast
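
    The warnings above are consistent with three things: stringtoken is called before any declaration of it is visible, its parameter is declared as char rather than char *, and strtok is used without including <string.h>, so the compiler assumes it returns int. A hedged sketch of the function with those points addressed follows. Returning the token count is an assumption added for illustration; the original declares int but returns nothing.

```c
#include <stdio.h>
#include <string.h>   /* declares strtok, so it is known to return char * */

/* Prototype placed before main (or any caller), so the call is not
   implicitly declared. */
int stringtoken(char *string);

/* Prints each token on its own line. Returns the number of tokens
   (an addition for illustration). strtok modifies the buffer passed in. */
int stringtoken(char *string)
{
    int ntokens = 0;
    char *token;

    printf("Tokens:\n");
    token = strtok(string, " ,\n\t");
    while (token != NULL) {
        printf(" %s\n", token);
        ntokens++;
        token = strtok(NULL, " ,\n\t");
    }
    return ntokens;
}
```

    The call site stringtoken ( str ) can stay as it is, since str is a char array and decays to char *.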

  15. #15
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by s_jsstevens View Post
    My program should at the moment read from a file called input, create two new files called east and west. And then print Roster number x for each player. Right now I am working on how to break up each word in the input file into tokens. and print them off in this order.
    Why are you posting this here?

    Last Post: 05-29-2004, 02:38 AM