Thread: how many words are there in sentence

  1. #1
    Registered User
    Join Date
    Nov 2014
    Posts
    45

    how many words are there in sentence

    Hi everyone!

    I did a program which says how many words to us.
    But there is problem with my algorithm Because when I entered space character, it was saying wrong answer. Space character isnt a word

    Code:
    #include <stdio.h>#include <stdlib.h>
    
    
    int main(void) {
    char *c1;
    
    
    c1=(char *)malloc(39*sizeof(char));
    gets(c1);
    if(c1==NULL)
    exit(1);
    
    
    printf("%d",cml(c1));
    
    
    free(c1);
    
    
    return 0;
    }
    
    
    int cml(char *p)
    {
    	int i=0;
    
    
    	do
    	{
    		if(p[i]==' ')
    		i++;
    		p++;
    	}
    	while(p[i]!='\0');
    	return i+1;
    }

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    General notes:

    • Your formatting and indentation is crap. That makes it hard to read/follow what your code is doing. If it's hard to read, it's easy to make mistakes, and hard to find or fix them.
    • Also, what kind of name is cml? Is it short for "camel"? Why not give the function a name that actually describes what it's doing, like count_words? Same goes for variables.
    • Good job on checking the return value of malloc and on freeing the memory you allocate, but why bother allocating in the first place? Just use an array.
    • Don't use magic numbers, instead use a constant with a descriptive name. What is the significance 39? Did you remember to save a space for the null character?
    • Saving room for the null is pointless if you use gets, since gets is horrible. Read this FAQ > Why gets() is bad / Buffer Overflows - Cprogramming.com. Then, change your code to use fgets.
    • Why are you incrementing i and p? Do you understand what happens when you increment p? Do you see the compounding effect of incrementing both?
    • Ugh, nevermind the above. Now I see what your code is doing. i is a horrible name, it's usually used as a loop counter or array index. If you called it num_words, then I would know what you were trying to do.
    • Checking p[i] for '\0' is wrong. i is not the index of the character you are currently examining.
    • Returning i+1 seems wrong, but it's hard to say with all the other broken code.


    Of course your code only counts space characters, since that what you told it to do. That if statement basically says "if the current character is a space character, increment i". So it counts spaces. Instead, you need to figure out what a word really is, and understand what that means in terms of your code. Unless you have specific assignment requirements, I propose you use a simple definition of word: any contiguous sequence of non-space characters. To properly track this, you will need a flag variable that you can use to track when you are in a word, or not in a word. A good name for this variable would be in_word. You might consider using the isspace function (and possibly isalpha or isalphanum). Google for more info, and remember to #include <ctype.h>. See if that gets you going on the right track.

  3. #3
    Registered User
    Join Date
    Nov 2014
    Posts
    45
    @anduril462

    I'm sure , you're professional in programming language. Especially you're good at C language.
    I noticed your saying. Thanks your advices. I'm very happy. Because your advices are very very important. You are good guy!!

    I guess you talk about something like that this link
    Google C++ Style Guide

    Best wishes!!

  4. #4
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Yes, that is what I am referring to when I talk about formatting, indentation and readability. Your code, properly formatted, should look more like this:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    
    int main(void)
    {
        char *c1;
    
    
        c1= malloc(39 * sizeof(char));
        gets(c1);
        if (c1 == NULL)
            exit(1);
    
    
        printf("%d",cml(c1));
    
    
        free(c1);
    
    
        return 0;
    }
    
    
    int cml(char *p)
    {
        int i = 0;
    
    
        do {
            if(p[i] == ' ')
                i++;
            p++;
        } while (p[i] != '\0');
    
    
        return i+1;
    }
    Notice how much easier it is to see which lines belong to which if statments, loops, etc. Also, I added some spaces around the operators so it's easier to read.

    Lastly, I missed this in my first post, but don't cast malloc. Read this: FAQ > Casting malloc - Cprogramming.com. I removed the (char *) from your malloc call for you. Also, the sizeof(char) is redundant, the standard guarantees it will always be 1.

    You need to solve this yourself first -- work it out with paper and pencil using several example sentences (some starting or ending with spaces, some with just one word, some with no words, etc). After all, if you can't solve it yourself, you will never be able to program a computer to solve it.

    Try and work out an algorithm for count_words using the in_word flag variable, and if you get stuck, post your attempt here with specific questions on where you're stuck, and we'll help.

  5. #5
    Unregistered User Yarin's Avatar
    Join Date
    Jul 2007
    Posts
    2,158
    Quote Originally Posted by BlackDeath View Post
    I'm sure , you're professional in programming language. Especially you're good at C language.
    I noticed your saying. Thanks your advices. I'm very happy. Because your advices are very very important. You are good guy!!
    That was a fun read! If I may ask, what's your native language?


    Quote Originally Posted by anduril462 View Post
    Also, the sizeof(char) is redundant, the standard guarantees it will always be 1.
    Even so, I wouldn't discourage him from using it. It makes his intent of allocating an array of char's clear to the reader, which is very good practice. (Besides, the compiler optimizes it out, so it's zero-cost)

  6. #6
    Ultraviolence Connoisseur
    Join Date
    Mar 2004
    Posts
    555
    Quote Originally Posted by Yarin View Post
    Even so, I wouldn't discourage him from using it. It makes his intent of allocating an array of char's clear to the reader, which is very good practice. (Besides, the compiler optimizes it out, so it's zero-cost)
    I'd have to disagree with you there. Mainly because the standard idiom is not sizeof(type) but sizeof *ptr in which case it's not obvious that is a char pointer unless you omit that value.

    Consider:
    Code:
    struct object * p = malloc(39 * sizeof *p); /* Allocate space for 39 struct objects */
    char * s = malloc(39 * sizeof *s); /* indistinguishable without looking at type */
    char * m = malloc(39);  /* Can only be one thing, allocating space for a char * (or someone made a big mistake) */

  7. #7
    Ultraviolence Connoisseur
    Join Date
    Mar 2004
    Posts
    555
    Quote Originally Posted by Yarin View Post
    Even so, I wouldn't discourage him from using it. It makes his intent of allocating an array of char's clear to the reader, which is very good practice. (Besides, the compiler optimizes it out, so it's zero-cost)
    I'd have to disagree with you there. Mainly because the standard idiom is not sizeof(type) but sizeof *ptr in which case it's not obvious that is a char pointer unless you omit that value.

    Consider:
    Code:
    struct object * p = malloc(39 * sizeof *p); /* Allocate space for 39 struct objects */
    char * s = malloc(39 * sizeof *s); /* indistinguishable without looking at type */
    char * m = malloc(39);  /* Can only be one thing, allocating space for a char * (or someone made a big mistake) */

  8. #8
    Unregistered User Yarin's Avatar
    Join Date
    Jul 2007
    Posts
    2,158
    Quote Originally Posted by nonpuz View Post
    Mainly because the standard idiom is not sizeof(type) but sizeof *ptr in which case it's not obvious that is a char pointer unless you omit that value.
    Yes, that idiom is usually better. Though both are commonly used.


    Quote Originally Posted by nonpuz View Post
    /* Can only be one thing, allocating space for a char * (or someone made a big mistake) */
    Incorrect. If I encountered that in my code, I would, without context, assume it's an allocation for raw bytes (of type uint8_t or unsigned char).

  9. #9
    Registered User Alpo's Avatar
    Join Date
    Apr 2014
    Posts
    877
    Code:
      if(p[i]==' ')
      {        i++;
            p++;
      }
    It would probably be simpler to keep the index variable, and the counting variable separate. In truth, since you are using pointer arithmetic, you don't actually need to index into the array (I mean the pointer).


    Code:
        while( *p++ != '\0' )
        {
            if( *p == ' ' )
            {
                i++;
    
                // Guard against consecutive ' ' characters.
                while( *p++ == ' ' );
            }
        }
    Edit: Also "malloc( number * sizeof( *variable ) )" is the clearest to me. Although multiplying by 1 might be pointless I agree with Yarin about the intention being clearer, as you can tell what's going on just from the expression inside the malloc() call.
    Last edited by Alpo; 01-05-2015 at 11:20 PM.
    WndProc = (2[b] || !(2[b])) ? SufferNobly : TakeArms;

  10. #10
    Ultraviolence Connoisseur
    Join Date
    Mar 2004
    Posts
    555
    You do realize your algorithm fails if there are no spaces in the string? IE you should always add 1 to your result, unless the string is entirely empty.

  11. #11
    Registered User Alpo's Avatar
    Join Date
    Apr 2014
    Posts
    877
    Quote Originally Posted by nonpuz View Post
    You do realize your algorithm fails if there are no spaces in the string? IE you should always add 1 to your result, unless the string is entirely empty.
    Sure, also if a space comes first in a string it will give a false result as well. Ideally the algorithm should not just use spaces I think (you're either in a word or out, which should probably be tested with something like isalpha() or isalnum()). I was actually just trying to show you didn't need indexing, but I've written the following now (you don't need to add one):

    Code:
    int givecount( char* pStr )
    {
        int Count = 0;
    
        do
        {
            // Add to count, skip until not in word
            if( isalnum( *pStr ) )
            {
                ++Count;
    
                while( isalnum( *pStr ) && *pStr++ );
            }
    
        // Go until nul, or above condition is triggered
        } while( *pStr++ );
    
        return Count;
    
    }
    It's not masterfully crafted or anything, but it does seem to meet the need.
    WndProc = (2[b] || !(2[b])) ? SufferNobly : TakeArms;

  12. #12
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    Rather than having to type everything every time, consider a test framework.
    Code:
    #include <stdio.h>
    int cml(char *p)
    {
      return 0;
    }
    
    int main()
    {
      char *tests[] = {
        "",
        "word",
        " word",
        "word ",
        " word ",
        "word word word",
        "word   word   word ",
      };
      size_t i;
      for ( i = 0 ; i < sizeof(tests)/sizeof(tests[0]) ; i++ ) {
        printf("String >>%s<< has %d words\n", tests[i], cml(tests[i]) );
      }
    
      return 0;
    }
    If you think of more test cases, it's dead easy to add them to the array.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  13. #13
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Bonus addition to Salem's testing code: store the expecte number of words along with the test sentences, and have the test code compare and print results for you:
    Code:
    struct words_test {
        char *sentence;
        int expected_words;
    };
    
    struct words_test tests[] = {
        {"", 0},
        {"word", 1},
        ...
        {"word  word  word", 3}
    };
    // in the loop
    int num_words = count_words(tests[i].sentence)
    if (num_words != tests[i].expected_words)
        printf("ERROR: >>%s<< expected %d words, got %d\n", tests[i].sentence, tests[i].expected_words, num_words);
    else
        printf("SUCCESS: >>%s<< has %d words\n", tests[i].sentence, num_words);

  14. #14
    Registered User
    Join Date
    Nov 2014
    Posts
    45
    Hello again. Thanks all of you for your helping.

    Yarin My nationally is secret. I am so sorry for this. But My English is not good, but I can read an article fluently.
    Here is the my code.
    Please evaluate my code.
    Code:
    #include <stdio.h>#include <stdlib.h>
    int wordCounter(char *);
    int main() {
    	
    	char a[50];
    	fgets(a,50,stdin);
    	printf("%d",wordCounter(a));
    	return 0;
    }
    
    
    int wordCounter(char *word)
    {
    	int flag=0;
    	int counter=0;
    	for(;*word!='\0';word++)
    	{
    		if(*word!=' ' && flag==0)
    		{
    			++counter;
    			flag=1;	
    		}
    		else if(*word==' ')
    		{
    			flag=0;
    		}
    	}
    	return counter;
    }

  15. #15
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Much better overall. Good, consistent formatting and indentation, it looks like a correct implementation, though I wont bother testing it since Salem and I gave you example code to test it and verify it works.

    I already mentioned this, but here it is again, with more details in case you had trouble understanding the first time. Your variable names are still horrible. Give them names that describe the data they store, or what their purpose is. Also, don't be lazy, i.e. don't pick a short name to avoid typing. Never, ever sacrifice clarity to save a few keystrokes.

    • a tells you nothing about the variable. reserve single-letter variable names for array indices, loop counters and mathematical/scientific calculations where it mimics a standard variable/constant used in that field. Instead, call it something like sentence.
    • word is sufficiently descriptive, but it's inaccurate. The parameter is not a word, but a sentence, or phrase, or some text. word implies there is a single word in that parameter. If there was a single word, then I would expect word to always return a value of 1.
    • flag tells you a little bit about the variable, but is only a tiny bit more useful than naming an int variable "number". It just reaffirms the type of the variable, but omits it's higher-level usage and purpose. What is it a flag of? What would you do if you needed two or more flag variables in one function? How about calling it in_word.
    • counter is about as useful as flag. What is it counting? How about word_count or num_words.


    fgets stores the newline (from when the user presses enter at the end of inputting their sentence). You may want to trim it (see the strchr method here).
    You should strongly consider functions like isspace(), isalpha() and isalnum() for checking if a character is a valid word or not. For example, what would your code do if there was a tab character between two words? What if a sentence had a stand-alone punctuation mark, surrounded by spaces, like a - dash or & ampersand?

    Always use constants instead of magic numbers.
    Code:
    #define MAX_SENTENCE_LEN 256  // 50 is not very long
    char sentence[MAX_SENTENCE_LEN];
    Use sizeof in fgets. That way, if you change the size of the sentence variable, you wont have to worry about changing (or forgetting to change) the fgets call.
    Code:
    char sentence[MAX_SENTENCE_LEN];
    fgets(sentence, sizeof(sentence), stdin);
    Purely a matter of style/convenience: if you place the wordCounter function above main, you will no longer need to declare a prototype.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Reverse the words of a sentence.
    By Mr.Lnx in forum C Programming
    Replies: 9
    Last Post: 04-24-2012, 02:15 AM
  2. how to split a sentence to words
    By sivapc in forum C++ Programming
    Replies: 13
    Last Post: 09-28-2009, 01:21 AM
  3. Dividing sentence into words
    By Lord CyKill in forum C Programming
    Replies: 1
    Last Post: 11-02-2003, 07:25 AM
  4. How to count the words in sentence ?
    By Th3-SeA in forum C Programming
    Replies: 1
    Last Post: 10-01-2003, 01:34 AM
  5. Searching for words within a sentence
    By drdroid in forum C++ Programming
    Replies: 4
    Last Post: 02-27-2003, 12:09 AM