Thread: Counting input words

  1. #1
    Registered User
    Join Date
    Jan 2008
    Posts
    11

    Counting input words

    So, this is the assignment:

    3-3. Write a program to count how many times each distinct word appears in its input.

    You might recognize it from Accelerated C++.

    I've gotten as far as declaring the vector and filling it with one element per word of input:

    Code:
        vector<string> skrivning;
        string x;
        while (cin >> x)
        {
              skrivning.push_back(x);
        }
    What I'm not sure how to do is actually look at the stored words and compare their contents. Any help?

  2. #2
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    Perfect problem for a map<string,int> container.
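    A minimal sketch of the map idea, for illustration only. The helper name count_words and the choice to pass the words in as a vector are assumptions, not something from the thread:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (name is illustrative): tally each distinct word.
std::map<std::string, int> count_words(const std::vector<std::string>& words)
{
    std::map<std::string, int> counts;
    for (std::vector<std::string>::size_type i = 0; i != words.size(); ++i)
    {
        // operator[] creates a missing entry with value 0 before the increment
        ++counts[words[i]];
    }
    return counts;
}
```

    Iterating over the returned map visits the words in sorted order, so printing each key and its value gives the report directly.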

  3. #3
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    A map is great for this, but if you haven't learned map in the book yet, have you learned sort?

  4. #4
    Registered User
    Join Date
    Jan 2008
    Posts
    11
    Quote Originally Posted by Daved View Post
    A map is great for this, but if you haven't learned map in the book yet, have you learned sort?
    You're right, map hasn't been introduced yet, though sort has. I'm guessing what you're getting at is to use sort to organise the strings in a decreasing (or non-decreasing) order, but how would I pick them out in a proper way?

  5. #5
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    Once they're sorted, you can just go through the vector from start to finish and keep a count. If the current word is the same as the previous word, then increment the count. If it is different than the previous word, then restart the count for the new word.
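    The sort-then-walk idea described above can be sketched roughly like this. The function name count_runs and the returned vector of pairs are assumptions made so the result can be inspected; printing inside the else branch would work just as well:

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (word, count) for each distinct word.
// The vector is taken by value so the caller's copy is not reordered.
std::vector<std::pair<std::string, int> > count_runs(std::vector<std::string> words)
{
    std::vector<std::pair<std::string, int> > result;
    if (words.empty())
        return result;                 // nothing to count

    std::sort(words.begin(), words.end());

    std::string lastWord = words[0];   // the first word starts the first run
    int count = 1;
    for (std::vector<std::string>::size_type u = 1; u < words.size(); ++u)
    {
        if (words[u] == lastWord)
        {
            ++count;                   // same word as before: extend the run
        }
        else
        {
            result.push_back(std::make_pair(lastWord, count));
            lastWord = words[u];       // new word: restart the count
            count = 1;
        }
    }
    result.push_back(std::make_pair(lastWord, count)); // flush the final run
    return result;
}
```

    Note the extra push_back after the loop: the last run never hits the "different word" branch, so it has to be recorded separately.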

  6. #6
    Registered User
    Join Date
    Jan 2008
    Posts
    11
    Code:
    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>
    
    using namespace std;
    
    int main()
    {
        // initialize the vector
        vector<string> skrivning;
        string x;
        cout << "Enter a sentence of your choice.";
        while (cin >> x)
        {
              skrivning.push_back(x);
        }
        
        // sort and initialize necessary variables
        sort(skrivning.begin(), skrivning.end());
        int u, count = 0;
        int lastWord = skrivning[u-1];
           
        // count the word appearances
        for(u = 0; u < skrivning.size(); u++)
        {         
              if (u == 0)
              {
              break;
              } 
              else if(skrivning[u] == lastWord)
              {
    This is about as far as I've come. Am I on the right track? I can't think of a way to keep that count for each word correctly other than in another vector, and how would I sort the counts out once stored?

  7. #7
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    Do you need to sort the counts? I don't think you do. Do you even need to store the counts? I think that when you find that skrivning[u] != lastword you can just output lastword and the count before starting the new count.

    Also note that you have a few bugs in the code as it is. Make sure to initialize u if you keep it, you probably want continue instead of break, and the first value for lastword should be skrivning[0] if you're going to initialize it with something (you can also get it to work with it starting out empty if you want).

  8. #8
    Registered User
    Join Date
    Jan 2008
    Posts
    11
    Quote Originally Posted by Daved View Post
    Do you need to sort the counts? I don't think you do. Do you even need to store the counts? I think that when you find that skrivning[u] != lastword you can just output lastword and the count before starting the new count.

    Also note that you have a few bugs in the code as it is. Make sure to initialize u if you keep it, you probably want continue instead of break, and the first value for lastword should be skrivning[0] if you're going to initialize it with something (you can also get it to work with it starting out empty if you want).
    The reason I check if u == 0 is that I don't want the loop to check for a match with the last word when it's on the first word anyway; that'd be kind of pointless.

    First value for lastword? I use it to compare skrivning[u] with the last word read, which is skrivning[u-1], so why shouldn't I initialize lastword to what's stated in the code above?

  9. #9
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    I know what you're doing. But the things I mentioned still apply.

    The check for u== 0 is fine, but you don't want break, that will end the loop, you want continue to skip the rest of the loop and start again.

    It's fine to store skrivning[u-1] in lastword, but not at the beginning. At the beginning u is uninitialized, so u-1 could be anything. Even if you initialize it to 0, u-1 would be -1, which is not valid.

    Remember, just because you make an assignment of lastword = skrivning[u-1], doesn't mean that lastword will get updated automatically every time u changes. C++ doesn't work like that. Execution goes in order from top to bottom and the assignment only happens that one time. You have to assign a new value to lastword each time u changes or you find a new word.
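    To make the "assignment only happens once" point concrete, here is a small sketch. The function name snapshot_demo and the sample words are made up for illustration:

```cpp
#include <string>
#include <vector>

// Hypothetical demo: lastWord is a one-time copy, not a live view of words[u-1].
std::string snapshot_demo()
{
    std::vector<std::string> words;
    words.push_back("apple");
    words.push_back("banana");
    words.push_back("cherry");

    std::vector<std::string>::size_type u = 1;
    std::string lastWord = words[u - 1];  // copies "apple" at this moment only

    u = 2;                                // moving u does not re-run the line above

    std::string updated = words[u - 1];   // an explicit new read is needed to follow u

    // lastWord still holds the old copy; updated holds the current words[u-1]
    return lastWord + "/" + updated;
}
```

    The same applies inside a loop: lastWord has to be reassigned in the loop body whenever it should change.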

  10. #10
    Registered User
    Join Date
    Jan 2008
    Posts
    11
    Code:
    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>
    
    using namespace std;
    
    int main()
    {
        // initialize the vector
        vector<string> skrivning;
        string x;
        cout << "Enter a sentence of your choice." << endl;
        while (cin >> x)
        {
              skrivning.push_back(x);
        }
        
        // sort and initialize necessary variables
        sort(skrivning.begin(), skrivning.end());
        int u = 0;
        string lastWord;
           
        // count the word appearances
        for(u = 0; u < skrivning.size(); u++)
        {         
              if(u == 0)
              {
              cout << skrivning[0] << endl;
              } 
              else if(skrivning[u] != lastWord)
              {
                    cout << skrivning[u] << endl;
                    lastWord = skrivning[u-1];
              } else {
                    cout << skrivning[u] << " " << lastWord << " Match!" << endl;              
              }
        }
        
        system("PAUSE");
        return 0;
    }
    Now I guess my problem is where to initialize the first lastWord = skrivning[u-1]. It doesn't fit anywhere it seems; putting it in the first line of the loop causes the program to crash.

    Also when there are more than two similar words entered, the program doesn't catch the third one.

  11. #11
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    You can set lastWord equal to the first string in the vector, and then start your loop with the second vector entry (if there is one, that is). Then there's no need for the u==0 test.

    Your prompt says "enter sentences", but your logic appears to be geared for "words". I see no point in sorting sentences.

    Todd

  12. #12
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    Each word in the sentence will be added to the vector separately, so it will work fine. Of course, the input only stops when the user inputs Ctrl-Z or Ctrl-D or whatever sequence indicates end-of-file on a console application, so maybe "text" or "paragraph" would be better than sentence.

  13. #13
    Registered User mikeman118's Avatar
    Join Date
    Aug 2007
    Posts
    183
    What I've done in the past is sort the vector, then use std::count (vector itself has no count() member) to find how many times the first element appears. I store that count and erase all the elements equal to it, then repeat the same procedure until I reach the end. I'm sure that the other ways would work as well, but I just thought I'd put in my two cents.
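    A rough sketch of that sort, count, erase procedure, assuming the free function std::count from <algorithm>. The function name count_and_erase and the returned pairs are illustrative choices, not from the post:

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: repeatedly count the front word, record it, erase its copies.
// Works on a by-value copy, so the caller's vector is left untouched.
std::vector<std::pair<std::string, int> > count_and_erase(std::vector<std::string> words)
{
    std::vector<std::pair<std::string, int> > result;
    std::sort(words.begin(), words.end());

    while (!words.empty())
    {
        const std::string first = words.front();
        // how many elements equal the current front word
        const int n = static_cast<int>(std::count(words.begin(), words.end(), first));
        result.push_back(std::make_pair(first, n));
        // after sorting, those n copies sit together at the front
        words.erase(words.begin(), words.begin() + n);
    }
    return result;
}
```

    Each pass rescans and shifts the remaining elements, so this does more work than a single walk over the sorted vector, though that hardly matters for small inputs.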

  14. #14
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Simplified. (I had to add a break since I can't figure out how to simulate EOF via the XCode Run Log... )
    Code:
    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>
    
    using namespace std;
    
    int main()
    {
        // initialize the vector
        vector<string> skrivning;
        string x;
        cout << "Enter a sentence of your choice." << endl;
        while (cin >> x)
        {
            if (x == "X") break;  // sentinel: stop reading when "X" is entered
            skrivning.push_back(x);
        }
        
        // sort and initialize necessary variables
        sort(skrivning.begin(), skrivning.end());
        int u = 0;
        int word_count = 1;
    
        // compare each word with the next one in the sorted vector
        for (u = 0; u < skrivning.size() - 1; u++)
        {
            if (skrivning[u] == skrivning[u + 1])
            {
                word_count++;
            }
            else
            {
                cout << "Word is " << skrivning[u] << ", count is " << word_count << endl;
                word_count = 1;
            }
        }
        // the last word is never followed by a different one, so print it here
        cout << "Word is " << skrivning[skrivning.size() - 1] << ", count is " << word_count << endl;
        system("PAUSE");
        return 0;
    }
    Todd

  15. #15
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    (And it doesn't check for the case where no words are entered.)
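    One possible guard for the empty case, as a sketch only. The helper name adjacent_pairs is hypothetical; the point is that size() is unsigned, so size() - 1 on an empty vector wraps around instead of becoming -1:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: how many adjacent pairs (u, u+1) a loop may safely compare.
std::vector<std::string>::size_type adjacent_pairs(const std::vector<std::string>& words)
{
    // For an empty vector, words.size() - 1 would wrap to a huge unsigned value,
    // so that case is handled before subtracting.
    return words.empty() ? 0 : words.size() - 1;
}
```

    Using a bound like this in the loop condition, and skipping the final output line when the vector is empty, would avoid indexing into a vector with no elements.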
