# word count problem

Show 80 post(s) from this thread on one page
Page 3 of 3 First 123
• 06-18-2010
laserlight
Quote:

Originally Posted by MK27
obviously you would only need a single map, not a vector + map there

If we really wanted to have a vector of words and a corresponding vector of counts, it would still be more time efficient to make use of a map or unordered_map to do the counting, and then transfer the result to the vectors in a single pass.

An efficient solution without using a map or unordered_map would be to read in the entire list of words into a vector, sort it, and then count the words in a single pass... but this may not be very space efficient.
• 06-18-2010
MK27
Binary tree!

Altho that's what std::map is, I believe (an RB tree).

Quote:

Originally Posted by laserlight
If we really wanted to have a vector of words and a corresponding vector of counts, it would still be more time efficient to make use of a map or unordered_map to do the counting,

I honestly don't think so, take a close look at how the count happens (this is why I said it had a certain elegance).

Code:

```          words_count [place] +=1 ;                   [...or...]             words.push_back (x) ;             words_count.push_back  (1) ;```
Presuming the iteration thru the vector would have to happen anyway (yielding place), this is very minimal.
• 06-18-2010
laserlight
Quote:

Originally Posted by MK27
Binary tree!

Altho that's what std::map is, I believe (an RB tree).

Yes.

Quote:

Originally Posted by MK27
I honestly don't think so, take a close look at how the count happens (this is why I said it had a certain elegance).
(...)
Presuming the iteration thru the vector would have to happen anyway (yielding place), this is very minimal.

No, it is still O(n^2) time versus O(n log n) time. You must take a close look at the loop, not just highlighting the parts that look nice.
• 06-18-2010
MK27
Quote:

Originally Posted by laserlight
No, it is still O(n^2) time

The part to do with the word vector is, the count vector is O(1) because it uses the same value:

Code:

```        for (teller =0; teller < size; teller ++) // this will result in O(n^2)         {             if (x == words [teller])             {                 found = 1 ;                 place = teller ;             }         }         if (found == 1)         {             found = 0 ; // this is O(1)             words_count [place] +=1 ;         }```
• 06-18-2010
laserlight
Quote:

Originally Posted by MK27
The part to do with the word vector is, the count vector is O(1) because it uses the same value:

Yes, but that O(1) is dominated by the O(n^2). You cannot increment the count, or set it to 1, without first finding the word, or discovering that it is not in the list. If you use a map, the counting is done in O(n log n) time, then the transfer to the vectors is done in O(n) time, which is dominated by the O(n log n).
• 06-18-2010
roelof
Hello,

This is way to high for me now but very interresting.

Roelof
• 06-18-2010
MK27
Quote:

Originally Posted by laserlight
Yes, but that O(1) is dominated by the O(n^2). You cannot increment the count, or set it to 1, without first finding the word, or discovering that it is not in the list. If you use a map, the counting is done in O(n log n) time, then the transfer to the vectors is done in O(n) time, which is dominated by the O(n log n).