Problem: Counting words in a file
Given a file, we can ask various questions about the words in it:
* How many words are there altogether?
* How many different words are there?
* How many times does each word appear?
* What lines do the words appear on?
Each of these questions takes a little more work than the one before. We are going to answer the last two questions. As discussed below, for full credit you will have to keep track of the lines that the words appear on. However, you will get most of the credit for the assignment if you just keep track of how often the word occurs (and this is much easier).
Details
* Call a function to identify yourself. This makes life easier for the grader. It should display your name, utopia account, which homework this is, the course and lecture section, and finally the semester.
* Ask the user for the file to read. If the file won't open, then ask again (and again...).
* Keep track of whatever you need to know about a word, such as what lines it shows up on, in a struct called Word.
o Ideally we would like words to match up with our idea of a word, e.g. no "inappropriate" punctuation, but that makes things harder. To keep this simple, you may assume that a word is anything "delimited by whitespace". That's convenient since that's what you get using the input operator on strings.
o It would also be a little harder if you had to worry about case (i.e. uppercase vs. lowercase). You may ignore the issue of case and simply treat "The" as a different word from "the".
* Display the most frequent word, its frequency and the list of line numbers that it appears on.
o If a word appears more than once on a line, list it multiple times.
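The bookkeeping described in the list above might be sketched as follows. This is only one possible arrangement, not the required one: the field names text and lines, and the helpers openWithRetry and recordOccurrence, are assumptions made for illustration. The sketch treats words case-sensitively and stores one line number per occurrence, so a word that appears twice on a line is listed twice.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <ostream>
#include <string>
#include <vector>
using namespace std;

// Everything tracked about one distinct word (field names are assumptions).
struct Word {
    string text;        // the word exactly as read, case preserved
    vector<int> lines;  // line number of every occurrence, repeats included
};

// Prompt on `out` and read file names from `in` until one opens.
// Gives up (leaving ifs closed) only if `in` runs out of input.
void openWithRetry(ifstream& ifs, istream& in, ostream& out) {
    string filename;  // the name most recently typed by the user
    out << "Input file name? ";
    // Keep asking until a file opens successfully.
    while (in >> filename) {
        ifs.clear();  // reset the error state left by a failed open
        ifs.open(filename.c_str());
        if (ifs.is_open()) {
            return;
        }
        out << "Input file name? ";
    }
}

// Record that `token` was seen on line `lineNumber`.
// Appends to the matching Word if there is one, otherwise adds a new Word.
void recordOccurrence(vector<Word>& words, const string& token, int lineNumber) {
    // Linear search for an entry with the same (case-sensitive) text.
    for (size_t i = 0; i < words.size(); ++i) {
        if (words[i].text == token) {
            words[i].lines.push_back(lineNumber);
            return;
        }
    }
    Word fresh;  // first time this word has been seen
    fresh.text = token;
    fresh.lines.push_back(lineNumber);
    words.push_back(fresh);
}
```

With this layout the frequency of a word is simply lines.size(), so no separate counter is needed. A getInputStream(ifs) function could call openWithRetry(ifs, cin, cout).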
Example
The output for the file Jabberwock.txt should look like this:
Input file name? Jabberwock.txt
"the" occurs 14 times.
Lines: 1, 2, 3, 4, 6, 8, 12, 13, 18, 26, 31, 32, 33, 34.
Or like this, if you treat uppercase and lowercase letters as the same:
Input file name? Jabberwock.txt
"the" occurs 18 times.
Lines: 1, 2, 3, 4, 6, 7, 7, 8, 9, 12, 13, 17, 18, 26, 31, 32, 33, 34.
I expect you to produce exactly the same output. (Plus the information identifying who you are.)
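Since the output has to match exactly, the punctuation deserves some care: the line numbers are separated by a comma and a space, and the list ends with a period. A minimal sketch of one way to build those two lines is shown here. The function name formatReport is hypothetical; it returns a string so the result is easy to check, and a displayWord function could simply send that string to cout.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// formatReport (hypothetical helper)
// Parameters: text  - the word being reported
//             lines - the line number of each occurrence, repeats included
// Returns:    the two report lines, each ending in a newline
string formatReport(const string& text, const vector<int>& lines) {
    ostringstream out;  // collects the formatted text

    out << '"' << text << "\" occurs " << lines.size() << " times.\n";
    out << "Lines: ";
    // Print each line number, with ", " between numbers but not after the last.
    for (size_t i = 0; i < lines.size(); ++i) {
        if (i > 0) {
            out << ", ";
        }
        out << lines[i];
    }
    out << ".\n";
    return out.str();
}
```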
Program Design
Good program design separates parts of the program into functions. Here I will provide you with main():
int main() {
    displayInfo();                          // Display who I am
    ifstream ifs;                           // to handle the input text file
    getInputStream(ifs);                    // Ask the user for the filename and open it.
    vector<Word> words;                     // for collecting information about the words
    getWords(ifs, words);                   // Collect all the information
    ifs.close();                            // Good habit to close files when you are done with them.
    displayWord( getMostFrequent(words) );  // Display most frequent word's info
}
We won't generally give you this much detail on how to organize your programs, but for the first assignment we want to make sure you understand exactly what we're looking for.
I can't think of any reason why you should need to modify this version of main(). If you think that you need to, I suggest that you speak with me first.
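The calls in main() suggest a set of prototypes along the following lines. These signatures are inferred from how main() uses the functions and are not given in the assignment, so treat them as assumptions; the Word fields text and lines are likewise illustrative. The sketch also includes one possible getMostFrequent, which returns the first word with the highest count and an empty Word if there are no words at all.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>
using namespace std;

// Information about one distinct word (field names are assumptions).
struct Word {
    string text;        // the word itself
    vector<int> lines;  // line number of every occurrence
};

// Prototypes inferred from the calls in main().
void displayInfo();                                // identify the programmer
void getInputStream(ifstream& ifs);                // ask for a file name until one opens
void getWords(ifstream& ifs, vector<Word>& words); // fill words from the file
Word getMostFrequent(const vector<Word>& words);   // pick the most frequent word
void displayWord(const Word& word);                // print one word's information

// getMostFrequent
// Parameters: words - all the distinct words collected from the file
// Returns:    a copy of the word with the most occurrences; on a tie,
//             the one that appears earliest in words; an empty Word if
//             words is empty
Word getMostFrequent(const vector<Word>& words) {
    Word best;  // most frequent word seen so far

    // Compare every word's occurrence count against the best so far.
    for (size_t i = 0; i < words.size(); ++i) {
        if (words[i].lines.size() > best.lines.size()) {
            best = words[i];
        }
    }
    return best;
}
```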
Program "Layout"
Neatness counts. So does organization. This program (and all your programs) should conform to the following layout:
1. At the top of the file, always place a comment identifying:
* The name of the file
* Who you are
* Your email
* The course and section
* The semester
2. Includes.
* Only specify includes that you actually use.
* Follow this by any using directives or declarations such as "using namespace std;"
3. Type definitions (e.g. structs).
4. Function prototypes
5. main
6. Other function definitions
Comments
Note that many of the points that students lose on homework are for lack of comments.
All your programs should be well commented. At the very least
* every file must (as noted above) have a comment at the beginning, stating
o the name of the file,
o the name and utopia account of the programmer
o the course, lecture section and semester.
* every function should have a comment to state what its parameters are, what it does, and what if anything it returns.
o It is somewhat a matter of style when writing your program in a single file whether the comment should go on the prototype or the function definition. Personally, in this case I prefer to put it with the definitions. But later when we use header files, you will want brief documentation with the prototype. For this assignment, I leave it to your taste. But be sure to have a good comment.
* every loop should have a comment describing its purpose.
* every variable should have a comment to state its purpose. (Variables used as for-loop iterators are usually described adequately by the comment explaining the loop.)
In addition any "tricky" code should be commented if its purpose isn't immediately obvious. [We may ask you to explain parts of your program to us. If the code isn't commented and you have any difficulty immediately explaining the code, then it wasn't sufficiently commented.]
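As an illustration of the commenting rules above, here is a small sketch with a file comment, a function comment, a loop comment and a variable comment. The function countOnLine is a made-up example rather than a required part of the assignment, and the angle-bracket entries in the file comment are placeholders to fill in.

```cpp
/*
  File:    <file name>
  Author:  <your name>, <your utopia account>
  Course:  <course and lecture section>, <semester>
  Purpose: Shows the commenting style expected in this course.
*/
#include <cstddef>
#include <vector>
using namespace std;

// countOnLine
// Parameters: lines      - the line numbers on which a word occurs, with a
//                          number repeated for each extra occurrence on a line
//             lineNumber - the line being asked about
// Returns:    how many times the word occurs on lineNumber
int countOnLine(const vector<int>& lines, int lineNumber) {
    int count = 0;  // occurrences found on lineNumber so far

    // Check every recorded occurrence against the requested line.
    for (size_t i = 0; i < lines.size(); ++i) {
        if (lines[i] == lineNumber) {
            ++count;
        }
    }
    return count;
}
```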
Credit
Your program must compile and run to get credit. Still, it is better to turn in something than nothing. At least I will know that you tried. Late homework will lose 10% per day. After five days it will only be accepted at the discretion of the instructor and will receive a maximum of 50%.
Partial Credit
If you can't figure out how to keep track of line numbers, then leave out that part of the problem. This will not earn you full credit, but it will be a lot better than something that doesn't work. To keep track of line numbers (without a lot of work), you will want to use "string streams", which were not covered in CS1114. They are easy to use (just look at the linked example), but they are something new. Doing the homework without line numbers is an easy CS1114-level assignment and might even be a good first pass at getting this assignment done. (Note that it is a common observation in industry that your first attempt at a project is often thrown away.)
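A minimal sketch of the string stream idea: read the file one line at a time with getline, then wrap each line in an istringstream and pull whitespace-delimited words out of it with the input operator. The Token struct and the function tokenizeWithLines are hypothetical names used only for this illustration. The function takes a plain istream so that it works with an ifstream as well.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// One word occurrence together with the line it was found on.
struct Token {
    string text;  // the whitespace-delimited word
    int line;     // 1-based line number where it appeared
};

// tokenizeWithLines (hypothetical helper)
// Parameters: in - an open input stream, e.g. an ifstream
// Returns:    every word in the stream, in order, tagged with its line number
vector<Token> tokenizeWithLines(istream& in) {
    vector<Token> tokens;  // all occurrences found so far
    string line;           // the line currently being split
    int lineNumber = 0;    // number of lines read so far

    // Read one line at a time so we always know the current line number.
    while (getline(in, line)) {
        ++lineNumber;  // blank lines are counted too
        istringstream lineStream(line);  // lets us use >> on just this line
        string word;                     // the next word on this line

        // Pull out each whitespace-delimited word on the line.
        while (lineStream >> word) {
            Token t;
            t.text = word;
            t.line = lineNumber;
            tokens.push_back(t);
        }
    }
    return tokens;
}
```

Each Token could then be merged into the vector of Word, either as it is read or afterwards.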
Here is what the file Jabberwock.txt should contain (the blank line after each stanza counts as a line, which is what the line numbers in the example output assume):
Twas brillig and the slithy toves
Did gyre and gimble in the wabe
All mimsy were the borogroves
And the mome rathes outgrabe

Beware the Jabberwock my son
The jaws that bite The claws that catch
Beware the JubJub bird and shun
The frumious Bandersnatch

He took his vorpal sword in hand
Long time the manxsome foe he sought
So rested he by the Tumtum tree
And stood awhile in thought

And as in uffish thought he stood
The Jabberwock with eyes of flame
Came whiffling through the tulgey wood
And burbled as it came

One two One two and through and through
His vorpal blade went snicker snack
He left it dead and with its head
He went gallumphing back

And hast thou slain the Jabberwock
Come to my arms my beamish boy
Oh frabjous day Callooh Callay
He chortled in his joy

Twas brillig and the slithy toves
Did gyre and gimble in the wabe
All mimsy were the borogroves
And the mome rathes outgrabe