Problem: Counting words in a file

Given a file, we can ask various questions about the words in it:

* How many words are there all together?
* How many different words are there?
* How many times does each word appear?
* What lines do the words appear on?

Each of these questions takes a little more work than the one before. We are going to answer the last two questions. As discussed below, for full credit you will have to keep track of the lines that the words appear on. However, you will get most of the credit for the assignment if you just keep track of how often the word occurs (and this is much easier).
Details

* Call a function to identify yourself; this makes life easier for the grader. It should display your name, utopia account, which homework this is, the course and lecture section, and finally the semester.
* Ask the user for the file to read. If the file won't open, then ask again (and again...).
* Keep track of whatever you need to know about a word, such as what lines it shows up on, in a struct called Word.
  o Ideally we would like words to match up with our idea of a word, e.g. no "inappropriate" punctuation, but that makes things harder. To keep this simple, you may assume that a word is anything "delimited by whitespace". That's convenient since that's what you get using the input operator on strings.
  o It would also be a little harder if you had to worry about case (i.e. uppercase vs. lowercase). You may ignore the issue of case and therefore assume that "The" is different from "the".
* Display the most frequent word, its frequency and the list of line numbers that it appears on.
  o If a word appears more than once on a line, list it multiple times.
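
As a rough sketch of the bookkeeping described above, a Word could hold the word's text plus one line number per occurrence, so the frequency is simply the length of that list. Only the struct name Word comes from the assignment; the field names and the helper recordOccurrence are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Everything we track about one distinct word.
// Field names are assumptions; the assignment only fixes the name "Word".
struct Word {
    std::string text;        // the word exactly as read (case-sensitive)
    std::vector<int> lines;  // one entry per occurrence, so repeats on a line repeat here
};

// Record that `token` was seen on line `lineNumber`.
// Uses a linear search, which is fine for a short poem but slow for big files.
void recordOccurrence(std::vector<Word>& words, const std::string& token, int lineNumber) {
    // Look for an existing entry for this word.
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (words[i].text == token) {
            words[i].lines.push_back(lineNumber);  // seen before: add another line entry
            return;
        }
    }
    Word fresh;                                    // first sighting: start a new entry
    fresh.text = token;
    fresh.lines.push_back(lineNumber);
    words.push_back(fresh);
}
```

With this layout, words[i].lines.size() is how often the word occurs, and "The" and "the" end up in separate entries because the comparison is case-sensitive.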

Example

The output for the file Jabberwock.txt should look like this:

Input file name? Jabberwock.txt
"the" occurs 14 times.
Lines: 1, 2, 3, 4, 6, 8, 12, 13, 18, 26, 31, 32, 33, 34.

Or this, if you are ignoring case (that is, counting "The" and "the" as the same word):

Input file name? Jabberwock.txt
"the" occurs 18 times.
Lines: 1, 2, 3, 4, 6, 7, 7, 8, 9, 12, 13, 17, 18, 26, 31, 32, 33, 34.

I expect you to produce exactly the same output. (Plus the information identifying who you are.)
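
One way to get the commas and the final period exactly right is to build the report as a string first and print it afterwards. The sketch below assumes a Word with fields text and lines (those field names are not given in the assignment), and formatWord is a hypothetical helper; only displayWord is named in the assignment.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Assumed layout: the word and one line number per occurrence.
struct Word {
    std::string text;
    std::vector<int> lines;
};

// Build the two report lines for one word, in the format of the example output.
std::string formatWord(const Word& word) {
    std::ostringstream out;
    out << '"' << word.text << "\" occurs " << word.lines.size() << " times.\n";
    out << "Lines: ";
    // Print the line numbers separated by ", " with no separator after the last one.
    for (std::size_t i = 0; i < word.lines.size(); ++i) {
        if (i > 0) {
            out << ", ";
        }
        out << word.lines[i];
    }
    out << ".\n";
    return out.str();
}

// Display one word's report on standard output.
void displayWord(const Word& word) {
    std::cout << formatWord(word);
}
```

For a Word holding "the" with lines 1, 2, 2, formatWord produces the line "the" occurs 3 times. (with the quotes) followed by Lines: 1, 2, 2.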
Program Design

Good program design separates parts of the program into functions. Here I will provide you with main():

int main() {
    displayInfo();                          // Display who I am
    ifstream ifs;                           // to handle the input text file
    getInputStream(ifs);                    // Ask the user for the filename and open it.
    vector<Word> words;                     // for collecting information about the words
    getWords(ifs, words);                   // Collect all the information
    ifs.close();                            // Good habit to close files when you are done with them.
    displayWord( getMostFrequent(words) );  // Display most frequent word's info
}

We won't generally give you this much detail on how to organize your programs, but for the first assignment we want to make sure you understand exactly what we're looking for.

I can't think of any reason why you should need to modify this version of main(). If you think that you need to, I suggest that you speak with me first.
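
Of the functions that main() calls, getMostFrequent is probably the shortest. A minimal sketch follows, assuming a Word stores one line number per occurrence in a field called lines (the field names are assumptions, and so is the choice to return the first word when there is a tie).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumed layout: the word and one line number per occurrence.
struct Word {
    std::string text;
    std::vector<int> lines;
};

// Return the word with the most occurrences.
// On a tie the earliest word in the vector wins; an empty vector yields an empty Word.
Word getMostFrequent(const std::vector<Word>& words) {
    Word best;  // starts with no text and no lines
    // Keep whichever word has the longest list of occurrences so far.
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (words[i].lines.size() > best.lines.size()) {
            best = words[i];
        }
    }
    return best;
}
```

Returning the Word by value keeps the call displayWord( getMostFrequent(words) ) in main() working unchanged.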
Program "Layout"

Neatness counts. So does organization. This program (and all your programs) should conform to the following layout:

1. At the top of the file, always place a comment identifying
   * The name of the file
   * Who you are
   * The course and section
   * The semester
2. Includes.
   * Only specify includes that you actually use.
   * Follow this by any using directives or declarations such as "using namespace std;"
3. Type definitions (e.g. structs).
4. Function prototypes
5. main
6. Other function definitions

Note that many of the points that students lose on homework are for lack of comments.

All your programs should be well commented. At the very least

* every file must (as noted above) have a comment at the beginning, stating
  o the name of the file,
  o the name and utopia account of the programmer,
  o the course, lecture section and semester.
* every function should have a comment to state what its parameters are, what it does, and what if anything it returns.
  o When writing your program in a single file, it is somewhat a matter of style whether the comment goes on the prototype or the function definition. Personally, in this case I prefer to put it with the definitions. But later when we use header files, you will want brief documentation with the prototype. For this assignment, I leave it to your taste. But be sure to have a good comment.
* every loop should have a comment describing its purpose.
* every variable should have a comment to state its purpose. (Variables used as for-loop iterators are usually described adequately by the comment explaining the loop.)

In addition any "tricky" code should be commented if its purpose isn't immediately obvious. [We may ask you to explain parts of your program to us. If the code isn't commented and you have any difficulty immediately explaining the code, then it wasn't sufficiently commented.]
Credit

Your program must compile and run to get credit. Still, it is better to turn in something than nothing. At least I will know that you tried. Late homeworks will lose 10% per day. After five days they will only be accepted at the discretion of the instructor and will receive a maximum of 50%.
Partial Credit

If you can't figure out how to keep track of line numbers, then leave out that part of the problem. This will not earn you full credit, but it will be a lot better than something that doesn't work. To keep track of line numbers (without a lot of work), you will want to use "string streams", which were not covered in CS1114. They are easy to use (just look at the linked example), but they are something new. Doing the homework without line numbers is an easy CS1114-level assignment and might even be a good first pass for getting this assignment done. (Note that it is a common observation in industry that your first attempt at a project is often thrown away.)
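
The string stream idea can be sketched as follows: read the file one line at a time with getline, count the lines, and wrap each line in an istringstream so that the usual >> operator pulls out the whitespace-delimited words. This is only one possible shape for getWords; the Word field names are assumptions, and the function takes a general std::istream so that it works with the ifstream from main() as well as with other streams.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assumed layout: the word and one line number per occurrence.
struct Word {
    std::string text;
    std::vector<int> lines;
};

// Read every line of `in` and record each word together with its line number.
void getWords(std::istream& in, std::vector<Word>& words) {
    std::string line;    // the current line of text
    int lineNumber = 0;  // 1-based number of the current line
    // Outer loop: one pass per line of the file, blank lines included.
    while (std::getline(in, line)) {
        ++lineNumber;
        std::istringstream lineStream(line);  // lets us use >> on just this line
        std::string token;                    // the current word
        // Inner loop: one pass per whitespace-delimited word on this line.
        while (lineStream >> token) {
            bool found = false;
            // Look for an existing entry for this word.
            for (std::size_t i = 0; i < words.size(); ++i) {
                if (words[i].text == token) {
                    words[i].lines.push_back(lineNumber);
                    found = true;
                    break;
                }
            }
            if (!found) {                     // first sighting: start a new entry
                Word fresh;
                fresh.text = token;
                fresh.lines.push_back(lineNumber);
                words.push_back(fresh);
            }
        }
    }
}
```

Because the line counter is advanced for every line that getline returns, blank lines between stanzas still count toward the line numbers.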

Here is what the file Jabberwock.txt should contain:

Twas brillig and the slithy toves
Did gyre and gimble in the wabe
All mimsy were the borogroves
And the mome rathes outgrabe

Beware the Jabberwock my son
The jaws that bite The claws that catch
Beware the JubJub bird and shun
The frumious Bandersnatch

He took his vorpal sword in hand
Long time the manxsome foe he sought
So rested he by the Tumtum tree
And stood awhile in thought

And as in uffish thought he stood
The Jabberwock with eyes of flame
Came whiffling through the tulgey wood
And burbled as it came

One two One two and through and through
His vorpal blade went snicker snack
He left it dead and with its head
He went gallumphing back

And hast thou slain the Jabberwock
Come to my arms my beamish boy
Oh frabjous day Callooh Callay
He chortled in his joy

Twas brillig and the slithy toves
Did gyre and gimble in the wabe
All mimsy were the borogroves
And the mome rathes outgrabe

2. Post your attempts... without a fair amount of code, your book will be a better resource than us, since we refuse to do homework for people.

Nope.

4. The assignment gives you steps to follow. Do each step one at a time, compiling and testing each step as you go (and sometimes in between steps). Post code and ask specific questions if you are stuck.

5. Don’t panic!

It looks like this isn’t your first programming class. (If it is your first C/C++ class, then it’s OK to panic!) Was any of this covered in your previous class, or in the lecture or current reading assignment?

- You should know how to ask the user for a filename, and open the file. You should know how to try again if the user enters an invalid filename.

- You should know how to read in the file, perhaps one line at a time.

- You should know how to look for spaces (or other whitespace and non-alpha characters) and break the string into words. (He says you can ignore punctuation, and there is none in the file anyway.)

- You should know how to compare words, and find matching and non-matching words.

- You should know how to make a list of words.

- You should know how to make a loop, counting the number of matches for each word.

Your program must compile and run to get credit. Still, it is better to turn in something than nothing. At least I will know that you tried.
Make sure you turn in something. This is always the case when your assignment consists of one "problem" or one question (or a few problems/questions). There will be times when you can’t get your program 100% working, and you will get some credit (maybe 95% sometimes), even if you don’t meet all of the requirements.

Don't try to write the whole program at once. Start with a small program that compiles and does something. Then, add "features" to meet the assignment requirements one at a time.