# Trees

• 03-28-2004
samtay
Trees
Hi people,

How do I solve the following problem in C++ using trees ?

I need to determin the frequency and location of words in a document. After that list in alphabetical order the words and the reference to each line the word is used.

for example,

peter piper picked a peck of pickled peppers. a peck of pickled
peppers peter piper picked. if peter piper picked a peck of pickeled peppers, where is the peck of that peter piper picked.

it should display the folllowing information :

.
.
.
peppers 3:1, 2, 3
peter 4:1, 2,3
<word> <count>:<line number>

i need to do this using trees date structure.

pls assist.
• 03-28-2004
Salem
Code:

```class tree {   private:     tree *left; // sub-tree of words before this word     tree *right; // sub-tree of words after this word     int num_occurances; // how many times seen     string word;  // the word in question     vector <int> lines; // the lines it appears on   public: }```
• 03-28-2004
samtay
thanks salem,

but how to code the algorithm .... thats what i dunno.

Quote:

Originally Posted by Salem
Code:

```class tree {   private:     tree *left; // sub-tree of words before this word     tree *right; // sub-tree of words after this word     int num_occurances; // how many times seen     string word;  // the word in question     vector <int> lines; // the lines it appears on   public: }```

• 03-28-2004
Salem
You have 4 cases
node == NULL
-> add a new node here, copy the word and line number, and set the num_occurances to 1

node != NULL && word < node.word
-> recurse down the left branch

node != NULL && word > node.word
-> recurse down the right branch

node != NULL && word == node.word
-> add the line number to the list, and increment the counter
• 03-28-2004
samtay
pardon mefor my ignorance....how do you get the line number?
can you give me the actual coding?

and after inserting every word how to i display it in alphabetical order?

Quote:

Originally Posted by Salem
You have 4 cases
node == NULL
-> add a new node here, copy the word and line number, and set the num_occurances to 1

node != NULL && word < node.word
-> recurse down the left branch

node != NULL && word > node.word
-> recurse down the right branch

node != NULL && word == node.word
-> add the line number to the list, and increment the counter

• 03-28-2004
WebSnozz
Try a search on google for "tree traversal" frequency "discrete math"

One way you could use the strcture described and start at the root of the tree.
I would add a function called
addWord(const string & newword, int linenumber);

In this function you would start at the root of the tree
you need to check for 3 conditions, either
if newword==treenode.word, then num_occurences+=1; lines.PushBack(linenumber); return;

if newword<treenode.word, then traverse to the left node
if newword>treenode.word, then traverse to the right node

You would continue checking and traversing until you either find the word already in the tree OR
If you get to a "leaf"(a leaf is a term used to describe a node in a tree that had no left or right subtrees, in other words the pointers left and right are null) and don't find the word in the tree, then that means you need to add a new node for that word. You would do a lessthan greaterthan check against the current node to see if you need to add the new node to the right or left.

Once you have processed all the words in your file, then it shouldn't be a hard task to output all the data in your tree one node at a time.

You need to read up on tree traversal.

If you aren't familier with linked list, then it is important you understand that concept first and how to implement them. This will give you the skills you need to deal with pointers and nodes in a tree.

Good luck!
• 03-28-2004
samtay
can someone show me the code for retrieving line number ... can show me the whole segment of codes on how to read in the text?

and also to print out in alphabetical order

Quote:

Originally Posted by WebSnozz
Try a search on google for "tree traversal" frequency "discrete math"

One way you could use the strcture described and start at the root of the tree.
I would add a function called
addWord(const string & newword, int linenumber);

In this function you would start at the root of the tree
you need to check for 3 conditions, either
if newword==treenode.word, then num_occurences+=1; lines.PushBack(linenumber); return;

if newword<treenode.word, then traverse to the left node
if newword>treenode.word, then traverse to the right node

You would continue checking and traversing until you either find the word already in the tree OR
If you get to a "leaf"(a leaf is a term used to describe a node in a tree that had no left or right subtrees, in other words the pointers left and right are null) and don't find the word in the tree, then that means you need to add a new node for that word. You would do a lessthan greaterthan check against the current node to see if you need to add the new node to the right or left.

Once you have processed all the words in your file, then it shouldn't be a hard task to output all the data in your tree one node at a time.

You need to read up on tree traversal.

If you aren't familier with linked list, then it is important you understand that concept first and how to implement them. This will give you the skills you need to deal with pointers and nodes in a tree.

Good luck!

• 03-28-2004
Salem
> can someone show me the code for retrieving line number
You count them as you read them
Code:

```while ( file.getline(string) ) {   linenum++;   // now split string into words, and add each one }```
> and also to print out in alphabetical order
The tree makes in in alphabetical order, that's what the left/right decisions you made previously ensure.
It's just another recursive function based on
- print left tree
- print current node
- print right tree