Thread: Search Engine - Binary Search Tree

  1. #1
    Registered User
    Join Date
    Mar 2005
    Posts
    5

    Search Engine - Binary Search Tree

    I'm trying to make a small search engine using a binary search tree. The program reads in descriptions of books, stores each new book in a linked list, and each word of the description is stored in a binary search tree of that linked list. The input file I'm using is called "Book.txt" and looks like this:

    4
    Brock Read
    Virus_course
    The university teaches the course only to help those students
    who want to learn to fight viruses Mr Barker insists pointing
    to a growing movement in computer security that considers knowing
    how a virus is constructed to be an essential step in halting
    their spread The longer the industry ignores programs like
    Calgarys he argues the more structural and financial damage the
    virus will cause DOCUMENT_OVER

    Aquinas network
    Virus_Protection
    Hoax virus messages are spread by well meaning people who
    have heard about a virus usually contained in an email file
    that will do all sorts of damage when a certain email message
    is opened They want to inform their friends about it so they
    email everyone they can think of to warn them There are no
    viruses that are started when you open e-mail It is possible
    to get an email that has an infected attachment The attachment
    would contain either a program or a macro file that is infected
    but you would not get a virus even from opening the attachment DOCUMENT_OVER

    Richard Dawkins
    The_Selfish_Gene
    This book analyzes natural selection for a different
    point of view than the usual view It describes genes
    as being selfish in the gene pool and how thinking about
    genes as attempting to survive in many copies in the gene
    pool can do a better job of explaining the behavior of
    species than the usual selfish individual or group model
    can DOCUMENT_OVER

    Ekta Bowers
    Virus
    Virus spreads from one computer to another over networked
    lines virus writers exploit known bugs in installed software
    anti virus firms help us to get rid of virus DOCUMENT_OVER
    The first number tells how many different books are in the file, then authors name, then title of the book, followed by the description. DOCUMENT_OVER designates the end of a description. My code uses strncmp() to compare the word that was read in from the file to a string that contains "DOCUMENT_OVER" to end the book description, but for some reason DOCUMENT_OVER is being read into my BST and it shouldn't be... Then when I print the tree out it only prints "DOCUMENT_OVER" 5 times... I'm very new to C so ANY help or suggestions that you think may help me in the future would be much appreciated! Thanks!

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <malloc.h>
    
    struct treenode {
    	char *word;
    	int occurence;
    	struct treenode *left;
    	struct treenode	*right;
    }node;
    
    struct summary{
    	char *lname;
    	char *fname;
    	char *title;
    	struct treenode node;
    }info;
    
    struct ll{
    	struct summary info;
    	struct ll *next;
    }*List;
    
    int lladd(struct treenode *myroot, char fname[30], char lname[30], char title[30]);
    struct treenode* insert(struct treenode *root, struct treenode *temp);
    struct treenode* create_node(char val[30]);
    void inorder(struct treenode * current_ptr);
    
    int main(void){
    	int action=0, entries=0, words=0, i=0, j=0, str = 0, x=0, m=0;
    	char over[30] = {'D','O','C','U','M','E','N','T','_','O','V','E','R'};
    	char **query, filename[30], word[30], fname[30], lname[30], title[30];
    	FILE *fp;
    	struct treenode *temp= NULL, *root = NULL;
    
    
    	printf("Which action would you like to perfrom?\n"
    		" 1) Loading a list of books into the database.\n"
    		" 2) Handling queries to the database.\n"
    		" 3) Quit the program.\n");
    	
    	scanf("%d", &action);
    	while(action < 3){
    		if(action == 1){
    			printf("\nEnter the name of your file: ");
    			scanf("%s", filename);
    			fp = fopen(filename, "r");
    			
    			if(fp == NULL){ 
    				printf("\n\nSorry, that is an invalid file\n");
    				return 0;
    			}
    			
    			fscanf(fp, "%d", &entries);
    			printf("%d\n", entries);
    			while(x <= entries){
    				fscanf(fp, "%s", fname);
    				fscanf(fp, "%s", lname);
    				fscanf(fp, "%s", title);
    				while(strncmp(over, word, 13) != 0){
    					fscanf(fp, "%s", word);
    					if(strncmp(over, word, 13) != 0){
    						i=0;
    						str = strlen(word);
    						while(i < str){
    							word[i] = tolower(word[i]);
    							i++;
    						}
    						temp = create_node(word);
    						root = insert(root, temp);
    					}
    				}
    				inorder(root);
    				lladd(root, fname, lname, title);
    				x++;
    			}
    		}
    		if(action ==2){
    			printf("\nHow many words are in your query?\n");
    			scanf("%d", &words);
    			printf("\nWhich words would you like to search for?\n");
    			query = (char **)malloc(words * sizeof(char *)+1);
    			for(i=0; i < words; i++){
    				j=0;
    				query[i] = (char *)malloc(sizeof(char));
    				printf("%d). ", i+1);
    				scanf("%s", query[i]);
    				str = strlen(query[i]);
    				while(j< str){
    					query[i][j] = tolower(query[i][j]);
    					j++;
    				}
    				printf("%s\n", query[i]);  //just to test if they were read in...
    				if (find(root, query[i]) != 0)
    				printf("\n\nWORD FOUND");
    			}
    		}
    		//if (action ==3) program terminates
    		printf("Which action would you like to perfrom?\n"
    			" 1) Loading a list of books into the database.\n"
    			" 2) Handling queries to the database.\n"
    			" 3) Quit the program.\n");
    		scanf("%d", &action);
    	}
    	return 0;
    }
    
    struct treenode* insert(struct treenode *root, struct treenode *temp){
    	// element should be inserted to the right.
    	
    	// Inserting into an empty tree.
    	if (root == NULL){
    		root = (struct treenode*)malloc(sizeof(struct treenode));
    		strcpy(root->word, temp->word);
    		root->left = NULL;
    		root->right = NULL;
    		root->occurence = 1;
    		return root;
    	}
    	else{
    		if(strcmp(root->word, temp->word) == 0){
    			root->occurence++;	
    			return root;
    		}
    		// element should be inserted to the left.
    		
    		// There is a right subtree to insert the node.
    		else if(strcmp(root->word, temp->word) > 0){
    			if (root->right != NULL){
    				root->right = insert(root->right, temp);
    			}
    			
    			// Place the node directly to the right of root.
    			else{
    				root->right = (struct treenode*)malloc(sizeof(struct treenode));
    				strcpy(root->right->word, temp->word);
    				root->right->left = NULL;
    				root->right->right = NULL;
    				root->right->occurence = 1;
    				return root;
    			}
    		}
    		else if(strcmp(root->word, temp->word) < 0){
    			
    			// There is a left subtree to insert the node.
    			if (root->left != NULL){
    				root->left = insert(root->left, temp);
    			}
    			// Place the node directly to the left of root.
    			else{
    				root->left = (struct treenode*)malloc(sizeof(struct treenode));
    				strcpy(root->left->word, temp->word);
    				root->left->left = NULL;
    				root->left->right = NULL;
    				root->left->occurence = 1;
    				return root;
    			}
    		}
    	}
    }
    
    int find(struct treenode *current_ptr, char val[]) {
    	// Check if there are nodes in the tree.
    	if (current_ptr != NULL) {
    		// Found the value at the root.
    		if (strcmp(current_ptr->word, val) == 0){
    			printf("\n\n%s", current_ptr->word);
    			return current_ptr->occurence;
    		}
    		// Search to the left.
    		if (strcmp(current_ptr->word, val) > 0) {
    			return find(current_ptr->left, val);
    		}
    		// Or...search to the right.
    		if(strcmp(current_ptr->word, val) < 0) {
    			return find(current_ptr->right, val);
    		}
    	}
    	else
    		return 0;
    }
    
    int lladd(struct treenode *myroot, char fname[30], char lname[30], char title[30])
    {
    	int i=0;
    	struct ll * pNew;
    	pNew = (struct ll *) (malloc(sizeof(struct ll)));
    
    	pNew->info.fname =  fname;
    	pNew->info.lname = lname;
    	pNew->info.title = title;
    	pNew->info.node = *myroot;
    	pNew->next = NULL;
    	
    	if(List == NULL)
    		List  =  pNew;
    	else
    	{
    		pNew->next  =  List;
    		List  =  pNew ;
    	}
    	return 1;
    }
    
    struct treenode * create_node(char val[30]){
    	// Allocate space for the node, set the fields.
    	struct treenode * temp, *new;
    	temp = (struct treenode*)malloc(sizeof(struct treenode));
    	temp->word = val;
    	temp->left = NULL;
    	temp->right = NULL;
    	new = temp;
    	free(temp);
    	return new; 	// Return a pointer to the created node.
    }
    
    void inorder(struct treenode * current_ptr) {
    
      // Only traverse the node if it's not null.
      if (current_ptr != NULL) {
        inorder(current_ptr->left); // Go Left.
        printf("%s \n", current_ptr->word); // Print the root.
        inorder(current_ptr->right); // Go Right.
      }
    }

  2. #2
    Registered User
    Join Date
    Jun 2003
    Posts
    361
    One thing I noticed, you initialise x equal to 0

    Then your while loop is:
    while(x <= entries)

    You want to change this to:
    while(x < entries)

    The reason is that if there are 4 books (entries = 4):
    x = 0 : the first time through the loop, reads first book
    x = 1 : second time through, reads second
    x = 2 : reads third
    x = 3 : reads fourth (this is how many books we have)
    x = 4 : not quite sure what's being read here, but it's not a book

    I would recommend reading in each line with an fgets() call, then "tokenizing" it (seperating the line word by word, based on a token of your choice. I.e. if your token was " ", then you could take the sentence "Here are my words" and convert it into individual words). A better explanation is here:
    http://faq.cprogramming.com/cgi-bin/...&id=1044780608
    Pentium 4 - 2.0GHz, 512MB RAM
    NVIDIA GeForce4 MX 440
    WinXP
    Visual Studio .Net 2003
    DX9 October 2004 Update (R.I.P. VC++ 6.0 Compatability)

  3. #3
    Registered User
    Join Date
    Jun 2003
    Posts
    361
    Code:
    struct treenode * create_node(char val[30]){
    	// Allocate space for the node, set the fields.
    	struct treenode * temp, *new;
    	temp = (struct treenode*)malloc(sizeof(struct treenode));
    	temp->word = val;
    	temp->left = NULL;
    	temp->right = NULL;
    	new = temp;
    	free(temp);
    	return new; 	// Return a pointer to the created node.
    }
    struct treenode is the same thing as node if you want to save yourself space whilst typing, you can use that instead, but it's not the cause of this problem...

    I'm gonna talk this through line by line:

    struct treenode * temp, *new;
    Create two pointers to treenode structures, right now they're not pointing at anything. Just as a quick note, "new" is a reserved word in C++, so since this is C, I guess it's okay. But if you move to C++, that's just asking for trouble

    temp = (struct treenode*)malloc(sizeof(struct treenode));
    Set aside memory, the size of a treenode structure. temp is now pointing at that memory.

    temp->word = val;
    You must use strcpy() to set one string equal to another, C isn't a big fan of the equals sign in these scenarios.


    temp->left = NULL;
    temp->right = NULL;
    Nothing special here, just setting NULL values

    new = temp;
    new now points at the same memory that temp is pointing at.

    free(temp);
    Clean up the memory that temp is pointing at. In turn, this is also cleaning up the memory that new is pointing at.

    return new; // Return a pointer to the created node.
    We're returning a pointer to invalid memory now

    I would recommend something more along the lines of:
    Code:
    node * create_node(char val[30]){
    	// Allocate space for the node, set the fields.
    	node * temp;
    	temp = (node*)malloc(sizeof(node));
    	temp->word = val;
    	temp->left = NULL;
    	temp->right = NULL;
    	return temp; 	// Return a pointer to the created node.
    }
    Or you can combine your insert() and create_node() function:
    Code:
    void insert(node** root, char val[])
    {	
    	//Is our current node empty? If so, insert here
    	if ((*root) == NULL){
    		(*root) = (node*)malloc(sizeof(struct node));
    		strcpy((*root)->word, val);
    		(*root)->left = NULL;
    		(*root)->right = NULL;
    		(*root)->occurence = 1;
    	}
    	else
    	{
    		if(strcmp((*root)->word, val) == 0)	//Is our current node the same as our insertion?
    		{
    			(*root)->occurence++;	//Increase occurences	
    		}
    
    		else if(strcmp((*root)->word, val) > 0)	//Do we insert to the right?
    		{
    			insert(&((*root)->right), val);	//Pass the right node and re-check if it's empty
    		}
    		else if(strcmp((*root)->word, val) < 0)	//Do we insert to the left? (This is our only option left, so hopefully)
    		{	
    			insert(&((*root)->left), val);	//Pass the left node and re-check if it's empty
    		}
    	}
    }
    Which would be called from your main function with a:
    Code:
    insert(&root, word);
    This has not been tested, so there may be some issues I missed. Take a look at it, make sure you understand what's happening before using it. If you have any more questions, just ask.

    If you get it working, and the errors are still coming up, then you know where to ask

    EDIT:
    As for calling free(), yes, you do need to do it, but it's best to do it at the end of your program. Create a function Cleanup() (or something like that). Then consider how you're visiting every node when you want to print your tree, and use that method, but instead of printing, free that node and move on.
    Last edited by Epo; 04-14-2005 at 09:21 PM.
    Pentium 4 - 2.0GHz, 512MB RAM
    NVIDIA GeForce4 MX 440
    WinXP
    Visual Studio .Net 2003
    DX9 October 2004 Update (R.I.P. VC++ 6.0 Compatability)

  4. #4
    Registered User
    Join Date
    Mar 2005
    Posts
    5
    Thanks so much for your help. I understood your comments and corrected my code accordingly. I also noticed i needed to change the:
    Code:
    struct treenode {
    	char *word;
    	int occurence;
    	struct treenode *left;
    	struct treenode *right;
    };
    to:
    Code:
    struct treenode {
    	char word[30];
    	int occurence;
    	struct treenode *left;
    	struct treenode *right;
    };
    and I was able to get rid of a segmentation fault error.

    and in my rush I accidently added code in the wrong spot of my while loop, it should look like this:

    Code:
    while(x < entries){
    				fscanf(fp, "%s", fname);
    				fscanf(fp, "%s", lname);
    				fscanf(fp, "%s", title);
    				while(strncmp(over, word, 13) != 0){
    					fscanf(fp, "%s", word);
    					if(strncmp(over, word, 13) != 0){
    						i=0;
    						str = strlen(word);
    						while(i < str){
    							word[i] = tolower(word[i]);
    							i++;
    						}
    						temp = create_node(word);
    						root = insert(root, temp);
    					}
    				}
    			}
    			x++;
    			inorder(root);
    			lladd(root, fname, lname, title);
    		}
    and not:

    Code:
    while(x < entries){
    				fscanf(fp, "%s", fname);
    				fscanf(fp, "%s", lname);
    				fscanf(fp, "%s", title);
    				while(strncmp(over, word, 13) != 0){
    					fscanf(fp, "%s", word);
    					if(strncmp(over, word, 13) != 0){
    						i=0;
    						str = strlen(word);
    						while(i < str){
    							word[i] = tolower(word[i]);
    							i++;
    						}
    						temp = create_node(word);
    						root = insert(root, temp);
    					}
    				}
    				inorder(root);
    				lladd(root, fname, lname, title);
    				x++;
    			}
    		}
    Oops...
    But now it seems to get lost in the while loops, and goes on forever... Please help

    My new edited code looks like this:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <malloc.h>
    
    struct treenode {
    	char word[30];
    	int occurence;
    	struct treenode *left;
    	struct treenode *right;
    };
    
    struct summary{
    	char *lname;
    	char *fname;
    	char *title;
    	struct treenode node;
    }info;
    
    struct ll{
    	struct summary info;
    	struct ll *next;
    }*List;
    
    int lladd(struct treenode *myroot, char fname[30], char lname[30], char title[30]);
    struct treenode* insert(struct treenode *root, struct treenode *temp);
    struct treenode* create_node(char val[30]);
    void inorder(struct treenode * current_ptr);
    
    int main(void){
    	int action=0, entries=0, words=0, i=0, j=0, str = 0, x=0, m=0;
    	char over[30] = {'D','O','C','U','M','E','N','T','_','O','V','E','R'};
    	char **query, filename[30], word[30], fname[30], lname[30], title[30];
    	FILE *fp;
    	struct treenode *temp= NULL, *root = NULL;
    
    
    	printf("Which action would you like to perfrom?\n"
    		" 1) Loading a list of books into the database.\n"
    		" 2) Handling queries to the database.\n"
    		" 3) Quit the program.\n");
    	
    	scanf("%d", &action);
    	while(action < 3){
    		if(action == 1){
    			printf("\nEnter the name of your file: ");
    			scanf("%s", filename);
    			fp = fopen(filename, "r");
    			
    			if(fp == NULL){ 
    				printf("\n\nSorry, that is an invalid file\n");
    				return 0;
    			}
    			
    			fscanf(fp, "%d", &entries);
    			printf("%d\n", entries);
    			while(x < entries){
    				fscanf(fp, "%s", fname);
    				fscanf(fp, "%s", lname);
    				fscanf(fp, "%s", title);
    				while(strncmp(over, word, 13) != 0){
    					fscanf(fp, "%s", word);
    					if(strncmp(over, word, 13) != 0){
    						i=0;
    						str = strlen(word);
    						while(i < str){
    							word[i] = tolower(word[i]);
    							i++;
    						}
    						temp = create_node(word);
    						root = insert(root, temp);
    					}
    				}
    			}
    			x++;
    			inorder(root);
    			lladd(root, fname, lname, title);
    		}
    		if(action ==2){
    			printf("\nHow many words are in your query?\n");
    			scanf("%d", &words);
    			printf("\nWhich words would you like to search for?\n");
    			query = (char **)malloc(words * sizeof(char *)+1);
    			for(i=0; i < words; i++){
    				j=0;
    				query[i] = (char *)malloc(sizeof(char));
    				printf("%d). ", i+1);
    				scanf("%s", query[i]);
    				str = strlen(query[i]);
    				while(j< str){
    					query[i][j] = tolower(query[i][j]);
    					j++;
    				}
    				printf("%s\n", query[i]);  //just to test if they were read in...
    				if (find(root, query[i]) != 0)
    				printf("\n\nWORD FOUND");
    			}
    		}
    		//if (action ==3) program terminates
    		printf("Which action would you like to perfrom?\n"
    			" 1) Loading a list of books into the database.\n"
    			" 2) Handling queries to the database.\n"
    			" 3) Quit the program.\n");
    		scanf("%d", &action);
    	}
    	return 0;
    }
    
    struct treenode* insert(struct treenode *root, struct treenode *temp){
    	// element should be inserted to the right.
    	
    	// Inserting into an empty tree.
    	if (root == NULL){
    		root = (struct treenode*)malloc(sizeof(struct treenode));
    		strcpy(root->word, temp->word);
    		root->left = NULL;
    		root->right = NULL;
    		root->occurence = 1;
    		return root;
    	}
    	else{
    		if(strcmp(root->word, temp->word) == 0){
    			root->occurence++;	
    			return root;
    		}
    		// element should be inserted to the left.
    		
    		// There is a right subtree to insert the node.
    		else if(strcmp(root->word, temp->word) > 0){
    			if (root->right != NULL){
    				root->right = insert(root->right, temp);
    			}
    			
    			// Place the node directly to the right of root.
    			else{
    				root->right = (struct treenode*)malloc(sizeof(struct treenode));
    				strcpy(root->right->word, temp->word);
    				root->right->left = NULL;
    				root->right->right = NULL;
    				root->right->occurence = 1;
    				return root;
    			}
    		}
    		else if(strcmp(root->word, temp->word) < 0){
    			
    			// There is a left subtree to insert the node.
    			if (root->left != NULL){
    				root->left = insert(root->left, temp);
    			}
    			// Place the node directly to the left of root.
    			else{
    				root->left = (struct treenode*)malloc(sizeof(struct treenode));
    				strcpy(root->left->word, temp->word);
    				root->left->left = NULL;
    				root->left->right = NULL;
    				root->left->occurence = 1;
    				return root;
    			}
    		}
    	}
    }
    
    int find(struct treenode *current_ptr, char val[]) {
    	// Check if there are nodes in the tree.
    	if (current_ptr != NULL) {
    		// Found the value at the root.
    		if (strcmp(current_ptr->word, val) == 0){
    			printf("\n\n%s", current_ptr->word);
    			return current_ptr->occurence;
    		}
    		// Search to the left.
    		if (strcmp(current_ptr->word, val) > 0) {
    			return find(current_ptr->left, val);
    		}
    		// Or...search to the right.
    		if(strcmp(current_ptr->word, val) < 0) {
    			return find(current_ptr->right, val);
    		}
    	}
    	else
    		return 0;
    }
    
    int lladd(struct treenode *myroot, char fname[30], char lname[30], char title[30])
    {
    	int i=0;
    	struct ll * pNew;
    	pNew = (struct ll *) (malloc(sizeof(struct ll)));
    
    	pNew->info.fname =  fname;
    	pNew->info.lname = lname;
    	pNew->info.title = title;
    	pNew->info.node = *myroot;
    	pNew->next = NULL;
    	
    	if(List == NULL)
    		List  =  pNew;
    	else
    	{
    		pNew->next  =  List;
    		List  =  pNew ;
    	}
    	return 1;
    }
    
    struct treenode * create_node(char val[30]){
    	// Allocate space for the node, set the fields.
    	struct treenode * temp;
    	temp = (struct treenode*)malloc(sizeof(struct treenode));
    	strcpy(temp->word,val);
    	temp->left = NULL;
    	temp->right = NULL;
    	return temp; 	// Return a pointer to the created node.
    }
    
    
    void inorder(struct treenode * current_ptr) {
    
      // Only traverse the node if it's not null.
      if (current_ptr != NULL) {
        inorder(current_ptr->left); // Go Left.
        printf("%s \n", current_ptr->word); // Print the root.
        inorder(current_ptr->right); // Go Right.
      }
    }
    Thanks again for ANY and all help!
    Last edited by Gecko2099; 04-14-2005 at 11:11 PM.

  5. #5
    Registered User
    Join Date
    Jun 2003
    Posts
    361
    Okie dokes, here's what I've found so far

    Your find() function isn't prototyped. Just a small oversight, but my compiler yelled at me when I tried to run it so thought I'd mention it.

    Also, several of your functions don't have a return path (although it looks like they do). Adding a return 0, or return NULL to the end of them (namely insert(), and just get rid of the else condition in find(), I think that was all of them) will fix this.

    I edited your create_node() function to this:
    Code:
    struct treenode * create_node(char val[]){
    	// Allocate space for the node, set the fields.
    	struct treenode * temp;
    	temp = (struct treenode*)malloc(sizeof(struct treenode));
    	printf("Before");
    	printf("%s", val);
    	printf("BLAH");
    	strcpy(temp->word,val);
    	printf("After");
    	temp->left = NULL;
    	temp->right = NULL;
    	return temp; 	// Return a pointer to the created node.
    }
    All I added were the printf() calls, and it seems (by the output below) that it's the strcpy() call that's failing...
    Which action would you like to perfrom?
    1) Loading a list of books into the database.
    2) Handling queries to the database.
    3) Quit the program.
    1

    Enter the name of your file: c:\a.txt
    4
    BeforetheBLAHPress any key to continue
    It all boils down to your treenode struct:
    Code:
    struct treenode {
    	char *word;
    	int occurence;
    	struct treenode *left;
    	struct treenode *right;
    };
    I changed it to:
    Code:
    struct treenode {
    	char word[30];
    	int occurence;
    	struct treenode *left;
    	struct treenode *right;
    };
    So that word actually has a set amount of memory put aside for it.

    char* word and char word[] are two different things. Our prof tried to tell us that they're both just pointers to characters, but that's actually not the case. They are different, and in this case, strcpy needs to have that space defined. This removed any seg faults in the code I ran. The output seemed a touch random, but I don't know exactly what it's supposed to look like, so that's something you'll have to decide on.

    Hope that helps you on your way.
    Pentium 4 - 2.0GHz, 512MB RAM
    NVIDIA GeForce4 MX 440
    WinXP
    Visual Studio .Net 2003
    DX9 October 2004 Update (R.I.P. VC++ 6.0 Compatability)

  6. #6
    Registered User
    Join Date
    Jun 2003
    Posts
    361
    I believe that the code you changed to:
    Code:
    while(x < entries){
    				fscanf(fp, "%s", fname);
    				fscanf(fp, "%s", lname);
    				fscanf(fp, "%s", title);
    				while(strncmp(over, word, 13) != 0){
    					fscanf(fp, "%s", word);
    					if(strncmp(over, word, 13) != 0){
    						i=0;
    						str = strlen(word);
    						while(i < str){
    							word[i] = tolower(word[i]);
    							i++;
    						}
    						temp = create_node(word);
    						root = insert(root, temp);
    					}
    				}
    			}
    			x++;
    			inorder(root);
    			lladd(root, fname, lname, title);
    		}
    Is very close, however, now x isn't updating each time. So, you have to move the x++ up one set of brackets, i.e. :
    Code:
    			while(x < entries){
    				fscanf(fp, "%s", fname);
    				fscanf(fp, "%s", lname);
    				fscanf(fp, "%s", title);
    				while(strncmp(over, word, 13) != 0){
    					fscanf(fp, "%s", word);
    					if(strncmp(over, word, 13) != 0){
    						i=0;
    						str = strlen(word);
    						while(i < str){
    							word[i] = tolower(word[i]);
    							i++;
    						}
    						temp = create_node(word);
    						root = insert(root, temp);
    					}
    				}
    				x++;
    			}
    
    			inorder(root);
    			lladd(root, fname, lname, title);
    Pentium 4 - 2.0GHz, 512MB RAM
    NVIDIA GeForce4 MX 440
    WinXP
    Visual Studio .Net 2003
    DX9 October 2004 Update (R.I.P. VC++ 6.0 Compatability)

  7. #7
    Registered User
    Join Date
    Jun 2003
    Posts
    361
    Add this prototype:
    int find(struct treenode *current_ptr, char val[]);

    To the top of your code (with your other prototypes). Again, another small thing, but my compiler's yelling, so I will too

    Code:
    struct treenode* insert(struct treenode *root, struct treenode *temp){
    	// element should be inserted to the right.
    	
    	// Inserting into an empty tree.
    	if (root == NULL){
    		root = (struct treenode*)malloc(sizeof(struct treenode));
    		strcpy(root->word, temp->word);
    		root->left = NULL;
    		root->right = NULL;
    		root->occurence = 1;
    		return root;
    	}
    	else{
    		if(strcmp(root->word, temp->word) == 0){
    			root->occurence++;	
    			return root;
    		}
    		// element should be inserted to the left.
    		
    		// There is a right subtree to insert the node.
    		else if(strcmp(root->word, temp->word) > 0){
    			if (root->right != NULL){
    				root->right = insert(root->right, temp);
    			}
    			
    			// Place the node directly to the right of root.
    			else{
    				root->right = (struct treenode*)malloc(sizeof(struct treenode));
    				strcpy(root->right->word, temp->word);
    				root->right->left = NULL;
    				root->right->right = NULL;
    				root->right->occurence = 1;
    				return root;
    			}
    		}
    		else if(strcmp(root->word, temp->word) < 0){
    			
    			// There is a left subtree to insert the node.
    			if (root->left != NULL){
    				root->left = insert(root->left, temp);
    			}
    			// Place the node directly to the left of root.
    			else{
    				root->left = (struct treenode*)malloc(sizeof(struct treenode));
    				strcpy(root->left->word, temp->word);
    				root->left->left = NULL;
    				root->left->right = NULL;
    				root->left->occurence = 1;
    				return root;
    			}
    		}
    	}
    	return root; //ADD THIS HERE
    }
    
    int find(struct treenode *current_ptr, char val[]) {
    	// Check if there are nodes in the tree.
    	if (current_ptr != NULL) {
    		// Found the value at the root.
    		if (strcmp(current_ptr->word, val) == 0){
    			printf("\n\n%s", current_ptr->word);
    			return current_ptr->occurence;
    		}
    		// Search to the left.
    		if (strcmp(current_ptr->word, val) > 0) {
    			return find(current_ptr->left, val);
    		}
    		// Or...search to the right.
    		if(strcmp(current_ptr->word, val) < 0) {
    			return find(current_ptr->right, val);
    		}
    	}
    	else
    		return 0;
    }
    Okay, so the problem was with how you were assigning your root. Remember, in a Binary Search Tree, your first element that's read in acts as your root, and the rest of your input gets sorted based on that first element. You had too many root calls, and what would happen is, say an element was inserted to the right, under certain conditions, you would pass that new element back as the root, and lose everything above AND everything on the other half of the tree (Basically, you would be starting your tree from scratch at that new node). Imagine creating a Binary Search Tree, and then you set Root->left->left->right->left->right as the NEW root. Everything not below that new root is now lost, which is what was happening. And why only the last few words were being printed.

    All the lines that I bolded, you can erase, and add that one return towards the end of insert. Now your root isn't going to change, and you will always be returning the very top node.
    Pentium 4 - 2.0GHz, 512MB RAM
    NVIDIA GeForce4 MX 440
    WinXP
    Visual Studio .Net 2003
    DX9 October 2004 Update (R.I.P. VC++ 6.0 Compatability)

  8. #8
    Registered User
    Join Date
    Jun 2004
    Posts
    93
    This is a very interesting program Gecko, especially for someone like me who is learning binary trees at this moment as well.

    Can you post your finished code here to look at it?

    Thanks.

  9. #9
    Registered User
    Join Date
    Mar 2005
    Posts
    5
    Well the program still doesn't quite work...
    It's not printing the title of the documents for some reason, I guess when I changed the code I messed that up as well..
    and my search function is stopping...

    I'll be submitting the code tonight (finished or unfinished) so if you have any suggestions please let me know. If I don't finish it in time for submission at some point I will hopefully complete it for my own educational purposes.

    Here's where I'm at right now:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <malloc.h>
    
    struct treenode {
    	char word[30];
    	int occurence;
    	struct treenode *left;
    	struct treenode *right;
    };
    
    struct sum{
    	char lname[30];
    	char fname[30];
    	char title[30];
    	struct treenode * node;
    	struct sum * next;
    };
    
    
    struct sum* sumadd(struct treenode *myroot, struct sum *current_list); //adds the binary tree to the info for the file
    struct treenode* insert(struct treenode *root, struct treenode *temp); //inserts the new word into the tree
    struct treenode* create_node(char val[30]);//creates the node that will be inserted
    void preorder(struct treenode * current_ptr);//prints the preorder traversal of the binary tree
    int find(struct treenode *current_ptr, char val[]);//finds the word that is being searched for and returns its occurence
    void clear(struct treenode * myroot);
    int hight(struct treenode *root);//finds the height of the binary tree
    
    int main(void){
    	int action=0, entries=0, words=0, score=0, tmpscore=0, height = 0;
    	int j=0, str = 0, x=0, m=0,i=0;
    	char over[30] = {'D','O','C','U','M','E','N','T','_','O','V','E','R'};
    	char **query, filename[30], word[30], fname[30], lname[30], title[30];
    	FILE *fp;
    	struct treenode *temp= NULL, *root = NULL;
    	struct sum * current_list, * temp_list;
    	current_list = (struct sum *)malloc(sizeof(struct sum));
    
    
    	printf("Which action would you like to perfrom?\n"
    		" 1) Loading a list of books into the database.\n"
    		" 2) Handling queries to the database.\n"
    		" 3) Quit the program.\n");
    	
    	scanf("%d", &action);
    	while(action <= 3){
    		if(action == 1){
    			printf("\nEnter the name of your file: ");
    			scanf("%s", filename);
    			fp = fopen(filename, "r");
    			
    			if(fp == NULL){ 
    				printf("\n\nSorry, that is an invalid file\n");
    				return 0;
    			}
    			fscanf(fp, "%d", &entries);
    			while(x < entries) {
    			words = 0;
                    fscanf(fp,"%s%s%s", current_list->fname, current_list->lname, current_list->title);
                    do {
                            fscanf(fp, "%s", word);
                            if(strcmp(word,"DOCUMENT_OVER") !=0){
    								i=0;
    						str = strlen(word);
    						while(i < str){
    							word[i] = tolower(word[i]);
    							i++;
    						}
    						words++;
    											temp = create_node(word);
                                   	root = insert(root, temp);
    										  }
                    }
                    while(strcmp(word, "DOCUMENT_OVER") != 0);
                    current_list = sumadd(root, current_list);
    					 printf("Title is:  %s\n", current_list->title);
    					 printf("Number of words: %d\n", words);
    					 printf("Height of the tree is: %d\n", hight(current_list->node));
    					 preorder(current_list->node);
    					 printf("\n\n\n");
    					 free(root);
                    x++;
            }
    		}
    		if(action ==2){
    			printf("\nHow many words are in your query?\n");
    			scanf("%d", &words);
    			printf("\nWhich words would you like to search for?\n");
    			query = (char **)malloc(words * sizeof(char *)+1);
    			for(i=0; i < words; i++){
    				j=0;
    				query[i] = (char *)malloc(sizeof(char));
    				printf("%d). ", i+1);
    				scanf("%s", query[i]);
    				str = strlen(query[i]);
    				while(j< str){
    					query[i][j] = tolower(query[i][j]);
    					j++;
    				}
    				}
    			i=0;
    			temp_list = current_list;
    			while(i < words){
    			//while the list isn't empty
    			while(temp_list != NULL){
    			//print the required info
    				printf("Title is: %s\n", temp_list->title);
    				printf("Last Name is: %s\n", temp_list->lname);
    				printf("Fist Name is: %s\n", temp_list->fname);
    				score = find(temp_list->node, query[i]);
    				if(score == 0){
    					printf("Sorry no matching word was found.\n\n");
    				}
    				else{
    					printf("Score = %d\n\n", score);
    					}
    				i++;
    				temp_list = temp_list->next;
    				}
    		}
    		}
    		if (action ==3){//program terminates
    			printf("GOOD BYE\n\n");
    			free(root);
    			return 0;
    		}
    		printf("Which action would you like to perfrom?\n"
    			" 1) Loading a list of books into the database.\n"
    			" 2) Handling queries to the database.\n"
    			" 3) Quit the program.\n");
    		scanf("%d", &action);
    	}
    	return 0;
    }
    
    struct treenode* insert(struct treenode *root, struct treenode *temp){
    	// element should be inserted to the right.
    	
    	// Inserting into an empty tree.
    	if (root == NULL){
    		root = (struct treenode*)malloc(sizeof(struct treenode));
    		strcpy(root->word, temp->word);
    		root->left = NULL;
    		root->right = NULL;
    		root->occurence = 1;
    	}
    	else{
    		if(strcmp(root->word, temp->word) == 0){
    			root->occurence++;	
    		}
    		
    		// There is a right subtree to insert the node.
    		else if(strcmp(root->word, temp->word) > 0){
    			if (root->right != NULL){
    				root->right = insert(root->right, temp);
    			}
    			
    			// Place the node directly to the right of root.
    			else{
    				root->right = (struct treenode*)malloc(sizeof(struct treenode));
    				strcpy(root->right->word, temp->word);
    				root->right->left = NULL;
    				root->right->right = NULL;
    				root->right->occurence = 1;
    			}
    		}
    		else if(strcmp(root->word, temp->word) < 0){
    			// There is a left subtree to insert the node.
    			if (root->left != NULL){
    				root->left = insert(root->left, temp);
    			}
    			// Place the node directly to the left of root.
    			else{
    				root->left = (struct treenode*)malloc(sizeof(struct treenode));
    				strcpy(root->left->word, temp->word);
    				root->left->left = NULL;
    				root->left->right = NULL;
    				root->left->occurence = 1;
    			}
    		}
    	}
    	return root;
    }
    
    int find(struct treenode *current_ptr, char val[]) {
    	// Check if there are nodes in the tree.
    	if (current_ptr != NULL) {
    		// Found the value at the root.
    		if (strcmp(current_ptr->word, val) == 0){
    			return current_ptr->occurence;
    		}
    		// Search to the left.
    		if (strcmp(current_ptr->word, val) < 0) {
    			return find(current_ptr->left, val);
    		}
    		// Or...search to the right.
    		if(strcmp(current_ptr->word, val) > 0) {
    			return find(current_ptr->right, val);
    		}
    	}
    //	else
    		return 0;
    }
    
    struct sum* sumadd(struct treenode *myroot, struct sum * current_list)
    {
    	int i=0;
    	struct sum * pNew;
    	//allocate memory for the new document
    	pNew = (struct sum *) (malloc(sizeof(struct sum)));
    	//store the tree in the document
    	pNew->node = myroot;
    	pNew->next = NULL;
    	
    	//if its the first document
    	if(current_list == NULL)
    		current_list  =  pNew;
    	//add document to the existing list
    	else
    	{
    		pNew->next  =  current_list;
    		current_list  =  pNew ;
    	}
    	return current_list;
    }
    
    struct treenode * create_node(char val[30]){
    	// Asumocate space for the node, set the fields.
    	struct treenode * temp;
    	temp = (struct treenode*)malloc(sizeof(struct treenode));
    	strcpy(temp->word,val);
    	temp->left = NULL;
    	temp->right = NULL;
    	return temp; 	// Return a pointer to the created node.
    }
    
    
    void preorder(struct treenode * current_ptr) {
    
      // Only traverse the node if it's not nusum.
      if (current_ptr != NULL) {
      printf("%s   \t%d\n", current_ptr->word, current_ptr->occurence);// Print the root.
        preorder(current_ptr->left); // Go Left.
        preorder(current_ptr->right); // Go Right.
      }
    }
    
    int hight(struct treenode *root){
    		int right, left;
    		//if the tree is empty height is zero
            if (root == NULL) {
                    return 0;
            }
    		  //find the height of the tree by adding 1 each time
            left = 1 + hight(root->left);
            right = 1 + hight(root->right);
    		  
    		  //return the largest branch (either left of right)
            if (left > right) {
                    return left;
    
            }
            else {
                    return right;
            }
    }
    
    void clear(struct treenode * myroot){
    //free the memory for the tree
    	if (myroot != NULL) {
        clear(myroot->left); // Go Left.
        free(myroot); // free the root.
        clear(myroot->right); // Go Right.
      }
    }
    Last edited by Gecko2099; 04-17-2005 at 02:00 PM.

  10. #10
    Registered User
    Join Date
    Jun 2004
    Posts
    93
    Bah I thought I figured it out, but I ended up corrupting half of the program when I ran it.

    Keep us updated on your progress if you must submit it tonight.

    I will get off work soon, so post your final copy here and I will review it as well as others I imagine to give you any last minute pointers.

    Good luck.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 0
    Last Post: 11-04-2006, 11:07 AM
  2. BST (Binary search tree)
    By praethorian in forum C++ Programming
    Replies: 3
    Last Post: 11-13-2005, 09:11 AM
  3. Binary Search Tree
    By penance in forum C Programming
    Replies: 4
    Last Post: 08-05-2005, 05:35 PM
  4. searching and insertion in a binary search tree
    By galmca in forum C Programming
    Replies: 1
    Last Post: 03-26-2005, 05:15 PM
  5. BST/Red and Black Tree
    By ghettoman in forum C++ Programming
    Replies: 0
    Last Post: 10-24-2001, 10:45 PM