Thread: Storing Words from a text file

  1. #1
    Registered User
    Join Date
    Oct 2008
    Posts
    25

    Storing Words from a text file

    Hey, I have to write a program that reads a text file containing a list of words, one word per line. I have to store the unique words and count the occurrences of each one. Once the file has been read completely, I have to print the words and their counts to a text file, in alphabetical order, and then print some statistics to the file.


    I have to use character arrays instead of strings. I must use a linear search to determine whether a word is in the array. The array is an array of structures and the key is a char array, so a string comparison must be used.
    The search must be a separate function that returns an integer value. I can't use a for loop, and the function must have only one return statement.

    The linear search looks something like this:
    Code:
    int search (int list [], int size, int key)
    {
        int pos = 0;
        while (pos < size && list[pos] != key)
            pos++;
        if (pos == size)
            pos = -1;
        return pos;
    }
    But before I can search I need to store the words, which is what I'm having problems with. Here's my store function:

    Code:
    void displayFile (char fileName[], words array[] )
    {
    	int i = 0;
        ifstream inFile;
    	
        char line [101];
    	
    	inFile.open(fileName);
    
     while (inFile.getline(line,101))
     {   
    	cout << line << endl;
        array[i].word = line;
        i++;
     }   
    	inFile.close(); 
    
    }
    Here array is an array of structs, each made up of a character array "word" and an integer "count". Any help on storing the words and the number of words in the array would be helpful.

  2. #2
    Registered User
    Join Date
    Mar 2007
    Posts
    416
    Unless I misunderstood something, it looks like you have the storing of the words done already and just need the number of repetitions of each word, right? (I hope?) Each time you read in a line, do a linear search on the array of words to see if it already exists.

    Code:
    void displayFile (char fileName[], words array[] )
    {
    	int i = 0;
        ifstream inFile;
    	
        char line [101];
    	
    	inFile.open(fileName);
    
     while (inFile.getline(line,101))
     {   
    	cout << line << endl;
    //shove the search function in here
        array[i].word = line;
        i++;
     }   
    	inFile.close(); 
    
    }

  3. #3
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    Man, all those restrictions... you're really taking the fun out of this, otherwise the whole program would be just a few lines of code using a std::map<std::string,int> container.
    "Owners of dogs will have noticed that, if you provide them with food and water and shelter and affection, they will think you are god. Whereas owners of cats are compelled to realize that, if you provide them with food and water and shelter and affection, they draw the conclusion that they are gods."
    -Christopher Hitchens

  4. #4
    Registered User
    Join Date
    Oct 2008
    Posts
    25
    Sorry about the first post, I wasn't clear with my question. I'm still having trouble with the storing part.

    I want to store a text file line by line, where there's one word per line. I tried to store each line in a character array with

    array[i].word = line;

    but I get the compiler error "'=': left operand must be l-value", so I tried

    line = array[i].word;

    but that didn't work either.

    Any help would be appreciated.

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    If you have to use character arrays, then you need to use strcpy to copy a string from one place to another.
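
    A minimal sketch of that fix, assuming a struct like the one described in the first post (a char array "word" plus an int "count"). The size 21 and the helper name store_word are assumptions for illustration. Arrays cannot be assigned with =, so the characters are copied instead; strncpy with explicit termination is used here so an over-long line cannot overflow the buffer.

```cpp
#include <cstring>

const int WORD_LEN = 21;   // assumed: 20 characters plus the terminating '\0'

struct words
{
    char word[WORD_LEN];
    int  count;
};

// Hypothetical helper: copies `line` into entry.word instead of assigning it.
// strncpy does not write a terminator when the source fills the buffer,
// so the last byte is set explicitly.
void store_word(words& entry, const char line[])
{
    std::strncpy(entry.word, line, WORD_LEN - 1);
    entry.word[WORD_LEN - 1] = '\0';
    entry.count = 1;
}
```

    Inside the read loop this would replace the failing assignment, for example store_word(array[i], line). Plain strcpy also works as long as every line is known to fit in the destination array.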

  6. #6
    Registered User
    Join Date
    Oct 2008
    Posts
    25
    Thank you, that worked. I'm now trying to use strcmp to find the number of unique words (or lines) in the text file. I tried this:

    Code:
    for( j=0; j<101; j++)
     {	
    	n = strcmp( array[j].word, array[j+1]);
    	
    	if(n != 0);
    	uni++;
     }
    
    cout << uni<< endl;
    }
    However, that didn't work. I also tried copying the array into another character array like this:

    for (k=0; k<101; k++)
    strncpy(character[k], array[k].word, 20);

    and then comparing the two arrays, but that also gave me a syntax error. My instructor gave me these instructions:

    You must use the linear search algorithm to determine if a word is in the array. Remember that the array is an array of structures and that the key is a string (char array) so the string comparison must be used. The search task should be a separate function.
    The search must be a separate function that returns an integer value. Do not use a for loop and the function must have only one return statement.

    He said not to use a for loop, but I didn't really know another way to go about it. And my for loop doesn't even work.

  7. #7
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Well, there are other loops beside for loops. Which is good, because you don't want to hard code 101 in there, I don't think.

    Anyway, I don't see what you were trying to do with strncpy, so I can't say anything about that. Your strcmp needs to compare the word you're searching for (which isn't necessarily a word on the list, I think?) with the words that are in the array.
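
    As a rough sketch of what that could look like, here is the integer search from the first post adapted to an array of structs, with a while loop and a single return. The struct layout and the size 21 are assumptions based on the description in this thread.

```cpp
#include <cstring>

const int WORD_LEN = 21;   // assumed: 20 characters plus the terminating '\0'

struct words
{
    char word[WORD_LEN];
    int  count;
};

// Linear search: returns the index of `key` in list[0..size-1],
// or -1 if the word is not in the array. Uses a while loop and
// exactly one return statement.
int search(const words list[], int size, const char key[])
{
    int pos = 0;
    while (pos < size && std::strcmp(list[pos].word, key) != 0)
        pos++;
    if (pos >= size)   // ran off the end without finding a match
        pos = -1;
    return pos;
}
```

    The caller passes the number of entries actually stored rather than a hard-coded 101, so an empty array simply yields -1.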

  8. #8
    Registered User
    Join Date
    Oct 2008
    Posts
    25
    I need to compare words in a text file. In the text file there is only one word per line. I need to store the words, count the unique words, and compute some other stats. I just realized that my strcmp will only compare neighbours: 0-1, 1-2, 2-3, etc., never 0-3. Also, there is a maximum of 100 lines and a maximum word length of 20. So I have to use linear search. I'll try that right now; I just wanted to clear some stuff up for you. Thanks for helping.

  9. #9
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    You would be better off not ever writing a duplicate word to your array. You can do this by reading a line and then searching your array of words for the just-read word. If found, bump the word count. Else, add it to the array and set the count to 1.
    Mainframe assembler programmer by trade. C coder when I can.

  10. #10
    Registered User
    Join Date
    Oct 2008
    Posts
    25
    You would be better off not ever writing a duplicate word to your array. You can do this by reading a line and then searching your array of words for the just-read word. If found, bump the word count. Else, add it to the array and set the count to 1.

    I have a couple of questions about this. Before I compare I need to store the words in an array, so if I compare "line" to "array[i].word" they would always be equal. So I used strcmp(line, array[i+1].word). Here's my code:

    Code:
     while (inFile.getline(line,101))
     {   
    	
       
       strncpy(array[i].word, line, 20);
    	
       n = (strcmp(line, array[i+1].word);
    
       if( n == 0)
    	count++
    
    else {}
    	
    	
    	i++;
        
     }
    I don't know what to put in the else statement. Should I make a new character array to hold only the unique words? Also, with this method, will array[i].word ever be in the unique array?

  11. #11
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Basically, you would do this:

    Code:
    int word_search( char * word, int size ) { 
        int i = size ;
        while (i) { 
            if (strcmp(word, words[i-1].word) == 0) { 
                words[i-1].count++ ; 
                return size ;   // existing word count bumped, return size of array
            }
            i-- ; 
        }
        // Since we got here, the word does not exist. Add it 
        strcpy(words[size].word, word) ; 
        words[size].count = 1 ; 
        return size+1 ;   // new size of array  
    } 
    
    while (inFile.getline(line,101)) { 
    
        // Search "words" array for a match and add if new, else 
        // bump word count if found 
        size = word_search(line, size) ;  
    }
    NOT SYNTAX CHECKED!! - but that's the concept.
    Last edited by Dino; 12-04-2008 at 05:48 PM. Reason: fixed while loop - thanks matsp

  12. #12
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Surely i needs to change in the while-loop?

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  13. #13
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Fixed it. Thanks.

  14. #14
    Registered User
    Join Date
    Oct 2008
    Posts
    25
    Ok, here's my revised code:

    Code:
    #include <iostream>
    #include <fstream>
    #include <cstdlib>
    #include <string>
    #include <iomanip>
    using namespace std;
    
        int const wordLength = 21;
    	int const Num = 100;
    	int const fileSize = 255;
    
    	struct words
    	{
    		char word[wordLength];
    		int count;	
    	};
    
    
    	
    int storeFile( char [], words []);
    void wordSearchSetup(char[], int, words[]);
    int wordSearch(char[], int, words[]);
    
    void main ()
    {
        int count;
    	char fileName[fileSize];
    	
    	cout << "Please enter the name of the file you wish to open: "<< endl;
    	cin.getline(fileName,fileSize);
    	
    	words array[Num];
    	
    	count = storeFile (fileName, array);
    		
    	
    	cin.ignore();
    }
    
    int storeFile (char fileName[], words array[] )
    {	
    	int count = 0;
    	int i = 0;
        ifstream inFile;
        char line [Num];
    	
    	inFile.open(fileName);
    
    	while (inFile.getline(line,Num))
    	{   
         strncpy(array[i].word, line, wordLength);
    	 count++;
    	 i++;  
    	}   
    	
    	inFile.close(); 
    
    	wordSearchSetup( fileName, count, array);
       return count; 
    }
    
    void wordSearchSetup(char fileName[], int count, words array[])
    {
    	char line[Num];
    	int i = 0;
    	int size;
    	ifstream inFile;
    	inFile.open(fileName);
    	
    	while (inFile.getline(line,Num))
    	 size = wordSearch(line, count, array);
       
    	cout << size << endl;
    
    	
    }
    
    
    int wordSearch( char line[], int count, words array[])
    {
    	int i = count;
    	
    	while (i)
    	{
    		if (strcmp(line, array[i-1].word) == 0) 
    		{ 
    			array[i-1].count++; 
    			return  count;   
            }
               i-- ; 
    	}
        
    	strcpy(array[count].word, line) ; 
    	array[count].count = 1 ; 
    	return count+1;   
    }
    My search functions still aren't giving me the results I want; it just returns the number of words, not the number of unique words.

  15. #15
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Yuk. You're making two passes through the file when one is enough.

    Seeing as you still have work to do, I'll help you over this hump, and you can finish it by doing the proper reporting of stats at the end.
    Code:
    #include <iostream>
    #include <fstream>
    #include <cstdlib>
    #include <cstring>   // strcmp, strcpy
    #include <string>
    #include <iomanip>
    using namespace std;
    
    int const wordLength = 21;
    int const Num = 100;
    int const fileSize = 255;
    
    struct words
    {
    	char word[wordLength];
    	int count;	
    };
    
    
    	
    int storeFile( char [], words []);
    //void wordSearchSetup(char[], int, words[]);
    int wordSearch(char word[], int array_size, words * array);
    
    int main ()
    {
    	int count;
    	char fileName[fileSize];
    	count = 0 ; 
    	cout << "Please enter the name of the file you wish to open: "<< endl;
    	cin.getline(fileName,fileSize);
    	if (!cin.good() ) { 
    		cout << "Error reading cin..." << endl ; 
    		return -1 ; 
    	} 
    	
    	words array[Num];
    	
    	count = storeFile(fileName, array);
    	cin.ignore();
    }
    
    int storeFile (char fileName[], words array[] )
    {	
    	//int count = 0;
    	int i = 0;
    	ifstream inFile;
    	char line [Num];
    	int array_size = 0 ;  // Count of unique words in array 
    	inFile.open(fileName);
    
    	while (inFile.getline(line,Num))
    	{   
    		array_size = wordSearch(line, array_size, array);
    		//strncpy(array[i].word, line, wordLength);
    		//count++;
    		i++;   // Words read 
    	}   
    	
    	inFile.close(); 
    
    	//wordSearchSetup( fileName, count, array);
    	return array_size;   // Number of unique words stored
    }
    
    /* 
    void wordSearchSetup(char fileName[], int count, words array[])
    {
    	char line[Num];
    	//int i = 0;
    	int size;
    	ifstream inFile;
    	inFile.open(fileName);
    	
    	while (inFile.getline(line,Num))
    	 size = wordSearch(line, count, array);
       
    	cout << size << endl;	
    }
    */ 
    
    int wordSearch( char line[], int array_size, words * array)
    {
    	int i = array_size ;  // Start search at the end of the array
    	
    	while (i && array_size > 0)  // Skip the loop if there are no elements in the array 
    	{
    		if (strcmp(line, array[i-1].word) == 0)  // do the words match? 
    		{ 
    			array[i-1].count++;   // Yes, bump the count and return
    			return  array_size ;  // Array size did not change 
    		}
    		i-- ;  // index backwards into array 
    	}
        // If we got here, we found a unique word.  Add it at the end and return
    	strcpy(array[array_size].word, line) ; 
    	array[array_size].count = 1 ; 
    	return array_size+1;   
    }
    Last edited by Dino; 12-05-2008 at 12:16 AM. Reason: added comments to new or changed code
