Thread: Finding anagrams in a word list

  1. #1
    Registered User
    Join Date
    Oct 2010
    Posts
    135

    Finding anagrams in a word list

    I have a word list and a file containing a number of anagrams. Each anagram corresponds to a word in the word list. I need to develop an algorithm that finds the matching words and writes them to an output file. The code I have so far only works for the first two words. In addition, I can't get it to work with strings that contain digits. Please tell me how I can fix the code.

    Code:
    #include <iostream>
    #include <fstream>
    #include <string>
    using namespace std;
    
    int main (void)
    {
        int x = 0, y = 0;
        int a = 0, b = 0;
        int emptyx, emptyy;
        int match = 0;
        ifstream f1, f2;
        ofstream f3;
        string line, line1[1500], line2[50];
        size_t found;
    
        f1.open ("wordlist.txt", ios::in);
        f2.open ("file.txt", ios::in);
        f3.open ("output.txt", ios::out);
    
        //stores content into string arrays
        if (f1.is_open() && f2.is_open())
        {
            while (f1.eof() == 0)
            {
                getline (f1, line);
                line1[x] = line;
                x++;
            }
    
            while (f2.eof() == 0)
            {
                getline (f2, line);
                line2[y] = line;
                y++;
            }
    
            //finds position of last elements
            emptyx = x-1;
            emptyy = y-1;
    
            //matching algorithm
            for (y = 0; y <= emptyy; y++)
            {
                for (x = 0; x <= emptyx; x++)
                {
                    if (line2[y].length() == line1[x].length())
                    {
                        for (a = 0; a < line1[x].length(); a++)
                        {
                            found = line2[y].find(line1[x][a]);
                            if (found != string::npos)
                            {
                                match++;
                                line2[y].replace(found, 1, 1, '.');
    
                                if (match == line1[x].length())
                                {
                                    f3 << line1[x] << ", ";
                                    match = 0;
                                }
                            }
                        }
                    }
                }
            }
    
            f1.close();
            f2.close();
            f3.close();
        }
    
        //file access error
        else
            cout << "Input error.";
    
        return 0;
    }
    EDIT: Oops, I had left x and y uninitialized, which caused the code to crash.
    Last edited by 843; 06-21-2011 at 02:40 AM.

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > line2[y].replace(found, 1, 1, '.');
    The problem with this is that if line2[y] does not end up being a match, you've trashed it for matching against anything later on in the list.

    Also, see the FAQ on why using feof() to control a loop is bad. The same applies to the eof() tests in your code.

    while ( getline (f2, line) )
    should be used.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Oct 2010
    Posts
    135
    Thanks! I implemented a different method that creates a set of new lists with the letters of each word, and the words themselves, sorted alphabetically. It works.

    However, I still can't figure out how to store strings that contain digits, so I decided to cheat my way out by removing them from the list. What is a good solution for this?

  4. #4
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    You would have to post your code, and attach some short example data files which show the problem.
    There is nothing special about letters or digits in your original code (at least).

  5. #5
    Registered User
    Join Date
    Oct 2010
    Posts
    135
    You're right! I tried again and it works as it should with the new algorithm. It didn't work with the code above, which is why I had removed them. Thanks for the help!
