Thread: Need help reading, sorting, and printing a .txt file

  1. #1
    Registered User
    Join Date
    Oct 2012
    Posts
    12

    Need help reading, sorting, and printing a .txt file

    I've been working on this class project for at least a week and have run into a lot of issues, since I'm a novice who jumped into an advanced C++ class without prior programming knowledge. The project is to read a .txt file, sort its words in alphabetical order, and finally write them to the append.dat file.

    I was wondering if I could get some help figuring out where I'm going wrong based on my criteria:

    1. My program is printing out nothing to my append.dat file, which is pretty important for the project
    2. The program cannot have any repeats, and it must remove excess characters (like commas and periods) and read again to remove words in which those excess characters would cause repeats
    3. Hyphens must be removed if they are not in the middle of a word

    4. Of less importance, I'd like to make all words lowercase, which is what the "string loup" is trying to achieve. However, when active, I get the error "string subscript out of range"
    5. Also of less importance: if it's possible to condense this to contain fewer functions, that would be great too

    This is my current progress (sorry if the comments are excessive):

    Code:
    #include<iostream>
    #include<string>
    using namespace std;
    #include<conio.h>
    #include<fstream>
    
    
    /*string sorter(string line)
    {
        int temp;
        string words;
        temp=line.length();
        for(int l=0; l<=line.length; l++)
        {
            if(words.substr(l, l)==" ")
            {
    
    
            }
        }
        words.substr(0, 1);
        return words;
    }*/
    
    
    /*string loup(string c)
    {
        string temp="";
        for(int i=0; i<c.length()+1; i++)
        {
            temp+=tolower(c[i]);
        }
        return temp;
    }*/
    
    
    void print(string dictionary[10000])
    {
        for(int r=0; r<10000; r++)
        {
            cout<<dictionary[r]<<endl;
        }
    }
    
    
    void bubble(string dictionary[10000])
    {
        string temp;
        for(int y=0; y<10000; y++)
        
        {
            for(int g=0; g<10000; g++)
            {
                if(g==9999)
                {
                    break;
                    g=0;
                }
                else if(dictionary[g]>dictionary[g+1])
                {
                    temp=dictionary[g];
                    dictionary[g]=dictionary[g+1];
                    dictionary[g+1]=temp;
                }
                else if(dictionary[g]==dictionary[g+1])
                {
                    dictionary[g]="";
                }
            }
            if(y%1000==0)
            {
                cout<<"Current progress: "<<y<<endl;
            }
        }
        
    }
    
    
    string punctuation(string word[10000])
    {
        string temp="";
        for(int p=0; p<10000; p++)
        {
            for(int s=0; s<word[p].length(); s++)
            if(word[p][s]==tolower(word[p][s]) && (word[p][s]==toupper(word[p][s]))||(word[p][s]='\'')||(word[p][s]='-'))
            {
                    temp+=word[p];
            }
        }
        string hyphen;
        return temp;
    }
    
    
    string hyphen(string hyp)
    {
        if(hyp.length()!=0)
        {
            if(hyp[0]=='-')
            {
                hyp[0]=' ';
            }
        }
        else
        {
            return hyp;
        }
    }
    
    
    void output(string dictionary[10000])
    {
        ofstream outFile;
        outFile.open("append.dat", ios::app);
        for(int x=0; x<10000; x++)
        {
            if((dictionary[x]!="") && (dictionary[x]!=" "))
            {
                outFile<<dictionary[x]<<endl;
            }
        }
        outFile.close();
    }
    
    
    void main()
    {
        ifstream inFile;
        string dictionary[10000], word[10000];
        inFile.open("SDawg.txt");
        int a=0;
        //string test;
        while(!inFile.eof())
        {
            //getline(inFile, dictionary[a]);
            if(a<10000)
            {
                inFile>>dictionary[a];
                //dictionary[a]=loup(dictionary[a]);
                a++;
            }
            else
            {
                break;
            }
            //a++;
            //test=line[a].substr(a, 1);
            //cout<<dictionary[a];
            //a++;
            /*if(dictionary[a]==" ");
            {
                temp=a;
                dictionary[a]=line[a].substr(temp, a);
                cout<<dictionary[a]<<endl;
    
    
            }
            a++;*/
        }
        for(int m=0; m<10000; m++)
        {
            dictionary[m]=punctuation(word);
        }
        bubble(dictionary);
        output(dictionary);
        /*for(int b=0; b<(b+1); b++)
        {
            sorter(dictionary[b]);
        }*/
        //cout<<dictionary<<endl;
        //inFile>>dictionary;
        /*outFile.open("append.dat", ios::app);
        outFile.close();*/
    }
    I'm not asking for a revamp of the code, just some help finding my errors. Deadlines are coming up and I'd prefer to be exempt from the final. Thanks in advance!

  2. #2
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    In order to write to a file, open it with fopen (see "fopen - C++ Reference").

    Also replace
    Code:
    void main
    with
    Code:
    int main

  3. #3
    Registered User
    Join Date
    May 2010
    Posts
    4,633
    Is there a reason you are using arrays instead of vectors? Vectors would make this much easier. First, what happens if your file doesn't contain 10,000 strings? Most of your functions assume exactly 10,000 strings. So you should either pass the number of strings actually read along with your array, or use vectors, since a vector knows how many elements it contains.

    One of your problems is probably being caused by the following code:
    Code:
    string hyphen(string hyp)
    {
        if(hyp.length()!=0)
        {
            if(hyp[0]=='-')
            {
                hyp[0]=' ';
            }
        }
        else
        {
            return hyp;
        }
    }
    You only return a value if the length is equal to zero; when the string is not empty, the function falls off the end without returning anything. You probably don't want the else statement.
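
    One possible shape for a version that returns on every path. This is only a sketch: it assumes that hyphens at the start or end of a word should be erased rather than replaced by a space (per criterion 3 of the first post), and the name strip_edge_hyphens is hypothetical:

```cpp
#include <string>

// Removes hyphens from the front and back of a word, leaving any hyphen
// in the middle (as in "well-known") untouched.
std::string strip_edge_hyphens(std::string word)
{
    while (!word.empty() && word.front() == '-')
        word.erase(0, 1);
    while (!word.empty() && word.back() == '-')
        word.pop_back();
    return word;   // every path through the function reaches this return
}
```

    Note that front(), back() and pop_back() on std::string need a C++11 compiler.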

    4. Of less importance, I'd like to make all words lowercase, which is what the "string loup" is trying to achieve. However, when active, I get the error "string subscript out of range"
    Look closely at this code:
    Code:
    string loup(string c)
    {
        string temp="";
        for(int i=0; i<c.length()+1; i++)
        {
            temp+=tolower(c[i]);
        }
        return temp;
    }
    By adding one in your loop condition, you will try to access the string past its end: the characters occupy indexes 0 through c.length()-1.
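
    A sketch of the same loop with the bound corrected (the name to_lower and the unsigned char cast are additions for illustration, not part of the original function):

```cpp
#include <cctype>
#include <string>

// Returns a lowercase copy of text.
std::string to_lower(const std::string& text)
{
    std::string result;
    result.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i)   // last index visited is size() - 1
    {
        // The cast avoids undefined behaviour for characters with negative values.
        result += static_cast<char>(std::tolower(static_cast<unsigned char>(text[i])));
    }
    return result;
}
```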

    5. Also of less importance: if it's possible to condense this to contain fewer functions, that would be great too
    Why? The number of functions should not be an issue. But you really want to have each function do as little as possible. For example, I wouldn't remove "duplicates" in your sort routine. Just sort the elements and leave the removal for another function. By the way, just blanking the string might cause problems later, because you may end up with "empty" strings between valid strings, or a bunch of empty strings at the beginning or end of your array. This is another area that makes using vectors attractive: you can actually remove the "duplicate" from the vector.
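
    A sketch of that split, assuming the words are held in a std::vector<std::string> (both function names are hypothetical):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Step 1: sorting only.
void sort_words(std::vector<std::string>& words)
{
    std::sort(words.begin(), words.end());
}

// Step 2: duplicate removal only.
// Expects sorted input, because std::unique only collapses adjacent equal elements.
void remove_duplicates(std::vector<std::string>& words)
{
    words.erase(std::unique(words.begin(), words.end()), words.end());
}
```

    After both calls the vector really is shorter, so no blank placeholder strings are left behind to filter out when writing the file.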

    Jim
    Last edited by jimblumberg; 10-07-2012 at 11:01 AM.

  4. #4
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    Code:
    while(!inFile.eof())
    {
            //getline(inFile, dictionary[a]);
            if(a<10000)
            {
                inFile>>dictionary[a];
                //dictionary[a]=loup(dictionary[a]);
                a++;
            }
            else
            {
                break;
            }
            //a++;
            //test=line[a].substr(a, 1);
            //cout<<dictionary[a];
            //a++;
            /*if(dictionary[a]==" ");
            {
                temp=a;
                dictionary[a]=line[a].substr(temp, a);
                cout<<dictionary[a]<<endl;
     
     
            }
            a++;*/
    }
    Boy, there's a lot of this today... this must be my 3rd or 4th one so far. Anyway, avoid using an end-of-file test to control your loops. You should instead test the read operation for success/failure. I would suggest a far simpler, more compact loop similar to this:
    Code:
    while( a < 10000 && getline(inFile,dictionary[a]) )
    {
        // Maybe run your "lower" function here perhaps?
        ++a;
    }
    I would also recommend the use of a vector as has been stated by others here. The <algorithm> header has a built-in sort function that can be used on arrays or vectors of string to sort things for you automagically... if you want.
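
    If you would rather stay with a plain array for now, std::sort from <algorithm> also accepts a pair of pointers. A small sketch, assuming you keep track of how many strings were actually read (sort_filled_part and count are hypothetical names):

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Sorts only the first `count` elements, i.e. the ones that were actually read.
// Elements beyond `count` are left untouched.
void sort_filled_part(std::string words[], std::size_t count)
{
    std::sort(words, words + count);
}
```

    With the reading loop above, the call would be sort_filled_part(dictionary, a), since a ends up holding the number of lines read.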
    "Owners of dogs will have noticed that, if you provide them with food and water and shelter and affection, they will think you are god. Whereas owners of cats are compelled to realize that, if you provide them with food and water and shelter and affection, they draw the conclusion that they are gods."
    -Christopher Hitchens

  5. #5
    the hat of redundancy hat nvoigt's Avatar
    Join Date
    Aug 2001
    Location
    Hannover, Germany
    Posts
    3,130
    A very general remark: it's easier to read your code if you name your functions with a verb-object schema. Take "punctuation" or "hyphen". I have no idea what they should do, and therefore I have no idea whether your code is correct. If it were named "remove_hyphen", I would say the code is incorrect, because it won't remove all hyphens. It won't even remove the first hyphen if it's in the middle of the word. Maybe the best name would be "remove_hyphen_at_front". Or maybe not; only you can know, because only you know what it should do.
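
    To illustrate the naming point, here is a sketch of what a function called remove_punctuation might look like. The behaviour is a guess at the intent of the original "punctuation" function (keep letters, apostrophes and hyphens, drop everything else), so treat it as an assumption rather than the required specification:

```cpp
#include <cctype>
#include <string>

// Keeps letters, apostrophes and hyphens; drops everything else
// (commas, periods, digits, quotation marks, ...).
std::string remove_punctuation(const std::string& word)
{
    std::string cleaned;
    for (std::string::size_type i = 0; i < word.size(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(word[i]);
        if (std::isalpha(c) || c == '\'' || c == '-')
            cleaned += word[i];
    }
    return cleaned;
}
```

    With a name like this, a reader can check the body against the promise made by the name.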
    hth
    -nv

    She was so Blonde, she spent 20 minutes looking at the orange juice can because it said "Concentrate."

    When in doubt, read the FAQ.
    Then ask a smart question.

  6. #6
    Registered User
    Join Date
    Oct 2012
    Posts
    12
    Okay... I tried to work on this some more. I still don't understand the idea of vectors, so I didn't include any. Using this updated code, I found that 1.) the code gives repeated error values, and 2.) I still can't export to the append.dat file.

    I'm also including the .txt file:
    SDawg.txt

    Code:
    #include<iostream>
    #include<string>
    #include<conio.h>
    #include<fstream>
    #include<algorithm>
    using namespace std;
    
    
    
    void bubble(string dictionary[10000])
    {
        string temp;
        for(int y=0; y<10000; y++)
        {
            for(int g=0; g<10000; g++)
            {
                if(g==9999)
                {
                    break;
                    g=0;
                }
                else if(dictionary[g]>dictionary[g+1])
                {
                    temp=dictionary[g];
                    dictionary[g]=dictionary[g+1];
                    dictionary[g+1]=temp;
                }
                else if(dictionary[g]==dictionary[g+1])
                {
                    dictionary[g]="";
                }
            }
            if(y%1000==0)
            {
                cout<<"Current progress: "<<y<<endl;
            }
        }
    }
    
    
    void output(string dictionary[10000])
    {
        ofstream outFile;
        outFile.open("append.dat", ios::app);
        int x=0;
        if((dictionary[x]!="") && (dictionary[x]!=" "))
        {
            outFile<<dictionary[x]<<endl;
        }
        outFile.close();
    }
    
    
    int main()
    {
        int a=0;
        ifstream inFile;
        string dictionary[10000], word[10000];
        inFile.open("SDawg.txt");
        while(a<10000 && getline(inFile, dictionary[a]))
        {
            cout<<dictionary<<endl;
            sort(dictionary[a].begin(), dictionary[a].end());
            transform(dictionary[a].begin(), dictionary[a].end(), dictionary[a].begin(), tolower);
            ++a;
        }
        bubble(dictionary);
        output(dictionary);
    }

  7. #7
    Registered User
    Join Date
    May 2010
    Posts
    4,633
    Looking at your file it looks like you should be using the extraction operator and not getline(). The extraction operator will extract words, getline() will extract whole lines.

    You could use a std::map instead of the vector; a map keeps its entries sorted by key automatically and doesn't allow duplicate keys.

    Jim

  8. #8
    Registered User
    Join Date
    Oct 2012
    Posts
    12
    Originally posted by jimblumberg:
    Looking at your file it looks like you should be using the extraction operator and not getline(). The extraction operator will extract words, getline() will extract whole lines.
    What would I place to retrieve the words? The code I'm writing either does not match arguments or shows no progress in Win32.

  9. #9
    Registered User
    Join Date
    May 2010
    Posts
    4,633
    What would I place to retrieve the words?
    You would use a std::string to retrieve the words.

    The code I'm writing either does not match arguments or shows no progress in Win32.
    Show the code; then maybe someone can point you in the right direction.

    Jim

  10. #10
    Registered User
    Join Date
    Oct 2012
    Posts
    12
    Code:
    #include<iostream>
    #include<string>
    using namespace std;
    #include<conio.h>
    #include<fstream>
    #include<algorithm>
    
    ...
    
    int main(){
        int a=0;
        ifstream inFile;
        string dictionary[10000], word[10000];
        inFile.open("SDawg.txt");
        while(a<10000 && cin>>dictionary[a]) //one of my test changes
        {
            cout<<dictionary<<endl;
            sort(dictionary[a].begin(), dictionary[a].end());
            transform(dictionary[a].begin(), dictionary[a].end(), dictionary[a].begin(), tolower);
            ++a;
        }
        bubble(dictionary);
        output(dictionary);
    }

  11. #11
    Registered User
    Join Date
    May 2010
    Posts
    4,633
    The code I'm writing either does not match arguments or shows no progress in Win32.
    What are your error messages?

    Part of your problem is caused by the position of your using namespace statement; it should come after all your include files. Another problem is caused by the using namespace std statement itself. The tolower() function that you want to use is the one in the global namespace, not the tolower() in the std namespace, which is overloaded, so the compiler may not be able to tell which version to pass to transform. For your transform you will need to scope it like:
    Code:
            transform(dictionary[a].begin(), dictionary[a].end(), dictionary[a].begin(), ::tolower);
    Notice the scope operator (::) in front of tolower.

    This name clash is one of the reasons why using namespace std should be avoided whenever possible.
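
    Another way to sidestep the clash, sketched here as an alternative rather than a required fix, is to hand transform a small wrapper with exactly one signature (lower_char and make_lowercase are hypothetical names):

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// A wrapper with a single, unambiguous signature for std::transform.
char lower_char(char c)
{
    // The cast avoids undefined behaviour for characters with negative values.
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Converts text to lowercase in place.
void make_lowercase(std::string& text)
{
    std::transform(text.begin(), text.end(), text.begin(), lower_char);
}
```

    Because lower_char is not overloaded, this compiles the same way with or without a using namespace std statement.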

    Jim
    Last edited by jimblumberg; 10-14-2012 at 10:05 PM.

  12. #12
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Perhaps if I quote this piece of code you will realise the absurdity of it:
    Originally posted by hss1194:
    Code:
            for(int g=0; g<10000; g++)
            {
                if(g==9999)
                {
                    break;
                    g=0;
                }
    First of all, the g=0; line can never be reached. Once the break is reached the loop exits.

    Secondly, once g equals 9999 the if-statement simply skips what would have been the final iteration of the loop. It would have been a whole lot simpler to just change the loop condition to g < 9999 and thus end the loop one iteration earlier.
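
    A sketch of the loop with the bound moved into the condition. It also takes the number of filled elements as a parameter, which is an assumption on top of the quoted code (bubble_sort and count are hypothetical names):

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Plain bubble sort over the first `count` elements of the array.
void bubble_sort(std::string words[], std::size_t count)
{
    for (std::size_t pass = 0; pass + 1 < count; ++pass)
    {
        // Stop one short of the end so that words[g + 1] is always a valid element.
        for (std::size_t g = 0; g + 1 < count; ++g)
        {
            if (words[g] > words[g + 1])
                std::swap(words[g], words[g + 1]);
        }
    }
}
```

    No special case inside the loop body is needed any more, because the condition g + 1 < count already ends the loop one iteration earlier.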
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  13. #13
    Registered User
    Join Date
    Oct 2012
    Posts
    12
    Alright, this is working code after I got some extra help from my classmate:

    Code:
    #include<iostream>
    #include<string>
    #include<conio.h>
    #include<fstream>
    #include<algorithm>
    using namespace std;
    
    void bubble(string *dictionary)
    {
        string temp;
        for(int y=0; y<9999; y++)
        {
            for(int g=0; g<9998; g++)
            {
                if(dictionary[g]>dictionary[g+1])
                {
                    temp=dictionary[g];
                    dictionary[g]=dictionary[g+1];
                    dictionary[g+1]=temp;
                }
                if(dictionary[g]==dictionary[g+1])
                {
                    dictionary[g]="";
                }
                
            }
            if(y%100==0)
            {
                cout<<"Current progress: "<<y<<endl;
                cout<<dictionary[y]<<endl;
            }
            
        }
    }
    
    void output(string *dictionary)
    {
        ofstream outFile;
        outFile.open("append.dat");//, ios::app);
        int x=0;
    
    
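        while(x<9999)
        {
            // skip the blank entries left behind by the duplicate removal
            if((dictionary[x]!="") && (dictionary[x]!=" "))
                outFile<<dictionary[x]<<endl;
            x++; // always advance, whether or not the word was written

            if(x%100==0)
                cout<<x<<endl;
        }
        outFile.close();
        cout<<"done"<<endl;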
    }
    
    int main()
    {
        int a=0;
        ifstream inFile;
        string dictionary[10000], word[10000];
        string input, temp;
        inFile.open("SDawg.txt");
        while(a<9999 && getline(inFile,input)) // stop once the file has no more lines
        {
            temp="";
            for(int i=0; i<input.length(); i++)
            {
            input[i]=tolower(input[i]);
              if( (input[i]!=' ') && (input[i]>='a') && (input[i]<='z') )
                    temp+=input[i];
              if(input[i]==' ')
              {
                  if(a<9999)
                  {
                  dictionary[a]=temp;
                       temp="";
                  cout<<dictionary[a]<<endl;
                  a++;
                  }
              }
            }
        }
        bubble(dictionary);
        output(dictionary);
    }
    I do have a project that spins off this one. The objective is to pick out the words from append.dat that are pseudo-palindromes (that is, words for which moving the first letter to the end gives the original word spelled backwards; e.g., banana becomes ananab, which is banana backwards).

    My thought is to remove the first character from each line of text and then run a palindrome check on the rest (see the working copy below), but I need to create a function that removes the first character of each line before I run the palindrome check (and I shouldn't need to run this through bubble again). What would be the best way to go about this?

    Palindrome: (I would not run this as an if/else, just remove the equal algorithm and replace it in the main)

    Code:
    #include<iostream>
    #include<conio.h>
    #include<string>
    #include<algorithm>
    using namespace std;
    
    
    void pal(const std::string& str)
    {
        if(std::equal(str.begin(), str.begin() + str.size()/2, str.rbegin()))
        {
            cout<<"This text is a palindrome"<<endl;
        }
        else
        {
            cout<<"This text is not a palindrome"<<endl;
        }
    }
    
    
    int main()
    {
        int i;
        string str;
        cout<<"Enter a word/words"<<endl;
        getline(cin, str);
        transform(str.begin(), str.end(), str.begin(), tolower);
        for(i=0; i<str.length();)
        {
            if(str[i]==' '||str[i]==','||str[i]=='\''||str[i]=='?'||str[i]=='!')
            {
                str.erase(i,1);
            }
            else
            {
                i++;
            }
        }
        pal(str);
        getch();
    }
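
    For the pseudo-palindrome question above, one possible approach is sketched here. It assumes the definition given in this post (a word qualifies if what remains after dropping its first letter is a palindrome) and reuses the std::equal check from pal; both function names are hypothetical:

```cpp
#include <algorithm>
#include <string>

// True if text reads the same forwards and backwards.
bool is_palindrome(const std::string& text)
{
    return std::equal(text.begin(), text.begin() + text.size() / 2, text.rbegin());
}

// Assumed definition: a word is a pseudo-palindrome if the part after its
// first letter is a palindrome ("banana" -> "anana").
bool is_pseudo_palindrome(const std::string& word)
{
    if (word.empty())
        return false;                      // nothing to drop from an empty line
    return is_palindrome(word.substr(1));  // substr(1) copies everything after the first letter
}
```

    Since substr returns a copy, the words read from append.dat stay unchanged and no second sort is needed.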
