
From a failed contract: simple C++ program.

  1. #1
    Registered User
    Join Date
    Jul 2010
    Location
    Oklahoma
    Posts
    107

    From a failed contract: simple C++ program.

    Hello again everyone, and thank you for your time. I was examining prospects on ELance the other day, and I found one that looked like a good chance to practice some C++ with the standard template library...

    The request for proposal provided the problem statement rather directly:

    I have two input files:

    * file 1 containing suffixes, description 1 of suffixes, description 2 of suffixes, ranking in float or decimal format (comma delimited)
    * file 2 containing words and rankings (char and int, possibility that the rankings would be all the same in which case we should sort by alphabetical order on the word)

    I need to produce an output file containing a list that shows all words that had a matching suffix in the suffix file, sorted by the ranking in file 2.

    The format of the output file (comma delimited) should be:
    word from file 2, suffix from file 1, ranking from file 2, description 1 from file 1, description 2 from file 1, ranking from file 1

    This is a simple C++ program, I expect an experienced programmer could write it in under an hour, I simply don't have the time to play with string arrays vs. character arrays, and so on.

    Sample file 1:

    ed, latin suffix, roman suffix, 12.46
    ing, german suffix, greek suffix, 4.45
    tion, french suffix, german suffix, 4.45

    Sample file 2:
    Declared, 3
    Bastion, 4
    Tiring, 4

    Output should look like:
    Declared, ed, 3, latin suffix, roman suffix, 12.46
    Bastion, tion, 4, french suffix, german suffix, 4.45
    Tiring, ing, 4, german suffix, greek suffix, 4.45

    The program should not be case sensitive at all. Sort order for the output file should be alphanumeric on field 3, field 6, field 1, field 2, field 4, field 5.

    This is a simple program but I haven't played with C++ in a while. The program should come fully commented and take 3 args, file 1 name, file 2 name, file 3 name. Keep in mind that I'm running on Windows so I should be able to give a full file name including backslashes (so you'll probably have to escape them). File 1 is around 500 lines, file 2 is around 500,000 lines so you need to make this perform decently. I can't imagine this program taking more than 10 minutes to run.

    I strongly suspect it was homework, because the request for proposal was closed almost immediately. I did write a solution using GNU's C++ compiler though, and I was interested in someone's insight. I used the STL reference at cplusplus.com and found what I needed to finish it up, although I don't have access to the scale of test material the request for proposal stipulated.

    When I drew the ERD between the suffixes and the word occurrences, I noticed that there would be a one-to-many relationship. To avoid keeping multiple copies of the suffixes in a list that will eventually be sorted, I checked the STL for a mapping-type structure. I didn't find one that satisfied me, but I'm up for more reading at cplusplus.com if you have any suggestions. I would really like to discuss this one with any of you if you have the time or inclination.

    So, keeping the copies lying around, this is the structure that I settled on for an 'etymology' of a word with its suffix...

    Code:
    class Etymology
    {
       private:
          Word w;
          Suffix s;
    
       public:
          Etymology(fstream &in, vector<Suffix> v);
    
          bool empty() const { return w.getWord().empty(); }
          bool operator<(const Etymology &z) const;
    
          friend fstream &operator<< (fstream &out, const Etymology &e);
    };
    During construction, I read in the next word, then do a linear search for its suffix, and finally make a copy. It is a little slow, but it lends itself to parallelism, at least until I get a chance to test its run times:

    Code:
    Etymology::Etymology(fstream &in, vector<Suffix> v)
    {
       bool err;
       unsigned int i;
    
       vector<Suffix>::size_type sz = v.size();
    
       in >> w;
    
       if(w.getWord().empty())
       {
          return;
       }
    
       for(i = 0, err = true; i < sz; i++)
       {
          if(w == v[i])
          {
             s = v[i];
             err = false;
             break;
          }
       }
    
       if(err)
       {
          cerr << "No matching suffix for: " << w.getWord() << endl;
       }
    
       return;
    }
    This is the routine where the etymologies are extracted/organized in the summary of word statistics:

    Code:
    void Summary::matchWords()
    {
       fstream fin(words.c_str(), ios::in);
    
       if( ! fin )
       {
          cerr << "Unable to open " << words.c_str() << " for reading words." << endl;
          cerr << "Exiting...." << endl;
    
          exit(1);
       }
    
       do
       {
          Etymology e(fin, suffixes);
    
          if( ! e.empty())
          {
             observation.push_back(e);
          }
       }
       while(fin.good());
    
       fin.close();
       suffixes.clear();
       observation.sort();
    
       return;
    }
    Is there some way to do that using the standard template library without leaving a large number of copies? I thought about different reference types, but without something like making the ordering operator static nothing clicked for me.
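One possible way to avoid the copies, as a rough sketch only: keep the suffix table alive and let each match hold a pointer into it. The Suffix and Match structs and the matchLess and sortMatches names below are hypothetical stand-ins (the thread's Word and Suffix classes are not shown), the comparisons are case-sensitive for brevity, and the ordering follows only the first four fields of the requested sort (field 3, field 6, field 1, field 2).

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical simplified suffix record.
struct Suffix
{
   std::string text;
   double ranking;
};

// Hypothetical match record: it refers to a suffix instead of copying it.
struct Match
{
   std::string word;
   int ranking;
   const Suffix *suffix;   // points into the suffix table, which must outlive the match
};

// Orders by word ranking, then suffix ranking, then word, then suffix text.
inline bool matchLess(const Match &a, const Match &b)
{
   if (a.ranking != b.ranking)
   {
      return a.ranking < b.ranking;
   }
   if (a.suffix->ranking != b.suffix->ranking)
   {
      return a.suffix->ranking < b.suffix->ranking;
   }
   if (a.word != b.word)
   {
      return a.word < b.word;
   }
   return a.suffix->text < b.suffix->text;
}

// Sorts the matches in place; only pointers and words are moved around.
inline void sortMatches(std::vector<Match> &matches)
{
   std::sort(matches.begin(), matches.end(), matchLess);
}
```

The catch is lifetime: the suffix container must not be cleared or resized while matches still point into it, so a call like suffixes.clear() before the output is written would no longer be safe with this layout.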

    Thank you again. Please do have a great day.

    Best Regards,

    New Ink -- Henry
    Kept the text books....
    Went interdisciplinary after college....
    Still looking for a real job since 2005....

    During the interim, I may be reached at ELance, vWorker, FreeLancer, oDesk and WyzAnt.

  2. #2
    laserlight (C++ Witch)
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    21,647
    This:
    Code:
    Etymology(fstream &in, vector<Suffix> v);
    should be:
    Code:
    Etymology(std::istream &in, const std::vector<Suffix>& v);
    I changed the fstream to istream so your interface is more flexible. The vector should be passed by const reference since you don't need to copy it. I qualified the type names because the class definition would likely go in a header file, where using directives and using declarations should be avoided except within a restricted scope.

    Actually, if you want to provide a more generic interface, you should consider that std::vector can have other template arguments, in particular, it has at least an allocator template argument in addition to the element type. One way around this is to take an iterator pair to specify a range.

    Also, if the "No matching suffix" error should prevent the creation of an Etymology object, then you should throw an exception instead of just printing an error message to std::cerr.
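As a sketch of the exception approach, assuming a cut-down Etymology that takes the word as a string rather than reading it from a stream, a hypothetical one-member Suffix struct, and a case-sensitive endsWith helper chosen only to keep the example short:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the Suffix class.
struct Suffix
{
   std::string text;
};

// Case-sensitive suffix test (a simplification of the real requirement).
inline bool endsWith(const std::string &word, const std::string &suffix)
{
   return word.size() >= suffix.size() &&
          word.compare(word.size() - suffix.size(), suffix.size(), suffix) == 0;
}

class Etymology
{
public:
   // Throws std::runtime_error if no suffix matches, so a half-built
   // object can never exist.
   Etymology(const std::string &word, const std::vector<Suffix> &suffixes)
      : word_(word), suffix_()
   {
      for (std::vector<Suffix>::size_type i = 0; i < suffixes.size(); ++i)
      {
         if (endsWith(word, suffixes[i].text))
         {
            suffix_ = suffixes[i];
            return;
         }
      }
      throw std::runtime_error("No matching suffix for: " + word);
   }

   const std::string &word() const { return word_; }
   const std::string &suffix() const { return suffix_.text; }

private:
   std::string word_;
   Suffix suffix_;
};

// Caller side: words without a matching suffix are simply skipped.
inline std::size_t countMatches(const std::vector<std::string> &words,
                                const std::vector<Suffix> &suffixes)
{
   std::size_t matched = 0;
   for (std::vector<std::string>::size_type i = 0; i < words.size(); ++i)
   {
      try
      {
         Etymology e(words[i], suffixes);
         ++matched;
      }
      catch (const std::runtime_error &)
      {
         // No suffix for this word; the caller decides whether to log it.
      }
   }
   return matched;
}
```

With this shape the caller no longer needs an empty() check to detect a failed match, and the decision to print a message moves out of the constructor.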

    Oh, and this:
    Code:
    fstream &operator<< (fstream &out, const Etymology &e);
    should be:
    Code:
    ostream &operator<< (ostream &out, const Etymology &e);
    In the header where Etymology is defined, you would #include <iosfwd>. In the corresponding source file, you would #include <istream> and #include <ostream>. You do need to #include <iostream> for std::cerr, but you only need to #include <fstream> in the file where you need std::fstream.
    C + C++ Compiler: MinGW port of GCC
    Version Control System: Bazaar

    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    Jul 2010
    Location
    Oklahoma
    Posts
    107

    Quote: Originally Posted by laserlight
    This:
    Code:
    Etymology(fstream &in, vector<Suffix> v);
    should be:
    Code:
    Etymology(std::istream &in, const std::vector<Suffix>& v);
    Yes, it should...talk about copies lying around, gee whiz. I overlooked that while I was tinkering with the documentation.

    Quote: Originally Posted by laserlight
    I qualified the type names because the class definition would likely go in a header file, where using directives and using declarations should be avoided except within a restricted scope.
    I declared the standard namespace in all of the headers/driver files; given the small size of the project, it seemed unnecessary to worry about name clashes. But I think it's a great idea to stay in the habit of using fully qualified names.

    Quote: Originally Posted by laserlight
    Actually, if you want to provide a more generic interface, you should consider that std::vector can have other template arguments, in particular, it has at least an allocator template argument in addition to the element type. One way around this is to take an iterator pair to specify a range.
    I started reading about those. It reminded me of the material I read involving the Linux kernel being loaded into random locations in memory. The examples that were at cplusplus.com left something to be desired though. Where might I find more meaningful applications for allocators?

    Quote: Originally Posted by laserlight
    Also, if the "No matching suffix" error should prevent the creation of an Etymology object, then you should throw an exception instead of just printing an error message to std::cerr.
    I thought of that, it does make the routine in the summary's word matching much more elegant. This was only a draft when I found out that the client had closed out without awarding. I didn't want to give them more than they paid for in a prototype.

    Quote: Originally Posted by laserlight
    In the header where Etymology is defined, you would #include <iosfwd>.
    I'll have to look that one up. I was only interested in the file stream interaction, because I didn't expect standard i/o to be used for the extraction of suffix or word structures. It's sort of a batch relationship and occupying the terminal with it seemed too much.

    Thank you for the attention, laserlight.

    Best Regards,

    New Ink -- Henry

  4. #4
    laserlight (C++ Witch)
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    21,647
    Quote: Originally Posted by new_ink2001
    I was only interested in the file stream interaction, because I didn't expect standard i/o to be used for the extraction of suffix or word structures. It's sort of a batch relationship and occupying the terminal with it seemed too much.
    By using istream and ostream instead of fstream, you open up to the possibility of using stringstreams or using standard I/O redirected/piped from a shell script.
