Simple Array or Not?


  1. #1
    Registered User
    Join Date
    Dec 2004
    Posts
    45

    Simple Array or Not?

    Here's what I'm having trouble with:

    Large text file: I must pull out certain rows if they belong to one of 4 groups and put them into a file called results.all. The group code's location in each row is constant at (302,2).

    Then I have to exclude from that file (results.all) any rows matching a list of keycodes. The keycodes are two-digit numbers; some are in sequence, some are not. Their location in each row is constant at (502,2).

    First question: could I use a struct, or should I just use brute force? I need it to be as fast as possible. There are 28 keycodes, so should I make an array like keycode[27]? But how do I get the members associated with each location in the array? Is there an easy way to handle the ones that are in sequence, i.e. [20-34]?

    Any help would be great. Thanks again.

  2. #2
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,422
    Chances are, reading the text file by itself will be the most expensive part of the exercise (disks are really slow by comparison to processors and memory).

    So I'd go for something which is obvious - that is, it's easy to write and easy to see that it's correct.

    Personally, I'd use PERL for such tasks, since it has a large number of built-in functions and data types for field extraction, pattern matching and such like.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.
    I support http://www.ukip.org/ as the first necessary step to a free Europe.

  3. #3
    Registered User
    Join Date
    Dec 2004
    Posts
    45
    You wrote:
    I'd use PERL for such tasks,
    Well, my only option is C++ 6.0, but thanks.

    I'm new to most of this. In past programs I used a struct, but I'm unsure how to apply it to this case.

    Here are the structs I'm looking at now:
    Code:
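    #include <functional>
    #include <string>
    using namespace std;

    // Predicate: does the 2-character field of line a match value?
    // NOTE: x is not defined in this snippet; it is presumably a flag set
    // elsewhere in the program (1 = group field at 302, otherwise keycode field at 502).
    struct Search : unary_function<string, bool>
    {
        string value;
        Search(const string& val) : value(val) {}
        bool operator()(const string& a) const
        {
            if (x == 1)
                return a.compare(302, 2, value, 0, 2) == 0;
            else
                return a.compare(502, 2, value, 0, 2) == 0;
        }
    };

    // Ordering on the 12-character field at offset 258
    // (presumably the phone number in the earlier program).
    struct compare : binary_function<string, string, bool>
    {
        bool operator()(const string& a, const string& b) const
        {
            return a.compare(258, 12, b, 258, 12) < 0;
        }
    };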
    At first this code was meant to search for phone numbers and remove repeats and do-not-call entries. Now the Search struct is passed a line of the main file as "a", along with a 2-character string holding a keycode such as 01, 05, 10 or 20. The keycode line increments so that eventually all of the keycodes are searched for. There is a problem, however: how should I change the compare struct so it doesn't erase repeats?
    All I want it to do is remove the rows that have specific keycodes; there are 28 of them.
    Any help would be great. Sorry for the long post.

  4. #4
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,422
    C++ looking code moved to the C++ board.

  5. #5
    Registered User
    Join Date
    Dec 2004
    Posts
    45
    You wrote:

    C++ looking code moved to the C++ board.
    I don't understand your comment. Did you move me over to the C++ board? If that's the case, thanks; if not, that is where I already am, and I still don't understand your statement.

    Well, anyway, does anyone have answers or even suggestions on how I can get off the ground?

  6. #6
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,422
    > return a.compare(302,2,value,0,2)
    You're complicating the issue.

    1. read a line from a file
    2. use a substring function to extract a part of that line, say y = input.substr(302,2);
    3. compare that with whatever you're looking for, and act appropriately
    4. repeat.

  7. #7
    Registered User
    Join Date
    Dec 2004
    Posts
    45
    Code:
    int Disposition(string processed)
    // does the whole thing over again
    // with a new set of codes at a different location.
    {
        ifstream in("excluded.all"); // has the keycodes to be kept in
        string row;
        string record;
        ofstream out("finished.all");

        while (getline(in, row))
        {
            // the processed file is reopened and reread once per keycode
            ifstream pro(processed.c_str());
            while (getline(pro, record))
            {
                if (record.substr(501, 2) == row)
                {
                    out << record << endl;
                }
            }
        }
        return 0;
    }


    int Annotate(string main) // main file
    /* Searches given file with user entered keycodes and appends
    to the user given file with the extension of .all */
    {
        string row;
        string line; // for data lines
        string filename = "results.all";
        string group = "group.all";
        ofstream fout(filename.c_str());
        ifstream in(group.c_str());

        while (getline(in, row))
        {
            // the main file is reopened and reread once per group code
            ifstream input(main.c_str());
            while (getline(input, line))
            {
                if (line.substr(301, 2) == row)
                    fout << line << endl;
            }
        }

        Disposition(filename); // sends the processed results.all

        return 0;
    }
    I wrote this out and it works, but I need it to go much faster and thought a different approach might help. I had to change the excluded.all file from listing the codes I didn't want to listing the ones I do want, so it went from 28 elements to 72 elements, which means the whole file gets read through that many times. Any suggestions?
    Last edited by Salem; 01-20-2005 at 07:03 AM. Reason: Folding long code lines

  8. #8
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,422
    So how big are these input files?

    For example, if the file opened by ifstream in(group.c_str()); is small, then I'd read the whole thing into a map (pseudocode)
    Code:
    #include <map>
    #include <string>
    
    // map a string to a bool
    std::map<std::string,bool> mymap;
    
    // read the first file into a map
    while (getline(in,row)) {
        mymap[row] = true;
    }
    
    // then process the main file ONCE as well
    while ( getline(input,line) )  {
        // test the key part, to see if it's in the map
        if ( mymap[line.substr(301,2)] ) {
            // yes, so output the line
            fout<<line<<endl;
        }
    }
    Looking things up in a map is pretty efficient, and reading the main file once is far more efficient than scanning it many times.

    Also, you're not closing the files when you're done with each loop.

  9. #9
    Registered User
    Join Date
    Dec 2004
    Posts
    45
    You wrote:
    So how big are these input files?
    Well, the group file (group.c_str()) contains only 4 two-character elements. I converted Annotate over to the map you suggested. But in Disposition, "excluded.all" has about 72 elements; is that too big for a map also? The main file (main.c_str()) can be very big, 50 MB and up. Right now the main file is only 361 KB, and it sickens me that it runs this slowly, about two minutes! In Disposition the codes run from 01 to 99; there are 28 codes that can't be in the file and thus 72 that should be. Each line of the processed file (processed.c_str()) must match one of the 72 elements at (501,2) to go into the finished.all file. I'll go try to implement another map for Disposition.
    Thanks

  10. #10
    and the hat of wrongness Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    32,422
    > But Disposition the "excluded.all" has like 72 elements is that too big for a map also?
    No.
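
    Since the codes here are two digits, a flat 100-entry lookup table is another option alongside a map, and it also makes runs of consecutive codes (the [20-34] from the first post) easy to mark. The sketch below is only an illustration: the CodeTable class and its member names are invented for this example, and the particular codes used with it are placeholders, not the real list of 28.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): a table indexed by the numeric
// value of a two-digit code, true where the code is excluded.
class CodeTable {
public:
    CodeTable()
    {
        for (int i = 0; i < 100; ++i)
            excluded_[i] = false;
    }

    // Mark a single code as excluded.
    void exclude(int code) { excludeRange(code, code); }

    // Mark every code in [lo, hi] as excluded.
    void excludeRange(int lo, int hi)
    {
        assert(0 <= lo && lo <= hi && hi <= 99);
        for (int c = lo; c <= hi; ++c)
            excluded_[c] = true;
    }

    // True if `line` has two digits at zero-based `offset` and that code is
    // excluded. Short lines and non-digit fields count as not excluded.
    bool isExcluded(const std::string& line, std::size_t offset) const
    {
        if (line.size() < offset + 2)
            return false;
        const unsigned char tens = line[offset];
        const unsigned char ones = line[offset + 1];
        if (!std::isdigit(tens) || !std::isdigit(ones))
            return false;
        return excluded_[(tens - '0') * 10 + (ones - '0')];
    }

private:
    bool excluded_[100];
};
```

    A Disposition-style loop would build the table once, for instance with exclude(5) and excludeRange(20, 34), then write out each record for which isExcluded(record, 501) is false. Whether short or malformed lines should be kept or dropped is a choice for the real data; this sketch keeps them.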

  11. #11
    Registered User
    Join Date
    Dec 2004
    Posts
    45
    Everything worked great; the run time was cut to pieces. Pseudocode rocks.
