-
How can I improve this?
I have the following function that finds a substring in a file. It is mainly used to perform lookups without loading the whole file into memory.
Code:
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <utility>

typedef std::pair<long,long> FindData;

//Find position of requested substr
FindData FindSubstr(const std::string& filename, const std::string& str)
{
    //Temporary result to return; (-1,-1) means "not found" or "could not open"
    std::pair<long,long> Temp(-1,-1);
    std::cout << "\n\nInitiating filestream for FindSubstr().\n";
    std::ifstream load(filename.c_str(),std::ios::binary);
    if ( !load.good() )
    {
        std::cout << "-->Could not open filestream. Aborting...\n\n";
        return Temp;
    }
    std::cout << "\nSearching file for substr. Please Wait...\n";
    //===========
    //Find substr
    //===========
    size_t bytes = 0;
    char byte;
    std::string Mem; //Sliding window holding the last str.length() bytes read
    while ( load.get(byte) )
    {
        //Drop the oldest byte once the window is full
        if ( Mem.length() >= str.length() )
        {
            Mem.erase(0,1);
        }
        Mem += byte;
        ++bytes;
        //Check for match
        if ( Mem == str )
        {
            std::cout << "-->Match found after " << bytes << " bytes.\n\n";
            Temp.first = bytes-Mem.length(); //Offset of the first matching byte
            Temp.second = bytes;             //Offset one past the last matching byte
            return Temp;
        }
    }
    std::cout << "-->No match found.\n\n";
    return Temp;
}
It works. I was just wondering what I could do to make it faster, and I'm not really sure where to start. On files over about 3000 KB the lookup is noticeably slow. On files over 30 MB it takes forever (approx. 30 seconds).
Thanks for any help or ideas.
-
Obviously, it's very inefficient to read and process the file one byte at a time. Reading it in larger blocks and searching each block should be much faster.
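To illustrate the block-wise idea, here is a minimal sketch, not a tuned implementation. It keeps the (start, end) return convention of FindSubstr() from the question. The name FindSubstrChunked and the chunkSize parameter are made up for this example. The last str.length()-1 bytes of each block are carried over so that a match spanning two blocks is not missed, and memory use stays bounded by roughly chunkSize plus the pattern length.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<long, long> FindData;

// Hypothetical block-wise variant of FindSubstr(); chunkSize is an
// illustrative tuning knob that does not exist in the original code.
FindData FindSubstrChunked(const std::string& filename,
                           const std::string& str,
                           std::size_t chunkSize = 1 << 16)
{
    FindData result(-1, -1);
    if (str.empty() || chunkSize == 0) return result;

    std::ifstream load(filename.c_str(), std::ios::binary);
    if (!load) return result;

    const std::size_t overlap = str.length() - 1; // bytes carried between blocks
    std::vector<char> block(chunkSize);
    std::string window;   // carried-over tail followed by the current block
    long windowStart = 0; // file offset of window[0]

    while (load)
    {
        load.read(block.data(), static_cast<std::streamsize>(block.size()));
        const std::streamsize got = load.gcount();
        if (got <= 0) break;
        window.append(block.data(), static_cast<std::size_t>(got));

        const std::size_t pos = window.find(str);
        if (pos != std::string::npos)
        {
            result.first = windowStart + static_cast<long>(pos);
            result.second = result.first + static_cast<long>(str.length());
            return result;
        }

        // Keep only the tail that could still be the start of a match.
        if (window.length() > overlap)
        {
            const std::size_t drop = window.length() - overlap;
            windowStart += static_cast<long>(drop);
            window.erase(0, drop);
        }
    }
    return result;
}
```

A call such as FindSubstrChunked("data.bin", "needle") would then replace the original call. The 64 KiB default is only a guess; the best block size depends on the machine and the file system, so it is worth measuring.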
-
I have to admit that the limited buffering requirement is a bit tedious, but there are faster algorithms out there. You could try implementing this one or several others, depending on how this will be used.
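As one example of such an algorithm, here is a rough sketch of Boyer-Moore-Horspool. This is my own pick for illustration and not necessarily the algorithm referred to above. The function name HorspoolFind is made up. It searches an in-memory buffer, so for the file case it would be applied to each block read from disk rather than to the whole file. It skips ahead by up to the pattern length whenever the last byte of the current window cannot be part of a match at that position.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the index of the first occurrence of needle
// in haystack, or std::string::npos if there is none.
std::size_t HorspoolFind(const std::string& haystack, const std::string& needle)
{
    const std::size_t n = haystack.length();
    const std::size_t m = needle.length();
    if (m == 0 || m > n) return std::string::npos;

    // shift[c]: how far the window may advance when its last byte is c.
    // Bytes that do not occur in needle (ignoring its last byte) allow a
    // full skip of m positions.
    std::vector<std::size_t> shift(256, m);
    for (std::size_t i = 0; i + 1 < m; ++i)
    {
        shift[static_cast<unsigned char>(needle[i])] = m - 1 - i;
    }

    std::size_t pos = 0;
    while (pos + m <= n)
    {
        if (haystack.compare(pos, m, needle) == 0) return pos;
        pos += shift[static_cast<unsigned char>(haystack[pos + m - 1])];
    }
    return std::string::npos;
}
```

Whether this beats std::string::find depends on the standard library and on the pattern: longer patterns allow bigger skips, while very short ones gain little. It is worth timing both on real data before committing to either.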