Thread: multiple string searching woes

  1. #1
    Registered User
    Join Date
    Apr 2006
    Posts
    20

    multiple string searching woes

    I am trying to search for multiple patterns (50 of them) in a large log file (around 300 MB) using the strstr function:

    Code:
     	while( getline( ifstreamobj , line_in_logfile ) )
    	{
    		if (strstr(line_in_logfile.c_str(), pattern1) != 0 ) ++Addcounter1 ;
    		if (strstr(line_in_logfile.c_str(), pattern2) != 0 ) ++Addcounter2 ;
    		if (strstr(line_in_logfile.c_str(), pattern3) != 0 ) ++Addcounter3 ;
    		// ... and so on, up to 50 patterns
    
    	}
    I also wrote a script that runs multiple greps, such as "grep -c pattern1 logfile", to do the same job.

    I timed both the C++ program above and the script. The C++ program took twice as long as the script to execute.
    I tried using fopen and fgets instead of fstreams, but got the same results.

    It's really frustrating to be humbled by a shell script.
    Can someone suggest an alternative approach for my C++ program?
    Thanks

  2. #2
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    >It's really frustrating to be humbled by a shell script
    Why? grep can school strstr up and down, left and right, any day of the week.

    >Can someone suggest an alternative approach for my C++ program?
    Pipe the script through it?

  3. #3
    The larch
    Join Date
    May 2006
    Posts
    3,573
    Firstly you could use the string class all the way: find instead of strstr.

    Secondly you could use an array of patterns and an array of counters to save typing. (pattern[n] and counter[n] instead of patternn and countern)

  4. #4
    Registered User
    Join Date
    Mar 2006
    Posts
    725
    You're calling c_str() on every line for every pattern, which may make a copy of the line each time! Remember that the string class may need to copy the whole string into an array, tack a null terminator onto the end, and THEN return it, which is a nice waste of CPU. Just use find(), like anon said.


    EDIT
    I just found out that string just returns its own buffer as a const char* in my implementation, so there wasn't as much overhead as I thought. But it's worth checking out.
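
    One way to check this on a given compiler, sketched under the assumption of C++11 or later, where c_str() and data() are required to return the same pointer. Older implementations were allowed to behave differently. Both function names below are invented for illustration.

```cpp
#include <string>

// Hypothetical check: true when c_str() hands back the string's own buffer
// rather than a freshly made copy. C++11 and later require this.
bool c_str_is_internal_buffer(const std::string& s)
{
    return s.c_str() == s.data();
}

// The same test as strstr(line.c_str(), pattern) != 0, written with find(),
// which never needs a C-style pointer at all.
bool contains(const std::string& line, const std::string& pattern)
{
    return line.find(pattern) != std::string::npos;
}
```

    If the first function returns true, the c_str() calls in the original loop are not copying anything, and the slowdown has to come from somewhere else, such as scanning each line 50 times.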
    Last edited by jafet; 06-07-2006 at 09:23 PM.
