Thread: comparing strings in c

  1. #1
    Registered User
    Join Date
    Jan 2010
    Posts
    12

    comparing strings in c

    Hi, my code is trying to compare two text files. Each file consists of strings, and each string contains integer numbers delimited by commas, as follows:

    file1.txt=
    054,8100,171

    file2.txt=
    054,8100,171
    054,8100,191
    054,8100,171
    054,8100,181
    054,8100,171
    054,8100,171
    054,8100,171

    The problem is that when I compile and run the code, it gives me the wrong output (results)...

    Code:
    #include<stdio.h>
    #include<stdlib.h>
    #include<string.h>
    #include<conio.h>
    
    int main()
    {
    
    int det_strings=0,x;
    FILE *pt1,*pt2;
    pt1=fopen("file1.txt","r");
    pt2=fopen("file2.txt","r");
    
    char c1[11],c2[11]; //declaring a string to hold the content of each file
    
               fgets(c1,11,pt1); //reading the content of file1 into c1
      start: fgets(c2,11,pt2) ;//reading the content of file2 into c2
    
           if(strcmp(c1,c2)==0)
    
    
           {det_strings = det_strings + 1;}
          
    
           x = getc (pt2);
           while(x !=EOF)
    
    
                      goto start;
    
              printf("det_strings=%d\n",det_strings);
    
              getch();
              return 0;
    
             }
    Would anybody help me, please? I have gotten very tired of correcting my code with no results.

    Regards..

  2. #2
    Salem (and the hat of int overfl)
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    Well get rid of the getc() and the goto.
    The getc is mucking up your input.

    Code:
    while ( fgets(c2,sizeof c2,pt2) != NULL ) {
      // do something with each line
    }
    Will read the whole file, one line at a time.

    Also, 11 isn't enough characters for
    "054,8100,171\n\0", which needs 14.
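    To illustrate the buffer-size point, here is a minimal sketch of a line reader built on fgets(). The names read_line and LINE_BUF_SIZE are made up for this example and do not appear in the original code; the size 32 is simply an assumed value comfortably larger than the 14 bytes needed.

```c
#include <stdio.h>
#include <string.h>

/* Assumed size: "054,8100,171" is 12 chars, plus '\n' and '\0', plus slack. */
#define LINE_BUF_SIZE 32

/* Hypothetical helper: reads one line into buf and strips the trailing
   newline. Returns 1 if a whole line was read, 0 at end of file or if the
   line did not fit (a final line without a newline that exactly fills the
   buffer is also reported as not fitting). */
int read_line(char *buf, size_t size, FILE *fp)
{
    size_t len;

    if (fgets(buf, (int)size, fp) == NULL)
        return 0;                      /* end of file or read error */

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';           /* complete line: drop the newline */
        return 1;
    }
    /* No newline: either the last line of the file or a truncated line. */
    return feof(fp) ? 1 : 0;
}
```

    With an 11-byte buffer, fgets() can store at most 10 characters, so the first call stops partway through the line and the next call picks up the rest. That is why the comparisons in the original program never matched.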
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Jan 2010
    Posts
    12

    To you dear...

    Thanks, brother Salem...
    I had already tried the correction you sent me, and many others, but I still got the wrong output,
    as I'll show you..

    Code:
    int main()
    {
    
    int det_strings=0,x;
    FILE *pt1,*pt2;
    pt1=fopen("file1.txt","r");
    pt2=fopen("file2.txt","r");
    
    char c1[11],c2[11]; //declaring a string to hold the content of each file
    
               fgets(c1,11,pt1);   //reading the content of file1 into c1
      start: while(fgets(c2,11,pt2)!=feof);   //reading the content of file2 into c2
        {
            if(strcmp(c1,c2)==0)
    
    
               {det_strings = det_strings + 1;}
    
    
             goto start;
         }
    
              printf("det_strings=%d\n",det_strings);
    
              getch();
              return 0;
    
             }
    But the messages given by the compiler are:
    1-nonportable pointer comparison.
    2-unreachable code.

  4. #4
    Salem (and the hat of int overfl)
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    Did my code have a goto?
    Did my code have a ; at the end of the while loop?

  5. #5
    Registered User
    Join Date
    Jan 2010
    Location
    Germany, Hannover
    Posts
    15
    Code:
    #include<stdio.h>
    #include<stdlib.h>   //for system() and exit()
    #include<string.h>   //for strcmp()
    void pError(char * s){
       printf("%s\n",s);
       system("pause");
       exit(1);
    }
    
    int main(){
    
    int det_strings=0;
    FILE *pt1,*pt2;
    pt1=fopen("file1.txt","r");
    if (pt1==NULL)
       pError("open error f1");
    pt2=fopen("file2.txt","r");
    if (pt2==NULL)
       pError("open error f2");
    
    short bEof2=0;
    char c1[20],c2[20]; //declaring a string to hold the content of each file
    int buf_size=sizeof(c1);
    
    while(((fgets(c1,buf_size,pt1))!= NULL)&&(!bEof2)){
      printf("f1: %s\n",c1);
       if (!bEof2)   {
          if (fgets(c2,buf_size,pt2)==NULL)
             bEof2=1;
          else {
             printf("f2: %s\n",c2);   //show the line just read from file 2
             if(strcmp(c1,c2)==0) 
               det_strings++;
          }
       }
    }
    printf("equal lines=%d\n",det_strings);
    system("pause");
    return 0;
    }
    I left two printf(...) lines in the code, so one can better follow which file is currently being read.
    Now the program only counts the equal lines from the start and stops when the first file is at EOF...

  6. #6
    Salem (and the hat of int overfl)
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    I thought the task was:
    - read ONE line from file 1

    - compare it with all the lines of file 2

    fgets (for file 1)
    while ( fgets (for file 2) )
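    The two steps above could be sketched roughly as follows. The helper name count_matches and the buffer size are assumptions made for this example. The sketch strips the newline before comparing, so a final line without a trailing newline still matches.

```c
#include <stdio.h>
#include <string.h>

#define LINE_BUF_SIZE 32   /* assumed upper bound on line length */

/* Hypothetical helper: counts the lines in fp that are equal to key.
   key must not contain a newline; newlines read from the file are
   stripped before comparing. */
int count_matches(const char *key, FILE *fp)
{
    char line[LINE_BUF_SIZE];
    int matches = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* drop the newline, if any */
        if (strcmp(line, key) == 0)
            matches++;
    }
    return matches;
}
```

    The caller would read the single line from file 1, strip its newline the same way, and pass it in as key together with the FILE pointer for file 2.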

  7. #7
    Registered User
    Join Date
    Jan 2010
    Posts
    12
    Dear Kermitaner...
    I appreciate your efforts in solving my problem.
    Actually, I tested your code many times. It was good enough to compare the 1st line of file1 with the 1st line of file2, and no more.
    Also, when I press any key to continue, it does not continue.
    What I really want is to compare the 1st line of file1 with all lines of file2, one after another,
    and I just want to get the number of matched strings at the end.

    I appreciate your help, and I hope you will continue to help me solve this problem.

  8. #8
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Adel, Hi! Welcome to the forum.

    Could I interest you in doing this 100x faster, and easier, too?

    All we have to do is sort the lines in file2, one time. After that, the search for all the matching strings is trivial to program.

    And I have a great sorter that is so fast it'll knock your socks off!

    It's true that it won't speed up a trivial number of searches in a very small file, *but* give it a large file1 and a large file2, and you'll really see the 100x faster claim I made come true.

    Interested? Here's a little taste.

    Code:
    //8888888  Your old code and some new (untested!) code 88888888888888
    #include<stdio.h>
    #include<stdlib.h>
    #include<string.h>
    #include<conio.h>
    
    int main()
    {
    
      int i, det_strings=0,x;
      FILE *pt1,*pt2;
    
      char c1[80],c2[500][80]; //declaring a string to hold the content of each file
    
      pt1=fopen("file2.txt","rt");
      pt2=fopen("temp.txt","wt");
      
      //load file2 lines into the c2 array
      i = 0;
      while((fgets(c2[i], sizeof c2[i], pt1) != NULL)
         ++i;
      --i; //adjust i one time
      //and sort the array of lines
      quicksort(c2, 0, i);
    
    
      printf("\n\n\t\t\t     press enter when ready");
      i = getchar();
      return 0;
    }
    
    void quicksort(char c2[500][80], int l, int r) {  
      int i, j, pivot, temp;
    
      if(l == r) return; 
      i=l; 
      j=r;
      pivot= A[(l+r)/2]; 
    
      /* Split the array into two parts */
      do {    
        while (A[i] < pivot) i++;   //all comparison lines will be changed to handle strings
        while (A[j] > pivot) j--;
        if (i<=j) {
          temp= A[i];                   //assignments will be change to strcpy()
          A[i]= A[j];
          A[j]=temp;
          i++;
          j--;
        }
      } while (i<=j);
        
      if (l < j) quicksort(A, l, j);
      if (i < r) quicksort(A, i, r);
    }
    
    
    //888888888888888888888888888888888  Your old code 88888888888888888
    //8888888888 Save as some will be used 888888888888888888888888888888
    
      fgets(c1,11,pt1); //reading the content of file1 into c1
      start: fgets(c2,11,pt2) ;//reading the content of file2 into c2
    
           if(strcmp(c1,c2)==0)
    
    
           {det_strings = det_strings + 1;}
          
    
           x = getc (pt2);
           while(x !=EOF)
    
    
                      goto start;
    
              printf("det_strings=%d\n",det_strings);
    
              getch();
              return 0;
    
             }
    This is really for me. When I see someone about to loop needlessly through a file, over and over, it's just a downer.

    Smarter is fun!

    When this is done, the program will use a fast searcher, and you'll learn a lot about Quicksort, and searching, as well.
    Last edited by Adak; 01-25-2010 at 11:35 AM.

  9. #9
    Registered User
    Join Date
    Jan 2010
    Location
    Germany, Hannover
    Posts
    15
    Hello,

    The method Adak showed is of course a better solution.
    However, for practice you should try to change the code yourself. First put the loop that reads the second file inside an outer loop over file #1, and open, read, and close file #2 after each line read from file #1.
    If the program gives the expected results, you could start to optimize it: maybe read the 2nd file into an array and then compare each line with the strings from the array, because comparing strings in memory is much faster than reading from disk.

    Then you could add a counter to the array and put only distinct lines in it, counting the occurrences of each duplicate line in file #2...
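    As a rough sketch of the "read the 2nd file into an array" step, something like the following might do. The names load_lines, count_in_array, MAX_LINES, and LINE_BUF_SIZE are invented for this example, and the limits are assumed values, not anything taken from the thread.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINES 500       /* assumed capacity of the in-memory table */
#define LINE_BUF_SIZE 32    /* assumed upper bound on line length */

/* Reads up to MAX_LINES lines from fp into lines[], without newlines.
   Returns the number of lines stored. */
int load_lines(char lines[MAX_LINES][LINE_BUF_SIZE], FILE *fp)
{
    int n = 0;

    while (n < MAX_LINES && fgets(lines[n], LINE_BUF_SIZE, fp) != NULL) {
        lines[n][strcspn(lines[n], "\n")] = '\0';
        n++;
    }
    return n;
}

/* Counts how many of the first n stored lines are equal to key. */
int count_in_array(char lines[MAX_LINES][LINE_BUF_SIZE], int n, const char *key)
{
    int i, matches = 0;

    for (i = 0; i < n; i++)
        if (strcmp(lines[i], key) == 0)
            matches++;
    return matches;
}
```

    File #2 is then read from disk only once, and every line of file #1 is checked against the array in memory.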

    There are many interesting aspects even in this simple exercise.

  10. #10
    Registered User
    Join Date
    Jan 2010
    Posts
    12
    Dear Adak..
    Hello, I'm really glad to be a member of your great forum, and I'm very glad to receive so many answers from so many members helping to solve my problem.
    Moreover, I would like to thank you so much for your help. I tested your code many times, but each time there were many errors. I tried to debug them but I still got errors. I'm not sure whether this is due to a difference between the compilers we use (my compiler is Borland C++) or due to bugs within the code itself that I couldn't catch. The errors are as follows:
    1- quicksort <--- call to undefined function.
    2- Undefined symbol 'A'.
    3- 'temp' is declared but never used.
    4- start: fgets(c2,11,pt2) ; <--- Declaration syntax error.

    I'm really sorry to disturb you again, but you accepted me as a new member of your forum and offered help, and I still need your assistance.

    My regards...

  11. #11
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    There's a difference between C and C++. I wrote that code on Borland's Turbo C/C++ but ONLY using the C compiler - not the C++ compiler.

    You can ignore the "temp is declared but never used" warning - that just means it's not polished code, and hasn't been gone through to remove anything I was "thinking" about using for a variable, but found out I didn't need it.

    I have more time right now, than I had then, let me run the code, and see what's up. It was meant as "idea" code, more than "finished" code. That's why I called it untested. My idea was that you'd take it and integrate what you wanted from it, into your program.

    Under "Options" ===>> Compiler ===>> can you see what your settings are? Be sure that your compiler options are using the C compiler for any program with a name having a .c extension, OK? C++ programs should use a .cpp file extension, by default.

    And I will go see what's up with my code.

    Oh! Although I pasted in some code from my files, the majority of it was just written off the cuff, in the forum editor. I don't have that program on my computer.

    I'll bet it is rough, then!

    I'll post up the revised version, later this evening. It's got a few rough spots, yet.
    Last edited by Adak; 01-28-2010 at 06:41 PM.

  12. #12
    Registered User
    Join Date
    Jan 2010
    Posts
    12

    compare strings in txt files.c

    Hi everybody..
    I've corrected my code, in which I would like to compare the strings of two txt files:

    file1.txt=
    054,8100,171
    054,8100,191


    file2.txt=
    054,8100,171
    054,8100,191
    054,8100,171
    054,8100,181
    054,8100,171
    054,8100,171
    054,8100,171

    so that it is as follows:
    Code:
    #include<stdio.h>
    #include<stdlib.h>
    #include<string.h>
    #include<conio.h>
    
    void pError(char * s){
       printf("%s\n",s);
    }
    
    int main(){
    
    int equal_lines=0;
    FILE *pt1,*pt2;
    pt1=fopen("file1.txt","r");
    if (pt1==NULL)
       pError("open error f1");
    pt2=fopen("file2.txt","r");
    if (pt2==NULL)
       pError("open error f2");
    
    short q=0;
    char c1[50],c2[50]; //declaring a string to hold the content of each file.
    int buf_size=sizeof(c1);
    
    fgets(c1,buf_size,pt1);
    while(((fgets(c2,buf_size,pt2))!= NULL)&&(!q))
       {
            printf("f1: %s\n",c1);
            printf("f2: %s\n",c2);
    
    
             if(strcmp(c1,c2)==0)
               equal_lines++;
    
    
       }
    
    printf("\n\n equal_lines=%d",equal_lines);
    getch();
    return 0;
    }
    Up to this point, it correctly compares the 1st string of file1 with all strings of file2...
    but now I want to compare the 2nd string of file1 with all strings of file2, and so on consecutively.
    Can anybody help me, please?

  13. #13
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Hi Adel. Yes, we lost a few posts in the last forum crash we just had.

    Here's the last version of the program. It works as you'd expect: the big data file of strings is sorted first, but just once.

    That allows a much faster binary search to be used, thereafter.

    Then the smaller file is read in one line at a time in the "main loop", and each string is tested, continuing until no more lines remain in the smaller file.
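    A minimal sketch of the sort-once, then binary-search idea, assuming the lines are already in memory with their newlines handled consistently. The names compare_rows, lower_bound, count_sorted, and LINE_BUF_SIZE are made up for this illustration. It uses a "first position not less than the key" search, which is a slightly different binary search from the one in the program in this post, because it lands directly on the start of a run of duplicates.

```c
#include <stdlib.h>
#include <string.h>

#define LINE_BUF_SIZE 32   /* assumed upper bound on line length */

/* qsort() comparator for the rows of a char[][LINE_BUF_SIZE] table. */
int compare_rows(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/* Index of the first row that is not less than key (n if there is none). */
static int lower_bound(char rows[][LINE_BUF_SIZE], int n, const char *key)
{
    int lo = 0, hi = n;

    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (strcmp(rows[mid], key) < 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Counts the rows equal to key in a table already sorted with compare_rows. */
int count_sorted(char rows[][LINE_BUF_SIZE], int n, const char *key)
{
    int first = lower_bound(rows, n, key);
    int last = first;

    while (last < n && strcmp(rows[last], key) == 0)
        last++;                 /* walk to the end of the run of duplicates */
    return last - first;
}
```

    The table is sorted once with qsort(rows, n, sizeof rows[0], compare_rows); after that, each lookup costs a handful of comparisons plus the length of the matching run.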




    I'll have to test this to see how it works. I have some more code that was to be added to this program, but never made it. I'll add it on later today.

    I added a counting feature to this program, I'm not sure you need that, but it's there, just in case.


    Code:
    /* sorts strings in a file, counts the number of those
    strings which match the strings from another data file.
    
    status: just a start
    
    */
    
    #include<stdio.h>
    #include<stdlib.h>
    #include<string.h>
    
    #define ROW 1000
    #define COL 15
    
    int binarySearch(char c2[ROW][COL], char c1[COL], int i);
    int compare( const void *a, const void *b);
    int getC2(char c2[ROW][COL], FILE *fpt2);
    void quicksort(char c2[ROW][COL], int l, int r);   
    
    int main() {
    
      int i, j, found, up, down, index, match;
      FILE *fpt1,*fpt2;
    
      char c1[COL]; //declaring a string to hold the content of each file
      char c2[ROW][COL];
    
      printf("\n\n\n\n\n    New Run \n\n");
      i = ROW;
    
      fpt1=fopen("file1.txt","rt");
      fpt2=fopen("file2.txt","rt");
      
      if(fpt1 == NULL || fpt2 == NULL) {
        printf("\n Error opening file1.txt or file2.txt - terminating");
        return 1;   //nonzero exit status signals the failure
      }
      j = getC2(c2, fpt2) + 1;   //getC2 returns the index of the last line read, so j is the line count
    
      //and sort only the lines that were actually read
      printf("\n\n  Sorting - Standby \n");
    
      qsort((void *)c2, j, sizeof(c2[0]), compare);
    
      for(i = 0; i < j; i++) {
        printf("%3d: %s", i, c2[i]);
        if(i % 20 == 0 && i)
          up = getchar();
      }
      up = getchar();
    
      //main processing loop
      putchar('\n'); 
      while((fgets(c1, COL, fpt1)) != NULL) {
         index = binarySearch(c2, c1, j);
    
         if(index > -1) {
           up = down = index;
           while(up >= 0 && (strcmp(c2[up], c1)) == 0)   //stay inside the array
             up--;
    
           while(down < j && (strcmp(c2[down], c1)) == 0)
             down++;
           match = down - up - 1;   //up and down each stop one step outside the run of equal lines
           printf("%d matches of %s", match, c2[index]);
         }
      }
    
      fclose(fpt1);
      fclose(fpt2);
    
      printf("\n\n\t\t\t     press enter when ready");
      i = getchar();
      return 0;
    }
    int binarySearch(char c2[ROW][COL], char c1[COL], int i) {
      int lo, mid, hi;
      lo = 0; 
      hi = i -1;
    
      while(lo <= hi) {
        mid = (lo + hi) / 2;
        if((strcmp(c2[mid], c1)) > 0)
          hi = mid - 1;
        else if((strcmp(c2[mid], c1)) < 0)
          lo = mid + 1;
        else
          return mid; //found, return index
      }
      return -1;      //not found
    }
    
    int getC2(char c2[ROW][COL], FILE * fpt2) {
      int i;
      //load file2 lines into the c2 array
      i = 0;
      while((fgets(c2[i], COL, fpt2)) != NULL) {
         ++i;
         if(i == ROW) break;
      }
      --i; //adjust i one time
      return i;
    }
    
    int compare( const void *a, const void *b)
    {
       return( strcmp(a,b) );
    }
    
    
    /* boneyard now! 
    
    void swap (char ** a, char ** b)
    {
     char * tmp = *a;
     *a=*b;
     *b=tmp;
    }
    
    void printStrings(char * strings[])
    {
    	int i;
    	
    	for (i=0;i<STRING_COUNT;i++)
    	{
    		printf("%d: %s\n",i,strings[i]);
    	}
    }
    */

  14. #14
    Registered User
    Join Date
    Jan 2010
    Posts
    12
    Dear Adak..
    Hi, how are you?
    I've copied your new code, and I'll test it soon.
    But please try to test my code, and send me your opinion about comparing the 2nd string of the 1st file with all strings of file2.

  15. #15
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote: Originally posted by adel
    Dear Adak..
    Hi, how are you?
    I've copied your new code, and I'll test it soon.
    But please try to test my code, and send me your opinion about comparing the 2nd string of the 1st file with all strings of file2.
    Howdy Adel.

    I've strained my elbow badly, keyboard usage is limited severely. This is the final version of how I'd do this kind of work.

    Intelligent indexes would be faster, but not worth the trouble for anything but massive searches, where run time is critical.

    This is very fast - you'll see.

    Your code can't work because it doesn't loop back to get another string from file1.txt. You need a nested loop to do that - I mean an outer loop, enclosing a second complete loop, inside it. Like this:

    Code:
    while (there is a line to get in file1) {
        get that line from file1
        rewind file2 to its start
        while (there is a line to get in file2) {
            get that line from file2
            compare that line with the line from file1
            if they are equal, count it as a match
        }
        print how many matches were found for that line, and the line itself
    }
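    The nested loop outlined above could be sketched in C roughly as follows. This is only an illustration: the function name compare_files and the buffer size are assumptions, and it uses rewind() to restart file2 for each line of file1 (reopening the file would work as well).

```c
#include <stdio.h>
#include <string.h>

#define LINE_BUF_SIZE 32   /* assumed upper bound on line length */

/* For each line in f1, rescans f2 from the start and counts equal lines.
   Prints one result per f1 line and returns the total number of matches. */
int compare_files(FILE *f1, FILE *f2)
{
    char a[LINE_BUF_SIZE], b[LINE_BUF_SIZE];
    int total = 0;

    while (fgets(a, sizeof a, f1) != NULL) {        /* outer loop: file1 */
        int matches = 0;
        a[strcspn(a, "\n")] = '\0';

        rewind(f2);                                 /* start file2 over */
        while (fgets(b, sizeof b, f2) != NULL) {    /* inner loop: file2 */
            b[strcspn(b, "\n")] = '\0';
            if (strcmp(a, b) == 0)
                matches++;
        }
        printf("%d matches of %s\n", matches, a);
        total += matches;
    }
    return total;
}
```

    With the two sample files posted earlier in the thread (two lines in file1, seven in file2), this should report 5 matches for the first line and 1 for the second.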
    The problem with the above algorithm is it's *very* SLOW, and requires a lot of attention from the hard drive.

    It will work, however. It is much faster if you can make the array in the program below as large as possible: hopefully, large enough to hold all the strings you need to search for, and through.

    You may be able to malloc a larger array than you can declare the way I have, below. Something else to check into.
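    A minimal sketch of the malloc idea, assuming a table that starts small and doubles with realloc() when it fills up. The names Line, load_all_lines, and LINE_BUF_SIZE and the starting capacity of 16 are all invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_BUF_SIZE 32   /* assumed upper bound on line length */

typedef char Line[LINE_BUF_SIZE];

/* Reads all lines from fp into a heap-allocated table that grows as needed.
   On success stores the line count in *count and returns the table, which
   the caller releases with free(). Returns NULL if memory runs out. */
Line *load_all_lines(FILE *fp, size_t *count)
{
    size_t used = 0, capacity = 16;
    Line *table = malloc(capacity * sizeof *table);

    if (table == NULL)
        return NULL;

    while (fgets(table[used], LINE_BUF_SIZE, fp) != NULL) {
        table[used][strcspn(table[used], "\n")] = '\0';
        used++;
        if (used == capacity) {                 /* table full: double it */
            Line *bigger = realloc(table, 2 * capacity * sizeof *table);
            if (bigger == NULL) {
                free(table);
                return NULL;
            }
            table = bigger;
            capacity *= 2;
        }
    }
    *count = used;
    return table;
}
```

    Because the rows are fixed-width, the returned table can be handed to qsort() with sizeof(Line) as the element size, the same way a declared two-dimensional array would be.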

    Another way to do this is:

    1) sort and write out all the strings you'll be searching for (file1.txt)
    2) sort and write out all the strings you'll be searching through, (file2.txt)

    Now just work through file1: read a string and search through file2 for its match, sequentially, until you reach the first file2 string > your file1 string. You know it can't be found in the rest of the file, so start searching for the next string in file1, and do the same.

    This is slower than the program below, but not horrible, (not nearly as bad as the first algorithm I described), since you can take advantage of both files' strings, being in sorted order.
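    The two-sorted-lists idea might look something like this sketch. To keep it short it works on arrays of strings already in memory instead of on files; the function name count_sorted_matches and its parameters are hypothetical.

```c
#include <string.h>

/* Both arrays must already be sorted in strcmp() order.
   For each key, counts[k] receives the number of equal strings in data.
   Each array is walked only once, from front to back. */
void count_sorted_matches(const char *keys[], int nkeys,
                          const char *data[], int ndata,
                          int counts[])
{
    int k, d = 0;

    for (k = 0; k < nkeys; k++) {
        /* A repeated key has the same count as the key before it. */
        if (k > 0 && strcmp(keys[k], keys[k - 1]) == 0) {
            counts[k] = counts[k - 1];
            continue;
        }
        /* Skip data strings that sort before this key. */
        while (d < ndata && strcmp(data[d], keys[k]) < 0)
            d++;
        /* Count the run of strings equal to the key. */
        counts[k] = 0;
        while (d < ndata && strcmp(data[d], keys[k]) == 0) {
            counts[k]++;
            d++;
        }
    }
}
```

    Since both lists are sorted, the position d in the data never has to move backwards, so the search never rescans strings it has already passed.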

    Code:
    /* sorts strings in a file, counts the number of those
    strings which match the strings from another data file.
    
    status: ok, but not thoroughly tested
    
    */
    
    #include<stdio.h>
    #include<stdlib.h>
    #include<string.h>
    
    #define ROW 1793
    #define COL 15
    
    int binarySearch(char c2[ROW][COL], char c1[COL], int i);
    int compare( const void *a, const void *b);
    int getC2(char c2[ROW][COL], FILE *fpt2);
    void quicksort(char c2[ROW][COL], int l, int r);   
    
    int main() {
    
      int i, j, found, up, down, index, match;
      FILE *fpt1,*fpt2;
    
      char c1[COL];          //strings for file1.txt (c1) and file2.txt (c2)
      char c2[ROW][COL];
    
      printf("\n\n\n\n\n    New Run \n\n");
      i = ROW;
    
      fpt1=fopen("file1.txt","rt");
      fpt2=fopen("file2.txt","rt");
      
      if(fpt1 == NULL || fpt2 == NULL) {
        printf("\n Error opening file1.txt or file2.txt - terminating");
        return 1;   //nonzero exit status signals the failure
      }
      i = getC2(c2, fpt2) + 1;   //getC2 returns the index of the last line read, so i is the line count
    
      //and sort only the lines that were actually read
      printf("\n\n  Sorting - Standby \n");
      qsort((void *)c2, i, sizeof(c2[0]), compare);
    
    /* shows the sorted strings in the array */
    /*
      for(up = 0; up < i; up++) {
        printf("%3d: %s", up, c2[up]);
        if(up % 20 == 0 && up)
          getchar();
      }
    */
      
      //printf("\n\n %d", strcmp(c2[9], c2[10]));
      //main processing loop
      putchar('\n'); 
      while((fgets(c1, COL, fpt1)) != NULL) {
         index = binarySearch(c2, c1, i);
    
         if(index > -1) {
           up = down = index;
           while(up >= 0 && (strcmp(c2[up], c1)) == 0)   //stay inside the array
             up--;
    
           while(down < i && (strcmp(c2[down], c1)) == 0)
             down++;
           match = down - up - 1;   //up and down each stop one step outside the run of equal lines
           printf("%d matches of %s", match, c1);
         }
         else
           printf("0 matches of %s", c1);
      }
    
      fclose(fpt1);
      fclose(fpt2);
    
      printf("\n\n\t\t\t     press enter when ready");
      getchar();   //wait for enter
      return 0;
    }
    int binarySearch(char c2[ROW][COL], char c1[COL], int i) {
      int lo, mid, hi;
      lo = 0; 
      hi = i -1;
    
      while(lo <= hi) {
        mid = (lo + hi) / 2;
        if((strcmp(c2[mid], c1)) > 0)
          hi = mid - 1;
        else if((strcmp(c2[mid], c1)) < 0)
          lo = mid + 1;
        else
          return mid; //found, return index
      }
      return -1;      //not found
    }
    
    int getC2(char c2[ROW][COL], FILE * fpt2) {
      int i;
      //load file2 lines into the c2 array
      i = 0;
      while((fgets(c2[i], COL, fpt2)) != NULL) {
         ++i;
         if(i == ROW) break;
      }
      --i; //adjust i one time
      return i;
    }
    
    int compare( const void *a, const void *b)
    {
       return( strcmp(a,b) );
    }
    
    
    /* Notes:  
    Strings below are the contents of file1.txt,
    (the small file). file2.txt had 1793 lines 
    in it, made up of multiple copies of this file, 
    along with original variations of this string.
    
    
    042,0101,182
    042,0101,181
    042,0101,183
    042,0101,184
    042,0101,185
    054,0101,194
    054,0101,191
    054,0101,170
    054,0101,193
    054,0101,172
    054,0101,170
    000,test,000  << unique testing string
    054,0101,174
    054,0101,192
    
    */
    If you don't like this program for your needs, then use the nested-loop algorithm I outlined in pseudocode, above. It will work, and is much better than your original, but still similar.
