Thread: comparing data in two files

  1. #1
    Registered User
    Join Date
    Oct 2008
    Posts
    92

    comparing data in two files

    Hi, I am absolutely lost with this program. I have to compare data in two files and, if it matches, write it to a 3rd file. I know it can be done by comparing data while reading from the file, or by storing data in an array and then comparing it.
    I opened the 1st file to compare against, and then I opened the 2nd file, but I do not understand how to do the actual comparison.

    This is as far as I have got. I would appreciate any help or explanations...

    Code:
    #include <stdio.h>
    #include<stdlib.h>
    #include<string.h>
    #define F_WANT "/want.names"
    
    
    int main(int argc, char *argv[])
    {
    
    	FILE *f1, *f2, *out;
    	char want_file[10000]; /* declaring a strings to hold contents of the files*/
    	char topps_file[1000];
    	char new_file[1000];
    	char fn[10];
    
    	int i;
    	int year; 
    	
    	/*open want.names file*/
    	
    	if ((f1=fopen(F_WANT,"r"))==NULL)
    	{
    		printf("Error open file %s.\n", F_WANT);
    		exit(1);
    	}
    	else
    	{
    		int i=0;
    		
    		while(fgets(want_file,sizeof want_file ,f1)!=NULL)   /*put contents of the want.names into want_file*/
    		{
    			fscanf(wants_file,"%*d%s",&input1, &input2);
    					
    					/*   put data into 2 dimentional array */
    
    		}
    		fclose(f1);
    	
    		
    	/*open topps.xx files using for loop to compare contents with want.name */
    		for(i=1; i<=12; ++i) 
    		{
    			year=atoi(argv[i]);
    			sprintf(fn,"topps.%2d",year);
    
    			if((f2=fopen(fn,"r"))==NULL)
    			{
    				printf("Error open file %s\n", fn);
    				exit(2);
    			}
    			else
    			{
    				int i=0;
    				while(fgets(topps_file, sizeof topps_file, f2)!=NULL)
    				{
    					fscanf(topps_file,"%s",&input);
    					topps[i] = input;
    					++i;
    
    				}
    
    				
    
    			}
    			
    		}
    
    	
    	fclose(f2);
    
    	return 0;
    	}
    }

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    That depends on what the data is. == works for everything except strings, where you should use strcmp. I can't tell whether you are trying to ignore some of the data on the line (the %*d says yes, the fact that there's a variable for it to go to says no) or which part of the data you care about and what kind of data it is.
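
    A minimal sketch of the difference, in case it helps. The function names same_text and same_number are made up for illustration and are not from the original program.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 when two pieces of text hold the same characters, else 0.
   Writing a == b on char pointers would only compare the addresses. */
int same_text(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* Plain integers can be compared directly with ==. */
int same_number(int a, int b)
{
    return a == b;
}
```

    So if the numbers are kept as text, same_text("1952", "1952") is the kind of check to use; if they are read with %d, a plain == is enough.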

  3. #3
    Registered User
    Join Date
    Oct 2008
    Posts
    92
    It is character strings. One file contains numbers with names next to them, and the other file contains just numbers. I should find the matching numbers and put them into a 3rd file, but with the names.

  4. #4
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Then there you go. Fix your fscanf line to keep both the number and the name.

    It might not be a bad idea (in fact it is almost certainly a very very very very very good idea) to sort the data that comes in from file 1 to make finding it easier, since you will have to find who knows how many items, unless you know it is already sorted. (I would recommend something sorted as opposed to your un-implemented "put data into two-dimensional array", unless it's supposed to be a 2-D array of chars and your numbers are guaranteed to start at 0 and go sequentially, and even then.)
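
    One possible way to do the sorting and the lookup with the standard library, as a sketch only. The struct entry layout (an int plus a 64-byte name) is an assumption about what a line of file 1 holds, and sort_entries and find_entry are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for one line of file 1: a number and its name. */
struct entry {
    int  number;
    char name[64];
};

/* Orders entries by number; used by both qsort and bsearch. */
static int cmp_entry(const void *a, const void *b)
{
    const struct entry *x = a;
    const struct entry *y = b;
    return (x->number > y->number) - (x->number < y->number);
}

/* Sorts the list once, right after file 1 has been read. */
void sort_entries(struct entry *list, size_t count)
{
    assert(list != NULL);
    qsort(list, count, sizeof *list, cmp_entry);
}

/* Binary search in the sorted list; returns NULL when number is absent. */
const struct entry *find_entry(const struct entry *list, size_t count, int number)
{
    struct entry key;
    assert(list != NULL);
    key.number = number;
    key.name[0] = '\0';
    return bsearch(&key, list, count, sizeof *list, cmp_entry);
}
```

    After one call to sort_entries, every number read from the other files costs a single find_entry call instead of a scan over the whole list.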

  5. #5
    Registered User
    Join Date
    Oct 2008
    Posts
    92
    I thought that even if they are numbers, it is better to read them as characters?

    And maybe it is better to compare as I read from the files, I am just not sure how.

  6. #6
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by nynicue View Post
    I thought that even if they are numbers, it is better to read them as characters?
    It depends on how much you trust the person making the files. If you know they will not have malformed lines, then just using %d would be the easiest, and probably best, way. If you may have to deal with errors, then getline+sscanf is the way to go.
    Quote Originally Posted by nynicue View Post
    And maybe it is better to compare as I read from the files, I am just not sure how.
    You should probably read one file (the base file) completely before starting, and then go through the target file line by line. There's no reason to be reading from both files at the same time -- especially if you have to search through one of the files (slowly).
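
    A sketch of the read-a-line-then-parse approach. It assumes each line looks like "<number> <name>", which the thread does not confirm, and it expects the line to have been read with fgets (getline is POSIX rather than ISO C, so fgets is swapped in here). parse_line is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Parses one line of the assumed form "<number> <name>" that was already
   read into a buffer. Returns 1 on success, 0 if the line is malformed.
   The 63-character limit on the name is an assumption for this sketch. */
int parse_line(const char *line, int *number, char name[64])
{
    assert(line != NULL && number != NULL && name != NULL);
    return sscanf(line, "%d %63s", number, name) == 2;
}
```

    Typical use would be a loop such as while (fgets(buf, sizeof buf, f1) != NULL) that calls parse_line(buf, &n, name) and skips or reports any line for which it returns 0.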

  7. #7
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    What are you really trying to do? Compare LineX in FileA with LineX in FileB to see if they're identical, or are you trying to find out if LineX in FileA exists anywhere in FileB?

    If the former, just run fgets in a loop, reading from both files, and strcmp them both. Otherwise, you'll either be reading them both entirely into memory and comparing each element, or you'll be working through one file while you reread the second over and over (or read the second into memory, and search that).
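
    For the first case (LineX against LineX), the loop might look roughly like this. same_lines is a hypothetical name, and the 1000-byte buffers simply mirror the sizes used earlier in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if both streams yield the same lines in the same order,
   0 otherwise. Lines longer than the buffer are compared in chunks. */
int same_lines(FILE *a, FILE *b)
{
    char la[1000];
    char lb[1000];
    assert(a != NULL && b != NULL);
    for (;;) {
        char *pa = fgets(la, sizeof la, a);
        char *pb = fgets(lb, sizeof lb, b);
        if (pa == NULL || pb == NULL)
            return pa == NULL && pb == NULL;  /* both must end together */
        if (strcmp(la, lb) != 0)
            return 0;                         /* first differing line */
    }
}
```

    Note that the loop is driven by the return value of fgets, not by feof.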


    Quzah.
    Hope is the first step on the road to disappointment.

  8. #8
    Registered User
    Join Date
    Oct 2008
    Posts
    92
    One file needs to be opened just once (it contains a list of names and some numbers for some n years), and then I will need to go through 12 other files (each file contains updated data for just one year, with no names). I need to make a 3rd, updated file that will look like the first one, just with updated info. That is why I opened the first file and then was using a for loop to open the other 12 files one by one.

    Will this work?
    Code:
    while(!feof(topps_file)&&!feof(want_file))
    				{
                        if (strcmp(topps_file, want_file)==0)
    I need to take one line (which contains the year) from one of the 12 files and find whether it exists in the first file. If it does, then search through that year; if not, go on to the next of the 12 files and do the same.

    PS: Wow, even I am not sure I would understand myself...

  9. #9
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Assuming you know the first file isn't something insanely huge:
    Code:
    load first file into memory, such as a linked list or whatever you like
    for each file
        for each line in the file
            see if this line is in the in-memory copy of the first file
            if so, process whatever it is you're processing
    That's probably the easiest way to do it. Alternately:
    Code:
    for each line of file 1
        for each other file
            check the line from file 1 against this file's line
    That requires less storage.
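
    The first outline could be sketched in C along these lines. It assumes every line starts with an integer and uses a plain array rather than a linked list; copying the matching line to the output stands in for "process whatever it is you're processing". load_numbers and copy_matches are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Reads the leading integer of each line into list[];
   returns how many were stored (at most max). */
size_t load_numbers(FILE *in, int *list, size_t max)
{
    char line[1000];
    size_t n = 0;
    assert(in != NULL && list != NULL);
    while (n < max && fgets(line, sizeof line, in) != NULL) {
        if (sscanf(line, "%d", &list[n]) == 1)
            ++n;
    }
    return n;
}

/* Copies to out every line of in whose leading integer appears in list[];
   returns the number of lines copied. */
size_t copy_matches(FILE *in, const int *list, size_t n, FILE *out)
{
    char line[1000];
    size_t copied = 0;
    size_t i;
    int number;
    assert(in != NULL && list != NULL && out != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        if (sscanf(line, "%d", &number) != 1)
            continue;                 /* skip malformed lines */
        for (i = 0; i < n; ++i) {
            if (list[i] == number) {
                fputs(line, out);
                ++copied;
                break;
            }
        }
    }
    return copied;
}
```

    The idea is to call load_numbers once on the first file, then call copy_matches once for each of the other files.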


    Quzah.
    Hope is the first step on the road to disappointment.

  10. #10
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    You could use fgets() and strcmp() to compare lines as you go, or you could use fgetc() to compare each character in turn, something like

    Code:
    /* fragment only: assumes int c and two open FILE pointers, stream1 and stream2 */
    if ( (c = fgetc(stream1)) == fgetc(stream2) && c != EOF)
    {
      fputc(c, stdout);
    }

    but that way could cause problems depending on formatting.
    Last edited by KBriggs; 06-16-2009 at 07:21 AM.

  11. #11
    Registered User
    Join Date
    Oct 2008
    Posts
    92
    Thank you all for your help. I am still not 100% sure I understand how to get this done, but I will work on it.
    Thanks again.

  12. #12
    Registered User
    Join Date
    Oct 2008
    Posts
    92
    Me again,
    I also have another problem: I can't open the topps files for reading. There is a weird thing: I was able to open f1=fopen("/want_names","r") only with a / before want. I was trying to do the same thing with the other files, but it is not working. Why do I have this problem? Am I opening the files wrong somehow?

    I also changed the part of the code where I compare files, but I can't check whether it works until the problem with opening files is fixed.

    Code:
    #include <stdio.h>
    #include<stdlib.h>
    #include<string.h>
    #define F_WANT "/want.names"
    
    
    int main(int argc, char *argv[])
    {
    
    	FILE *f1, *f2, *out;
    	char want_file[10000]; /* declaring a strings to hold contents of the files*/
    	char topps_file[1000];
    	char new_file[1000];
    	char fn[10];
    	char str[10];
    
    	int i;
    	int year; 
    	
    	/*open want.names file*/
    	
    	if ((f1=fopen(F_WANT,"r"))==NULL)
    	{
    		printf("Error open file %s.\n", F_WANT);
    		exit(1);
    	}
    	else
    	{
    		
    		
    		fgets(want_file,sizeof want_file ,f1);  /*put contents of the want.names into want_file*/
    		
    		fclose(f1);
    	
    		
    	/*open topps.xx files using for loop to compare contents with want.name */
    		int i=0;
    		for(i=1; i<=12; ++i) 
    		{
    			year=atoi(argv[i]);
    			sprintf(fn,"topps.%2d",year);
    
    			if((f2=fopen(fn,"r"))==NULL)
    
    				printf("Error open file %s\n", fn);
    				exit(2);
    			
    				fgets(topps_file, sizeof topps_file, f2);
    				
    	/*open file for writing*/
    				out=fopen("new_file.names", "w");
    				if(out==NULL)
    					printf("Error open file %s\n", out);
    					esit(3);
    
    				
    	/*compare and matched data write to new file*/
                    while(!feof(topps_file)||!feof(wants_file))
    				{
                        if((str=strcmp(topps_file, wants_file))==0)
    					
    						fprintf(out, "%s", str)
    				}
    
    				fclose(out);
    			}
    			
    		}
    
    	fclose(f2);
    
    	return 0;
    	}
    }

  13. #13
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    When you use fgets to read a line, it is going to leave the \n (and optionally \r, depending on OS really) on the end of the string you've just read. You'll want to trim that off.
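
    One common way to do the trimming, as a small sketch (trim_newline is a hypothetical helper name):

```c
#include <assert.h>
#include <string.h>

/* Cuts the string at the first '\r' or '\n', if there is one.
   strcspn returns the length of the leading part that contains
   neither character, so a string without a line ending is left alone. */
void trim_newline(char *s)
{
    assert(s != NULL);
    s[strcspn(s, "\r\n")] = '\0';
}
```

    Calling it on the buffer right after each successful fgets keeps the line ending out of any later strcmp.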


    Quzah.
    Hope is the first step on the road to disappointment.

  14. #14
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    "/want_names" is an absolute path, meaning the file want_names must be at the very root of the drive. If the file is not at the very root of your drive, then you're not going to find it like that.

  15. #15
    Registered User
    Join Date
    Oct 2008
    Posts
    92
    Quote Originally Posted by quzah View Post
    When you use fgets to read a line, it is going to leave the \n (and optionally \r, depending on OS really) on the end of the string you've just read. You'll want to trim that off.
    Quzah.
    I thought that with fgets I would put the contents of the whole file into something, or I am just lost.



    Quote Originally Posted by tabstop View Post
    "/want_names" is an absolute path, meaning the file want_names must be at the very root of the drive. If the file is not at the very root of your drive, then you're not going to find it like that.
    All the files are in one directory. /want.names opens with no problem, but the files that have to be opened after it do not.
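
    One way to find out why a particular fopen fails is to print the system's reason along with the exact name that was tried. This is only a sketch: open_for_reading is a hypothetical helper, and it relies on fopen setting errno, which POSIX systems do but ISO C does not strictly promise.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Opens path for reading. On failure, prints the name that was tried
   and the reason reported through errno, then returns NULL. */
FILE *open_for_reading(const char *path)
{
    FILE *f;
    assert(path != NULL);
    errno = 0;
    f = fopen(path, "r");
    if (f == NULL)
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    return f;
}
```

    Seeing the printed name also shows whether the string built by sprintf is really the file name that exists in the directory.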
