Thread: Compare files

  1. #1
    Registered User
    Join Date
    Mar 2004
    Posts
    9

    Compare files

    Hello everyone, this is my first post.

    I need help with my assignment and hope someone can guide me.
    In short, I need to use C to compare two (text) files and see whether their contents are the same or not.
    The percentage of similarity is then printed on screen.

    For example:
    ======================================================
    1. enter source file
    2. enter the file you want to compare against

    Result:
    The % match is: 50%
    ======================================================

    I'm not asking anyone to do it for me, just for some guidance on how to approach it.
    Thanks in advance!

  2. #2
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Code:
    if( fgetc( file1 ) == fgetc( file2 ) )
        match++;
    else
        mismatch++;
    Tweak that a bit. That should suffice.

    Quzah.
    Hope is the first step on the road to disappointment.

  3. #3
    Unleashed
    Join Date
    Sep 2001
    Posts
    1,765
    Quote Originally Posted by quzah
    Code:
    if( fgetc( file1 ) == fgetc( file2 ) )
        match++;
    else
        mismatch++;
    Tweak that a bit. That should suffice.

    Quzah.
    Read the characters in one at a time and compare them. For every success, increment your accumulator/counter one notch. When the process is done, you have a number that says how many matches happened. During the comparing and matching loop, you should also count the total number of characters, so that you can divide your final "comparison successes" number by that total to work out the percentage that appears to be required.
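
A minimal sketch of the counting approach described above, not the code from the assignment. The function name `percent_match` is made up for illustration, and two choices here are assumptions: positions past the end of the shorter file count as mismatches, and two empty files count as a 100% match.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the percentage (0..100) of positions at
 * which the two streams hold the same character.
 * Assumption: positions past the end of the shorter file are mismatches,
 * and two empty files are treated as a 100% match. */
double percent_match(FILE *a, FILE *b)
{
    long match = 0, total = 0;
    int ca, cb;

    assert(a != NULL && b != NULL);

    for (;;) {
        ca = fgetc(a);   /* int, so EOF can be told apart from a real character */
        cb = fgetc(b);
        if (ca == EOF && cb == EOF)
            break;       /* both files are exhausted */
        total++;
        if (ca == cb)
            match++;
    }
    return total == 0 ? 100.0 : 100.0 * match / total;
}
```

Called on two files opened with fopen, this gives the figure to print with something like `printf("The %% match is: %.0f%%\n", result);`. For files containing "abcd" and "abxy" it returns 50.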
    The world is waiting. I must leave you now.

  4. #4
    Registered User linuxdude's Avatar
    Join Date
    Mar 2003
    Location
    Louisiana
    Posts
    926
    If you ever get bored, you can make it more like diff. You could look at their source and have a little fun.

  5. #5
    Registered User
    Join Date
    Mar 2004
    Posts
    9
    Thanks for the posts above.

    I tried the code without any result:

    if( fgetc( file1 ) == fgetc( file2 ) )
    match++;
    else
    mismatch++;

    ==========================
    FILE *pFile1;
    FILE *pFile2;
    int i=0;
    int i2=0;

    pFile1= fopen("source.txt", "r");
    pFile2= fopen("Reference.txt", "r");

    if( fgetc( pFile1) == fgetc( pFile2) )
    i++;
    else
    i2++;
    =============================

    I noticed that this method compares character by character, which might not be a good way.
    I think it's better to compare sentence(s).

    My code:

    char name[100] ;
    int size;

    while(!feof(pFile1))
    {

    fscanf(pFile1,"%s",name);
    size = strlen(name) + 1;

    s[i] = (char *)malloc(size*sizeof(char));
    strcpy(s[i], name);
    i++;
    }

  6. #6
    Registered User
    Join Date
    Mar 2004
    Posts
    9
    Thanks for the posts above.

    I tried the code without any result:

    if( fgetc( file1 ) == fgetc( file2 ) )
    match++;
    else
    mismatch++;

    ==========================
    FILE *pFile1;
    FILE *pFile2;
    int i=0;
    int i2=0;

    pFile1= fopen("source.txt", "r");
    pFile2= fopen("Reference.txt", "r");

    if( fgetc( pFile1) == fgetc( pFile2) )
    i++;
    else
    i2++;
    =============================

    I noticed that this method compares character by character, which might not be a good way.
    I think it's better to compare sentence(s).

    I have done some of it: I copy the entire content, but it is limited to 100 words.
    Can someone suggest a better idea?

    My code:
    char *s[100];
    char name[100] ;
    int size;

    while(!feof(pFile1))
    {

    fscanf(pFile1,"%s",name);
    size = strlen(name) + 1;

    s[i] = (char *)malloc(size*sizeof(char));
    strcpy(s[i], name);
    i++;
    }

  7. #7
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Please use the code tags in future.

    > while(!feof(pFile1))
    Read the programming FAQ
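
The usual fix for a `while(!feof(...))` loop is to test the return value of the read itself. A minimal sketch, assuming the goal is simply to visit every whitespace-separated word; `count_words` is a made-up name, not something from the thread:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts whitespace-separated words in a stream. */
long count_words(FILE *fp)
{
    char word[100];
    long n = 0;

    assert(fp != NULL);

    /* fscanf returns 1 only when a word was actually stored. At end of
     * file it returns EOF, so the loop stops before any stale data is
     * processed. %99s leaves room for the terminating '\0'; a longer
     * word is split across several reads. */
    while (fscanf(fp, "%99s", word) == 1)
        n++;

    return n;
}
```

With `while(!feof(fp))`, the end-of-file flag is only set after a read has already failed, so the loop body typically runs one extra time on whatever was left in the buffer.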
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  8. #8
    Been here, done that.
    Join Date
    May 2003
    Posts
    1,164
    Quote Originally Posted by LKH
    I noticed that this method compares character by character, which might not be a good way.
    I think it's better to compare sentence(s).

    My code:

    char name[100] ;
    int size;

    while(!feof(pFile1))
    {

    fscanf(pFile1,"%s",name);
    size = strlen(name) + 1;

    s[i] = (char *)malloc(size*sizeof(char));
    strcpy(s[i], name);
    i++;
    }
    And why do you think it is better to compare sentences? Your code, if it is better,
    1) mallocs a new buffer (up to 100 bytes) every time through the while loop (100 times through, up to 10,000 bytes malloc'ed and lost), and you never free them;
    2) never does the compare.
    This does not seem to be a better way.
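
If the words really do need to be kept, every malloc needs a matching free. A rough sketch under that assumption; `load_words` and `free_words` are hypothetical names, and the array bound is passed in so the loop cannot run past the end of the array:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads at most max words into words[] and returns
 * how many were stored. Each word is a separate malloc'ed copy. */
size_t load_words(FILE *fp, char *words[], size_t max)
{
    char buf[100];
    size_t n = 0;

    assert(fp != NULL && words != NULL);

    while (n < max && fscanf(fp, "%99s", buf) == 1) {
        words[n] = malloc(strlen(buf) + 1);   /* +1 for the '\0' */
        if (words[n] == NULL)
            break;                            /* out of memory: stop early */
        strcpy(words[n], buf);
        n++;
    }
    return n;
}

/* Releases every copy made by load_words. */
void free_words(char *words[], size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        free(words[i]);
}
```

A caller would declare `char *s[100];`, call `load_words(pFile1, s, 100)`, do the comparison, and then hand the same count to `free_words`.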


    Quote Originally Posted by LKH
    I have done some of it: I copy the entire content, but it is limited to 100 words.
    Can someone suggest a better idea?
    Yes, I suggest going back to Dave's code. It works and there is nothing wrong with it.
    Definition: Politics -- Latin, from
    poly meaning many and
    tics meaning blood sucking parasites
    -- Tom Smothers

  9. #9
    Registered User
    Join Date
    Feb 2004
    Posts
    72
    Quote Originally Posted by linuxdude
    If you ever get bored, you can make it more like diff. You could look at their source and have a little fun.
    I agree. Look into how diff is done. Also look into longest matching substrings and sequence alignment. Or you could take a list of all the words and see how many match, irrespective of their context.

    You can define the comparison any way you want, but if you compare character by character, then adding a single character shifts everything after it, so the 2 files are considered almost entirely different even though they are almost entirely the same.

    It all depends on what the contents of the text files are that you want to compare.
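
One way to make the comparison tolerant of inserted characters is the longest common subsequence (LCS), which is the idea underlying diff-style tools. The sketch below is a plain textbook dynamic-programming version working on two strings already in memory, not diff's actual algorithm; `lcs_length` is a made-up name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: length of the longest common subsequence of a and
 * b, or (size_t)-1 if memory runs out. Uses two rows of the DP table. */
size_t lcs_length(const char *a, const char *b)
{
    size_t n, m, i, j, result;
    size_t *prev, *curr, *tmp;

    assert(a != NULL && b != NULL);
    n = strlen(a);
    m = strlen(b);

    prev = calloc(m + 1, sizeof *prev);   /* row i-1, starts as all zeros */
    curr = calloc(m + 1, sizeof *curr);   /* row i */
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return (size_t)-1;
    }

    for (i = 1; i <= n; i++) {
        for (j = 1; j <= m; j++) {
            if (a[i - 1] == b[j - 1])
                curr[j] = prev[j - 1] + 1;        /* extend a common run */
            else
                curr[j] = prev[j] > curr[j - 1] ? prev[j] : curr[j - 1];
        }
        tmp = prev; prev = curr; curr = tmp;      /* finished row becomes prev */
    }

    result = prev[m];
    free(prev);
    free(curr);
    return result;
}
```

A similarity percentage could then be taken as 100 * LCS length / length of the longer text. For "hello world" against "hello, world" that is 11 out of 12, whereas a position-by-position comparison matches only the first 5 characters. The cost is time proportional to the product of the two lengths, so this suits small files.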

  10. #10
    Registered User
    Join Date
    Mar 2004
    Posts
    9
    Thanks for the replies above. I've been very busy, so I haven't checked the forum for the last few days.

    You might be right; my code is wrong. (I'm very new to C programming.)

    By the way, where is Dave's code? Also, I can't find the code for diff. I checked the link, but no code is given.

  11. #11
    Obsessed with C chrismiceli's Avatar
    Join Date
    Jan 2003
    Posts
    501
    This is the location for diff; you have to be on Linux, though, unless someone is kind enough to upload all the files to one of their servers temporarily. The tarball is here.
    Help populate a c/c++ help irc channel
    server: irc://irc.efnet.net
    channel: #c

  12. #12
    Registered User
    Join Date
    Mar 2004
    Posts
    9
    Hi, this is my program. I have to hand in this assignment tomorrow.
    Please help me by taking a look.

    Please give some advice on mistakes or improvements.

    Done in MS VS 6.0.
    To test this program, you need a few small text files in the same directory as the project (txt files only).

  13. #13
    Registered User linuxdude's Avatar
    Join Date
    Mar 2003
    Location
    Louisiana
    Posts
    926
    fflush(stdin);
    This is bad; see the FAQ. Do you mean fflush(stdout)? It is not good to call main recursively; use a while loop instead. As it stands, I can hit a key that isn't a choice over and over and crash your computer. Don't use feof to control a loop; see the FAQ. I guess you could turn it in like that if your teacher isn't that smart.
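
The attached program is not shown in the thread, so the following is only a guess at what the menu fix might look like: a loop that re-prompts instead of calling main again, and a small helper that discards the rest of the input line instead of `fflush(stdin)`. The names `discard_line` and `read_choice` are invented for this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Throws away the rest of the current input line. Unlike fflush(stdin),
 * whose behaviour the C standard leaves undefined, this is portable. */
void discard_line(FILE *in)
{
    int c;

    while ((c = fgetc(in)) != '\n' && c != EOF)
        ;
}

/* Keeps asking until a number between lo and hi is entered.
 * Returns the choice, or -1 if the input ends first. */
int read_choice(FILE *in, int lo, int hi)
{
    int choice, got;

    assert(in != NULL && lo <= hi);

    for (;;) {                      /* a loop, not a recursive call to main */
        printf("Enter a choice (%d-%d): ", lo, hi);
        fflush(stdout);             /* flushing an output stream is fine */
        got = fscanf(in, "%d", &choice);
        if (got == EOF)
            return -1;
        discard_line(in);           /* drop leftovers, including bad input */
        if (got == 1 && choice >= lo && choice <= hi)
            return choice;
        puts("Invalid choice, try again.");
    }
}
```

In a program like the one described, main would call `read_choice(stdin, 1, 3)` inside its own while loop and leave that loop when the user picks the exit option. Repeated bad keys then just repeat the prompt rather than piling up stack frames.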

  14. #14
    Registered User
    Join Date
    Mar 2004
    Posts
    9
    Thanks for your response!
    Anyone else? Any more comments?

    By the way, my teacher is very strict. He likes to check every single line of code, and especially to compare your work with other students'.
