Thread: comparing fields in a text file

  1. #1
    Registered User
    Join Date
    Aug 2003
    Posts
    93

    comparing fields in a text file

    Hi,

    I have got myself messed up....

    I am trying to read two text files which have these structures
    Code:
    typedef struct {
        char bus_tran_code   [I_BUS_TRAN_CODE_LEN];
        char upddate         [I_UPDDATE_LEN];
        char rectype         [I_RECTYPE_LEN];
        char custcode        [I_CUSTCODE_LEN];
        char usernbr         [I_USERNBR_LEN];
        char username        [I_USERNAME_LEN];
        char custname        [I_CUSTNAME_LEN];
        char nwcode          [I_NWCODE_LEN];
        char confdna         [I_CONFDNA_LEN];
        char dnaver          [I_DNAVER_LEN];
        char gtynbr          [I_GTYNBR_LEN];
        char fromdate        [I_FROMDATE_LEN];
        char todate          [I_TODATE_LEN];
        char rfsdate         [I_RFSDATE_LEN];
        char billdate        [I_BILLDATE_LEN];
        char stopdate        [I_STOPDATE_LEN];
        char invcentre       [I_INVCENTRE_LEN];
        char billflag        [I_BILLFLAG_LEN];
        char linkspeed       [I_LINKSPEED_LEN];
        char prtcoltyp       [I_PRTCOLTYP_LEN];
        char shared          [I_SHARED_LEN];
        char linkname        [I_LINKNAME_LEN];
        char zero1           [I_ZERO1_LEN];
        char stopflag        [I_STOPFLAG_LEN];
        char st_service      [I_ST_SERVICE_LEN];
        char cnxtype         [I_CNXTYPE_LEN];
        char zero2           [I_ZERO2_LEN];
        char optionflag      [I_OPTIONFLAG_LEN];
        char cpeflag         [I_CPEFLAG_LEN];
        char linkcode        [I_LINKCODE_LEN];
        char site            [I_SITE_LEN];
        char nodename        [I_NODENAME_LEN];
    
        char semicolon1      [I_SEMICOLON1_LEN];
        char separator1      [I_SEPARATOR1_LEN];
        } RecordIn1;
    
    
    typedef struct {
        char subscr_no         [I_SUBSCR_NO_LEN];
        char subscr_no_resets  [I_SUBSCR_NO_RESETS_LEN];
        char dnic              [I_DNIC_LEN];
        char dna               [I_DNA_LEN];
        char ud_user_id        [I_UD_USER_ID_LEN];
        char inv_ncc           [I_INV_NCC_LEN];
        char owner_ncc         [I_OWNER_NCC_LEN];
        char st_servtcode      [I_ST_SERVTCODE_LEN];
        char bill_date         [I_BILL_DATE_LEN];
        char stop_date         [I_STOP_DATE_LEN];
        char stop_core_date    [I_STOP_CORE_DATE_LEN];
        char order_entry_date  [I_ORDER_ENTRY_DATE_LEN];
        char rfs_date          [I_RFS_DATE_LEN];
        char equant_stop_date  [I_EQUANT_STOP_DATE_LEN];
        char activity_date     [I_ACTIVITY_DATE_LEN];
    
        char semicolon2        [I_SEMICOLON2_LEN];
        char separator2        [I_SEPARATOR2_LEN];
        } RecordIn2;
    which are dumps from database tables. I want to compare the stop date in file one with the stop date in file two, matching records on RecordIn1.nwcode & RecordIn1.confdna against RecordIn2.dnic & RecordIn2.dna. If the stop date from file one is not found in file two for that key, the record should be written to a third file.

    I have the program opening and closing the files without error, but have come unstuck traversing through the files and checking the fields....

    Code:
       eof1 = fread(&input1, sizeof(RecordIn1), 1, p_fp[IN_FILE1]);
       if(!eof1)
       {
          extracted_records_read++;
          strncpy(input1_key, input1.nwcode, 15);
          *(input1_key+15) = NULL;
       }
       else memcpy(input1_key, HIGH_VALUE, 15);
    
       eof2 = fread(&input2, sizeof(RecordIn2), 1, p_fp[IN_FILE2]);
       if(!eof2)
       {
          exception_records_read++;
          strncpy(input2_key, input2.dnic, 14);
          *(input2_key+14) = '0';
          *(input2_key+15) = NULL;
       }
       else memcpy(input2_key, HIGH_VALUE, 15);
    
       while (!eof1 || !eof2)
       {
          if(!memcmp(input1_key, input2_key, 15))
          {
             memcpy(&output1, &input1, sizeof(RecordIn1));
             memcpy(output1.stop_date, input2.stop_date, 10);
             fwrite(&output1, sizeof(RecordIn1), 1, p_fp[OUT_FILE1]);
             extracted_records_written++;
             exception_records_written++;
    
             eof1 = fread(&input1, sizeof(RecordIn1), 1, p_fp[IN_FILE1]);
             if(!eof1)
             {
                extracted_records_read++;
                strncpy(input1_key, input1.nwcode, 15);
                *(input1_key+15) = NULL;
             }
             else memcpy(input1_key, HIGH_VALUE, 15);
    
             eof2 = fread(&input2, sizeof(RecordIn2), 1, p_fp[IN_FILE2]);
             if(!eof2)
             {
                exception_records_read++;
                strncpy(input2_key, input2.dnic, 14);
                *(input2_key+14) = '0';
                *(input2_key+15) = NULL;
             }
             else memcpy(input2_key, HIGH_VALUE, 15);
          }
          if(!compare(output1.stop_date, input2.stop_date))
             /* write out record */
       }
    unfortunately, the more I change it, the worse it gets


    tia,

  2. #2
    End Of Line Hammer's Avatar
    Join Date
    Apr 2002
    Posts
    6,231
    What does fread() return? You should make sure you read the correct number of bytes. Validate that the read has worked and populated the struct correctly (as a debugging exercise).
    When all else fails, read the instructions.
    If you're posting code, use code tags: [code] /* insert code here */ [/code]

  3. #3
    Registered User
    Join Date
    Aug 2003
    Posts
    93
    fread() is returning 1 in both cases

  4. #4
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    >fread() is returning 1 in both cases

    What are you expecting it to return? You tell it to read 1 item, and it tells you it read 1 item. Because a successful read returns 1, !eof1 and !eof2 are false, so your if and while blocks don't execute when the reads succeed. Try something like this.
    Code:
    size_t items1 = fread(&input1, sizeof(RecordIn1), 1, p_fp[IN_FILE1]);
    if(items1 == 1)
    {
       /* success */
    }
    else
    {
       /* failure */
    }
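    The same check can drive a whole read loop. This is only a sketch: Record is a made-up two-field stand-in for RecordIn1, and count_records is a name invented for the example.

```c
#include <stdio.h>

/* Hypothetical fixed-width record; stands in for RecordIn1. */
typedef struct {
    char nwcode[4];
    char stopdate[10];
} Record;

/* Read records until fread() stops returning 1.
   Returns the number of complete records read, or -1 on an I/O error. */
long count_records(FILE *fp)
{
    Record rec;
    long n = 0;

    while (fread(&rec, sizeof rec, 1, fp) == 1)
        n++;                     /* one complete record is now in rec */

    /* fread() returned 0: end of file, a short trailing record, or an error */
    return ferror(fp) ? -1 : n;
}
```

    Testing the return value against the item count you asked for (1 here) avoids treating it as an end-of-file flag.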
    7. It is easier to write an incorrect program than understand a correct one.
    40. There are two ways to write error-free programs; only the third one works.*

  5. #5
    Registered User
    Join Date
    Aug 2003
    Posts
    93
    ah - ha


    thank you Dave

  6. #6
    Registered User
    Join Date
    Aug 2003
    Posts
    93
    Hi,

    this is the output that I am querying

    /* output: */
    input 1 is 1116
    input 2 is 1116o
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    in the loop
    $

    Code:
    eof1 = fread(&input1, sizeof(RecordIn1), 1, p_fp[IN_FILE1]);
    if(eof1 == 1)
    {
      extracted_records_read++;
      strncpy(input1_key, input1.nwcode, 4);
      *(input1_key+5) = NULL;
    }
    else memcpy(input1_key, HIGH_VALUE, 4);
    
    eof2 = fread(&input2, sizeof(RecordIn2), 1, p_fp[IN_FILE2]);
    if(eof2 == 1)
    {
      exception_records_read++;
      strncpy(input2_key, input2.dnic, 4);
      *(input2_key+5) = NULL;
    }
    else memcpy(input2_key, HIGH_VALUE, 4);
    
    printf("input 1 is %s\n", input1_key);
    printf("input 2 is %s\n", input2_key);
    
       while(fscanf(p_fp[IN_FILE1], "%s", input1) != EOF)
       {
    printf("in the loop\n");
          if(memcmp(input1_key, input2_key, 4))
          {
    printf("match found\n");
             if(!strcmp(input1.stopdate, input2.stop_date) == 0)
             {
    printf("stop date 1 is %s\n", input1.stopdate);
    printf("stop date 2 is %s\n", input2.stop_date);
    I have cut the test files down to "2 records in file 1" and "4 records in file 2". My question is, why is it iterating 16 times?


    tia,


    p.s.
    the reason it is not making the match is the extra character on the second string; I haven't looked at that yet

  7. #7
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    What exactly are you trying to do here:
    Code:
    if(memcmp(input1_key, input2_key, 4))
    {
          printf("match found\n");
          if(!strcmp(input1.stopdate, input2.stop_date) == 0)
          {
                printf("stop date 1 is %s\n", input1.stopdate);
                printf("stop date 2 is %s\n", input2.stop_date);
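    First off, that call is memcmp, and memcmp returns zero when the two keys are equal, so as written "match found" prints only when the keys differ. You probably want !memcmp(...) there.
    Second, !strcmp(...) == 0 is true only if you don't have a match for your date.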
    You may know that, but I was just clarifying in case you didn't.
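
    A minimal sketch of those return-value conventions, assuming a 4-byte key as in the post. The helper names (copy_key, keys_match, dates_differ) are invented for illustration. Note that the posted code copies 4 bytes and then writes the terminator at index 5, which leaves index 4 unset and is a likely source of the stray 'o' in "1116o".

```c
#include <string.h>

#define KEY_LEN 4   /* assumed key width, matching the 4 used in the post */

/* Copy a fixed-width field (no terminator in the file) into a C string.
   dst must hold at least KEY_LEN + 1 bytes. */
void copy_key(char *dst, const char *field)
{
    memcpy(dst, field, KEY_LEN);
    dst[KEY_LEN] = '\0';   /* terminate at index KEY_LEN, not KEY_LEN + 1 */
}

/* memcmp() returns 0 when the buffers are equal, so test for == 0. */
int keys_match(const char *a, const char *b)
{
    return memcmp(a, b, KEY_LEN) == 0;
}

/* strcmp() also returns 0 for equal strings; nonzero means the dates differ. */
int dates_differ(const char *a, const char *b)
{
    return strcmp(a, b) != 0;
}
```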

    Also, in the snippet you've given us, the loop reads into 'input1' but never rebuilds the keys from it. Were you meaning to change the key values somehow?
    Code:
    while(fscanf(p_fp[IN_FILE1], "%s", input1) != EOF)
    {
        printf("in the loop\n");
        ...the rest of the above...
    All the loop does is overwrite 'input1' over and over; the keys built in your first code excerpt are never updated. Your code example is incomplete, so there is no real way for me to tell you exactly where your problem is.
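
    One possible explanation for the iteration count, offered as a guess since the data files aren't shown: "%s" reads one whitespace-delimited token per call, not one record, so a file holding 2 records with embedded spaces produces more than 2 iterations. Passing the struct itself to "%s" instead of a char buffer is also undefined behaviour. The sketch below uses a hypothetical count_tokens helper to show what such a loop counts.

```c
#include <stdio.h>

/* Count whitespace-delimited tokens, which is what a "%s" loop iterates over. */
long count_tokens(FILE *fp)
{
    char word[64];
    long n = 0;

    while (fscanf(fp, "%63s", word) == 1)   /* width limit prevents overflow */
        n++;
    return n;
}
```

    For fixed-width records, an fread() loop that checks for a return value of 1 iterates once per record instead.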

    Quzah.
    Hope is the first step on the road to disappointment.

  8. #8
    Registered User
    Join Date
    Aug 2003
    Posts
    93
    Hi Quzah,

    to be fair, you have practically worked out what I want to do already, but here is an overview:

    read two text files [ that both have a field with a date ]
    grab the date from the first record in the first file and look at all the dates in the second file
    if the date does not exist, create a third file with the record from the first file
    repeat until all records in the first file have been read



  9. #9
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Ok, so you have two files. You want an output file which contains only the dates in the first file that are not in the second?

    There are possibly a few better ways to do it:

    Method One:
    Read the first file into memory, store it in a tree, a linked list, whatever.
    Read the second file a line at a time.

    If the date isn't in the tree, ignore it.
    If the date is in the tree, mark it as found.
    Repeat until you're done with file two.

    Go through the tree or list, and write to file any record that isn't marked as found.
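
    Method One could be sketched roughly like this. Everything here is an assumption made for illustration: Rec is a cut-down two-field record, a fixed-size array with a linear search stands in for the tree or list, and MAX_RECS is an arbitrary cap.

```c
#include <stdio.h>
#include <string.h>

#define DATE_LEN 10
#define MAX_RECS 1000            /* arbitrary cap for this sketch */

/* Hypothetical cut-down record shared by both files. */
typedef struct { char key[4]; char stop_date[DATE_LEN]; } Rec;

/* Hold file 1 in memory, mark the dates that file 2 contains, then write
   the unmarked records. Returns records written, or -1 if file 1 is too big.
   The static buffers keep the stack small but make this non-reentrant. */
long method_one(FILE *in1, FILE *in2, FILE *out)
{
    static Rec recs[MAX_RECS];
    static char found[MAX_RECS];
    size_t n = 0, i;
    Rec tmp;
    long written = 0;

    while (fread(&tmp, sizeof tmp, 1, in1) == 1) {
        if (n == MAX_RECS)
            return -1;
        recs[n] = tmp;
        found[n] = 0;
        n++;
    }

    /* One pass over file 2, marking every file 1 record with the same date. */
    while (fread(&tmp, sizeof tmp, 1, in2) == 1)
        for (i = 0; i < n; i++)
            if (memcmp(recs[i].stop_date, tmp.stop_date, DATE_LEN) == 0)
                found[i] = 1;

    for (i = 0; i < n; i++)
        if (!found[i] && fwrite(&recs[i], sizeof(Rec), 1, out) == 1)
            written++;
    return written;
}
```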

    Method Two:
    Read the second file into memory, as per the method outlined above. (Tree, LL, etc.)
    Read the first file a line at a time.
    If it is found in the tree, ignore it.
    If it is not, write it to disk.
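
    A rough sketch of Method Two, under the same kind of assumptions: Rec is a hypothetical cut-down record, and a sorted array searched with bsearch() stands in for the tree. MAX_DATES is an arbitrary cap.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DATE_LEN  10
#define MAX_DATES 1000           /* arbitrary cap for this sketch */

/* Hypothetical cut-down record shared by both files. */
typedef struct { char key[4]; char stop_date[DATE_LEN]; } Rec;

/* Compare two fixed-width dates byte by byte. */
static int cmp_date(const void *a, const void *b)
{
    return memcmp(a, b, DATE_LEN);
}

/* Load the dates of file 2, sort them, then stream file 1 and write every
   record whose date is absent. Returns records written, -1 if file 2 is too big. */
long method_two(FILE *in1, FILE *in2, FILE *out)
{
    static char dates[MAX_DATES][DATE_LEN];
    size_t n = 0;
    Rec r;
    long written = 0;

    while (fread(&r, sizeof r, 1, in2) == 1) {
        if (n == MAX_DATES)
            return -1;
        memcpy(dates[n++], r.stop_date, DATE_LEN);
    }
    qsort(dates, n, DATE_LEN, cmp_date);

    while (fread(&r, sizeof r, 1, in1) == 1) {
        if (bsearch(r.stop_date, dates, n, DATE_LEN, cmp_date) == NULL
            && fwrite(&r, sizeof r, 1, out) == 1)
            written++;
    }
    return written;
}
```

    Only the dates from file 2 are kept in memory here, and file 1 is never held in full.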

    Those are two ways you could do it. Either of the two methods would consume more memory than what you're doing, but they would be faster than:

    1. Open file 1.
    2. Open file 2.
    3. Read a line from file 1.
    4. Read a line at a time looking for a match.
    5. If no match, write the first line to a third file.
    6. If found, ignore.
    7. Rewind or close file 2.
    8. Go to step 2 or 3, depending on rewind or close, until done with file 1.
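
    For comparison, the slower rewind approach in the numbered steps might look something like this. It is again only a sketch, using a hypothetical cut-down Rec and assuming both files are already open.

```c
#include <stdio.h>
#include <string.h>

#define DATE_LEN 10

/* Hypothetical cut-down record shared by both files. */
typedef struct { char key[4]; char stop_date[DATE_LEN]; } Rec;

/* For each record in file 1, rescan file 2 from the top looking for the
   same date. Returns the number of unmatched records written to out. */
long naive_scan(FILE *in1, FILE *in2, FILE *out)
{
    Rec r1, r2;
    long written = 0;

    while (fread(&r1, sizeof r1, 1, in1) == 1) {
        int found = 0;

        rewind(in2);                         /* start file 2 from the top */
        while (!found && fread(&r2, sizeof r2, 1, in2) == 1)
            found = memcmp(r1.stop_date, r2.stop_date, DATE_LEN) == 0;

        if (!found && fwrite(&r1, sizeof r1, 1, out) == 1)
            written++;
    }
    return written;
}
```

    It needs almost no memory, but it rereads file 2 once for every record in file 1.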

    Quzah.

  10. #10
    Registered User
    Join Date
    Aug 2003
    Posts
    93
    thanks Quzah,


    that gives me something to think about,

    Last Post: 07-09-2004, 09:54 AM