Thread: binary file processing - fread/fwrite + buffer problems

  1. #1
    Registered User
    Join Date
    Apr 2009
    Location
    Europe
    Posts
    6


    My code *somewhat* works, but it still doesn't do what it's supposed to do.

    I'm trying to read the contents of a binary file into a 1 KiB buffer one chunk at a time, modify the buffer, and write it back to the same place in the file, continuing until:
    position >= filesize - buffersize
    When that happens, at most one buffer's worth of data is left, so the last part of the file is:
    filesize - position
    bytes long. At that point I shrink buffersize to that value, read that much data, modify it, and write it back.
    That's the theory, and now you'll get to see my crappy code. I'm sure there are lots of reasons why it doesn't work, but I came here to learn, so go ahead and tell me if you see a mistake.

    Code:
    void modifyfile(unsigned char *file) {
        char *BUFFER[1024];        // this value doesn't really make it 1024 bytes big. 
                                   // Array starts at 0, so it should be BUFFER[1023] to make it 1KiB
                                   // wrong???
        long BUFFERSIZE = (sizeof(BUFFER) / sizeof(BUFFER[0]));
        long position;
        long filesize;
        unsigned int i; 
      
        FILE *file_ptr;
    
        // open file
        unsigned long int num;
        if ((file_ptr = fopen(file, "rb+")) == NULL) {
            printf("Error opening file: %s \n", file);
            exit(1);
        }
    
        // check the filesize    
        if ((fseek(file_ptr, 0, SEEK_END)) != 0) {
            printf("Error in seek operation on file %s : errno \n", file);
            exit(1);
        }
    
        // set filesize variable    
        filesize = ftell(file_ptr);
    
        // go back
        rewind(file_ptr); 
    
        // initial position
        position = ftell(file_ptr); 
    
        // we're not at the last part of file
        while (position < (filesize - sizeof(BUFFER))) {
            // read BUFFERSIZE worth of data(should be 1KiB, but isn't)
            fread(BUFFER, sizeof(BUFFER[0]), BUFFERSIZE, file_ptr);
    
            // move the position in file BACK
            if ((fseek (file_ptr, position, SEEK_SET)) != 0) {
                printf("Error in seek operation on file %s : errno \n", file);
                exit(1);
            }
    
            // modify data, this *should*(but doesn't) let us 
            // modify each byte in buffer one by one
            for(i = 0 ; i < BUFFERSIZE ; i++) {
                BUFFER[i] = 0x00;    // an example - fill with zeros - 0x41 should fill with ASCII '!'
            }
    
            // write the modified buffer back to file
            fwrite(BUFFER, sizeof(BUFFER[0]), BUFFERSIZE, file_ptr); 
    
            // now remember the advanced position
            // I guess I could also use something like: 
            // position += BUFFERSIZE; 
            position = ftell(file_ptr); 
        }
    
        // we're now at the last part of file 
        if (position > (filesize - sizeof(BUFFER))) {
            // recalculate buffersize 
            BUFFERSIZE = ((filesize - position)/sizeof(BUFFER[0]));
            // read BUFFERSIZE worth of data(now it should be 1KiB or smaller)         
            fread(BUFFER, sizeof(BUFFER[0]), BUFFERSIZE, file_ptr);
    
            // move the position in file BACK
            if ((fseek (file_ptr, position, SEEK_SET)) != 0) {
                printf("Error in seek operation on file %s : errno \n", file);
                exit(1);
            }
     
            // same as before
            for(i = 0 ; i < BUFFERSIZE ; i++) {
                BUFFER[i] = 0x00;    // an example - fill with zeros
            }
    
            // write the modified buffer back to file
            fwrite(BUFFER, sizeof(BUFFER[0]), BUFFERSIZE, file_ptr);
        }
        fclose(file_ptr);
        printf("processed file: %s\n", file);  
        printf("file size: %db\n", filesize);  
        printf("BUFFER size: %d\n", BUFFERSIZE);  
    }
    Last edited by chain; 04-26-2009 at 06:46 PM. Reason: corrections in comments
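    A side note on the two size tests in the code above, offered as a sketch and not as a diagnosis of the reported corruption. The expression filesize - sizeof(BUFFER) mixes a long with a size_t, and on common platforms that subtraction is carried out in an unsigned type, so a file smaller than the buffer yields a huge value instead of a negative one. In addition, position == filesize - sizeof(BUFFER) satisfies neither the while condition (<) nor the if condition (>), so a final chunk of exactly one full buffer appears to be skipped. Computing the chunk length from a signed remainder avoids both. The helper name next_chunk and its parameters are hypothetical, not from the post.

```c
#include <stddef.h>

/* Hypothetical helper, not from the original post: how many bytes to
 * process next, given the file size, the current offset, and the buffer
 * capacity. Returns 0 once the offset has reached the end of the file. */
size_t next_chunk(long filesize, long position, size_t capacity)
{
    /* Kept as a signed long, so a small file gives a small (or negative)
     * remainder instead of a huge unsigned value. */
    long remaining = filesize - position;

    if (remaining <= 0)
        return 0;
    if ((unsigned long)remaining < capacity)
        return (size_t)remaining;   /* short final chunk */
    return capacity;                /* a full buffer is available */
}
```

    With this, a single loop of the form "while ((n = next_chunk(filesize, position, sizeof buffer)) > 0)" would cover both the full chunks and the last partial one.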

  2. #2
    Registered User
    Join Date
    Apr 2009
    Location
    Russia
    Posts
    116
    Code:
        char *BUFFER[1024];
    to

    Code:
        char buffer[1024];
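    To make the difference concrete, here is a small sketch comparing the two declarations. char *BUFFER[1024] is an array of 1024 pointers, so its size in bytes is 1024 times the pointer size, while char buffer[1024] is exactly 1024 bytes. Either way, [1024] means 1024 elements with indices 0 to 1023, so there is no need to write [1023]. The function names below are made up for illustration.

```c
#include <stddef.h>

enum { N = 1024 };

/* Size in bytes of an array of N pointers to char (what the original
 * declaration created): N * sizeof(char *). */
size_t pointer_array_bytes(void)
{
    return sizeof(char *[N]);
}

/* Size in bytes of an array of N chars: exactly N, because sizeof(char)
 * is 1 by definition. */
size_t char_array_bytes(void)
{
    return sizeof(char[N]);
}
```

    On a typical 32-bit build the pointer array would occupy 4096 bytes, which is why sizeof(BUFFER) in the loop condition did not mean 1 KiB.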

  3. #3
    Registered User
    Join Date
    Apr 2009
    Location
    Europe
    Posts
    6
    I just fixed that in my code, but I haven't updated the code in the thread yet. It's still not working...
    The thing is that the processed files are OK for the first 2048 bytes, and then there's some kind of data corruption. It looks as if the fseek calls don't work as they should. And why does it fail the third time data gets read into and written from the buffer?

    *update*
    After poking around with a hex editor, it turned out that after the second time the buffer gets filled, it doesn't get refilled with new data; the program just keeps filling the file with the second 1024 bytes of the file until it reaches the end. I have no idea how to fix that, simply because I don't know where the mistake is! I can't find it!
    That's in a test version with the following code removed:
    Code:
        for(i = 0 ; i < BUFFERSIZE ; i++) {
                BUFFER[i] = 0x00;    // an example - fill with zeros
            }
    So it should simply rewrite the file with its own contents. But the file just gets broken!
    Last edited by chain; 04-26-2009 at 08:18 PM. Reason: update

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Code:
        // set filesize variable    
        filesize = ftell(file_ptr);
    
        // go back
        rewind(file_ptr); 
    
        // initial position
        position = ftell(file_ptr); 
    Unless you plan on changing the rewind() call, the ftell() call that follows it is a complicated way of saying "position = 0;"
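    As a sketch of the same idea, the size query can be wrapped in one small function that leaves the stream at offset 0, after which the caller can simply start with position = 0. The name stream_size is hypothetical. Note that seeking to SEEK_END on a binary stream is not something the C standard guarantees to be meaningful, although it works on common desktop platforms.

```c
#include <stdio.h>

/* Hypothetical helper: size in bytes of an open binary stream, or -1 on
 * error. On success the stream is left positioned at offset 0. */
long stream_size(FILE *fp)
{
    long size;

    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    size = ftell(fp);   /* offset of the end == number of bytes */
    rewind(fp);         /* position is now 0; no second ftell needed */
    return size;
}
```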

    Code:
     // an example - fill with zeros - 0x41 should fill with ASCII '!'
    Incorrect comment: 0x41 is ASCII 'A', and '!' is 0x21. Either change the value to 0x21 or the character in the comment to 'A'.

    I'm not sure what you are experiencing in your case, but I tried a few files, and it appears to do what I expect it to do, including setting the content to zero.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  5. #5
    Registered User
    Join Date
    Apr 2009
    Location
    Europe
    Posts
    6
    Thanks matsp!
    I've initialized position to 0 at the beginning of the function.

    Filling with zeros, or any other value, was just an example. If I wanted to XOR every byte in the file, for example, it would be important to have the correct data in the buffer. I removed the byte-modifying loop to make sure that the file gets rewritten correctly. It all started working after I added:
    Code:
    setvbuf(file_ptr, BUFFER, _IOFBF, BUFFERSIZE);
    after opening the file, but I don't really know why it made everything work. Without setvbuf, the first 1024-byte block of the original file was rewritten correctly, and after that the file was filled with repetitions of the second 1024-byte block of the original file. So the first 2048 bytes of the file were OK, but the rest was just trash.
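    For reference, a hedged sketch of how setvbuf is normally used. This does not explain why the call changed the behaviour here. Two points from the C standard are worth keeping in mind: setvbuf has to be called after the stream is opened and before any other operation on it, and the contents of the array handed to setvbuf are indeterminate at any time, so the stream's buffer should be a different array from the one passed to fread/fwrite. The helper name open_with_own_buffer is made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: open `path` for binary update and give the stream
 * its own fully buffered buffer. `stream_buf` must stay alive until the
 * stream is closed and must not be used for anything else meanwhile. */
FILE *open_with_own_buffer(const char *path, char *stream_buf, size_t size)
{
    FILE *fp = fopen(path, "rb+");

    if (fp == NULL)
        return NULL;

    /* Must come before any read, write, or seek on fp.
     * setvbuf returns nonzero if the request cannot be honored. */
    if (setvbuf(fp, stream_buf, _IOFBF, size) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

    A caller would keep two arrays, for example "static char stream_buf[1024];" for the stream and "unsigned char data[1024];" for fread/fwrite.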

  6. #6
    Registered User
    Join Date
    Apr 2009
    Location
    Europe
    Posts
    6
    I keep thinking about it, and I can't really understand why adding setvbuf() made my function work flawlessly when it didn't work before. I would be very grateful if someone knowledgeable could explain it to me. I added setvbuf() because I had a hunch that it could help, but I don't really understand HOW it makes it all work.

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I don't quite understand that either - what compiler and OS are you using?

    It may help to do an "fflush(file_ptr)", but I doubt it actually is what you need.
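    Some background on why a flush or seek can matter, given as a sketch and not as a confirmed diagnosis. For streams opened in an update mode such as "rb+", the C standard says output must not be directly followed by input without an intervening fflush or positioning call (fseek, fsetpos, rewind), and input must not be directly followed by output without a positioning call unless the read hit end-of-file. The loop in post #1 seeks between fread and fwrite, but goes from fwrite straight to the next fread. The function below is a hypothetical rewrite (the name xor_file_in_place and the XOR key parameter are assumptions for illustration) that repositions in both directions and uses fread's return value so the last partial chunk needs no special case.

```c
#include <stdio.h>

/* Hypothetical helper, not from the thread: XOR every byte of the file at
 * `path` with `key`, in place, one 1 KiB chunk at a time.
 * Returns 0 on success and -1 on any I/O error. */
int xor_file_in_place(const char *path, unsigned char key)
{
    unsigned char buf[1024];
    long pos = 0;                   /* offset of the chunk being processed */
    size_t n, i;
    FILE *fp = fopen(path, "rb+");

    if (fp == NULL)
        return -1;

    /* fread reports how many bytes it actually read, so a short final
     * chunk is handled by the same code as a full one. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (i = 0; i < n; i++)
            buf[i] ^= key;

        /* read -> write: go back to the start of this chunk */
        if (fseek(fp, pos, SEEK_SET) != 0)
            goto fail;
        if (fwrite(buf, 1, n, fp) != n)
            goto fail;

        /* write -> read: a positioning call (or fflush) is needed
         * before the next fread on an update stream */
        pos += (long)n;
        if (fseek(fp, pos, SEEK_SET) != 0)
            goto fail;
    }

    if (ferror(fp))
        goto fail;
    return fclose(fp) == 0 ? 0 : -1;

fail:
    fclose(fp);
    return -1;
}
```

    Calling it twice with the same key should restore the original file, which makes it easy to check with a hex editor.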

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  8. #8
    Registered User
    Join Date
    Apr 2009
    Location
    Europe
    Posts
    6
    I'm using gcc (MinGW) 3.4.5 on Windows XP SP3.

    Code:
    C:\MinGW\bin>gcc --version
    gcc (GCC) 3.4.5 (mingw special)
    Copyright (C) 2004 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
