Thread: writing to binary

  1. #1
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168

    writing to binary

Is there a difference between writing to a binary file one number at a time, as opposed to, let's say, 1000 numbers at a time?

Is there time used up just by the fwrite call itself?

I have a big file to write and at the moment it takes 20 days. That's with preliminary code, writing one file at a time, with no multithreading.

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Of course there's a difference.

Each call to fwrite() has an overhead, so whether you write one number at a time or 1000 at a time will make a noticeable difference. If you have a huge amount of data, you may want to use an even bigger number than 1000.

    --
    Mats

  3. #3
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
OK, so the number of numbers in the array I write will be limited by my RAM, correct?

I'm using integers, so 16 bits = 2 bytes per number.

How much extra space do I have to allow to keep my computer from crashing?
(Windows XP Pro)

I mean, if I have 2GB RAM, I need to leave X MB for Windows.

  4. #4
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
A standard integer is 4 bytes; a short int only uses 2. If your program is crashing and you are using arrays, are you sure you aren't blowing the stack?

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
How much memory Windows itself uses depends on so many things that it's completely impossible to give you a fixed number that will work at all times on different systems and under different circumstances. As a rough guideline, you will need to leave something like
Code:
64 * 1024 * 1024 + (8 * memorysize_in_bytes / 4096) * number_of_apps_running
That is in bytes.

The long calculation estimates how much memory the page tables take up; the 64MB is just a guesstimate of how much memory the kernel, a few drivers, etc. are using.

Note also that the amount of data you can write with fwrite is definitely not limited by the amount of RAM in your machine; I could easily fill any size disk with this:
Code:
#include <stdio.h>

int main(void)
{
   FILE *f = fopen("something.bin", "wb");
   long long buf[1000];
   long long n = 0;
   int i;
   if (f == NULL)
      return 1;
   for(;;) {
      for(i = 0; i < 1000; i++)
          buf[i] = n+i;
      n += 1000;
      if (fwrite(buf, sizeof(buf), 1, f) != 1)
        break;
   }
   fclose(f);
   return 0;
}
[No, it isn't complete, nor does it do good error checking]

    --
    Mats

  6. #6
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
My program isn't crashing, it just runs too slowly.

Oh, and I meant I will have to make an array to be written by fwrite.
So is the size of that array limited by my RAM?
    Last edited by chico1st; 08-16-2007 at 10:28 AM.

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
Let's get one thing clear: as you increase the amount of data you send to fwrite, the overhead per item for calling fwrite is reduced. Say we do this:
fwrite(b, s, 1, f); overhead per item = 1
fwrite(b, s, 10, f); overhead per item = 0.1
fwrite(b, s, 100, f); overhead per item = 0.01

As you can see, the overhead per item written falls quite quickly.

Eventually, the difference between calling fwrite with one count and a larger one will be so small that you can't tell the difference, at which point making your storage array larger will not help much.

I'd say something like 32 or 64KB is large enough that the call overhead no longer makes any difference - but by all means, measure the performance for various sizes of data [if you use a named constant for the number of items per fwrite, you can quite easily change the size from one number to another].

    --
    Mats

  8. #8
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
Hmmm, just by counting with my metronome I can see:
significant improvements going from 1 000 to 10 000 (x2 speed)
negligible improvements going from 10 000 to 100 000 (x12/11 = x1.091)
and the program crashes at 1 000 000

So I'll stop at 10 000 elements per column, with 20 columns.

Sweet, thanks man.

  9. #9
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by chico1st View Post
    is there a difference between writing to a binary file, 1 number at a time, as opposed to lets say 1000 numbers at a time?
    It doesn't matter much. The entire point of a buffered I/O layer like stdio is to make it efficient to do small reads and writes. The library buffering does take time, but not nearly as much as if you'd called the operating system directly a bunch of times with a bunch of small chunks of data.

    Suppose you wanted to write 1000 objects at a time instead of 1. So you write some code which buffers 1000 objects in a buffer and then calls fwrite() once. But what you've done is basically duplicated what fwrite() would have done anyway.

    Trust the stdio buffering. Do your writes however is most convenient. If it's still too slow, there may be problems elsewhere in the code.

  10. #10
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
:S I dunno... it takes about 10x as long for me to use 20 000 000 fwrites with one number each as opposed to 2000 fwrites with 10 000 numbers each.

I don't completely understand why, but I was just testing it roughly.

  11. #11
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by chico1st View Post
    :S i duno... it takes about 10x as long for me to use 20 000 000 fwrites with one number as opposed to 2000 fwrites with 10 000 numbers
    That's probably due to cache considerations, not anything intrinsically wrong with fwrite(). I'd be curious to see a profile...

  12. #12
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by brewbuck View Post
    It doesn't matter much. The entire point of a buffered I/O layer like stdio is to make it efficient to do small reads and writes. The library buffering does take time, but not nearly as much as if you'd called the operating system directly a bunch of times with a bunch of small chunks of data.

    Suppose you wanted to write 1000 objects at a time instead of 1. So you write some code which buffers 1000 objects in a buffer and then calls fwrite() once. But what you've done is basically duplicated what fwrite() would have done anyway.

    Trust the stdio buffering. Do your writes however is most convenient. If it's still too slow, there may be problems elsewhere in the code.
It is indeed buffered, but in this case we're talking about writing two bytes (1 short integer) versus writing a large number of bytes. The time it takes to put 2 bytes into the buffer is most likely very small compared to the time it takes to push 4 parameters onto the stack [fwrite(buffer, size, count, f) is four parameters] and call the function, along with whatever goes on inside fwrite itself.

    --
    Mats

  13. #13
    Registered User
    Join Date
    Jul 2006
    Posts
    162
    Code:
    for(;;)
    gross!

  14. #14
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by simpleid View Post
    Code:
    for(;;)
    gross!
    That's how I always write an unconditional loop. It's typical.

  15. #15
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by simpleid View Post
    Code:
    for(;;)
    gross!
And how do you propose making "infinite" loops?


Yes, I learnt that a long time ago, for the "forever" loop of a process in a real-time OS (a process in this RTOS isn't meant to end itself; in fact, if it ever returns, it throws an error about "process ended unexpectedly"), and whenever I want to do something "forever", I use that. It's no longer to write than "while(1)" and shorter than "do ... while(1);".

    Of course, I could have written:

    Code:
    ....
    a:
    ...
    goto a;
It would have been slightly longer, and clearly not "good style".

    --
    Mats
