Thread: efficient binary file read

  1. #1
    Registered User
    Join Date
    Apr 2007
    Posts
    17

    efficient binary file read

    Hello

    I would like to read a binary file containing short (16-bit) ints into an array. However, I would like that
    array to be of type float (32-bit). Presently I am simply reading the file into an int array using fread
    and then doing a loop to assign the short ints to another array that is of type float. The problem is that
    the arrays can be rather large and I would like to be memory efficient by avoiding allocating memory
    for both an int and a float array. Is there a straightforward way to read this file directly and quickly
    into the float array?

    Thanks

  2. #2
    Deathray Engineer MacGyver's Avatar
    Join Date
    Mar 2007
    Posts
    3,210
    1) Have one short int.
    2) Have one float array.
    3) Have a loop where a short int is read, converted to a float and placed inside the float array.
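
    A minimal sketch of those three steps, not a definitive implementation. The function name read_shorts_one_by_one and its parameters are hypothetical, and it assumes the file was written with the same short width and byte order as the machine reading it.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: reads up to max_count shorts from fp, one at a
 * time, and stores each as a float in dst. Returns how many were stored.
 * Assumes the file's short width and byte order match the host. */
size_t read_shorts_one_by_one(FILE *fp, float *dst, size_t max_count)
{
    short value;          /* step 1: one short int */
    size_t i = 0;         /* dst is the one float array (step 2) */

    /* step 3: read a short, convert it, place it in the float array */
    while (i < max_count && fread(&value, sizeof value, 1, fp) == 1) {
        dst[i++] = value; /* plain assignment converts short to float */
    }
    return i;
}
```

    Only one short lives outside the float array at any time, so the extra memory cost is a single variable.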

  3. #3
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    In fact, 3 should be a simple assignment. Just store it directly into the float array.

  4. #4
    Deathray Engineer MacGyver's Avatar
    Join Date
    Mar 2007
    Posts
    3,210
    You cannot read a 16-bit int directly as a float. You can't even read a 32-bit int directly as a 32-bit float. Floats in binary are usually represented in some very different fashion than ints.
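
    To illustrate the difference, here is a small sketch contrasting a value conversion with a raw reinterpretation of the same bits. The function names are made up for this example, and it assumes float is a 32-bit IEEE 754 type, which is common but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value conversion: the integer 1 becomes the float 1.0f. */
float value_as_float(int32_t value)
{
    return (float)value;
}

/* Bit reinterpretation: the same 32 bits are simply relabelled as a
 * float, which is what reading an int "directly as a float" amounts to.
 * Assumes float is 32 bits wide (checked at run time below). */
float bits_as_float(int32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);  /* copy the raw bytes, no conversion */
    return f;
}
```

    Under those assumptions, bits_as_float(1) is a tiny denormal number rather than 1.0f, while the bit pattern 0x3F800000 is the one that reads back as 1.0f.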

  5. #5
    Registered User
    Join Date
    Apr 2007
    Posts
    17
    Thanks for the replies

    But these files are big. Reading an int at a time and making the assignment to the elements of a float array will take a lot more time than using fread to read the big block of data. Won't it?

  6. #6
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    Are you allowed to rewrite the file? Because if reading it that way is a problem, then you should store floating point numbers in the file as well. Then you can just read them into memory at once.

    If not, then what MacGyver suggested seems to be as good as it's going to get.
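
    A rough sketch of the "rewrite the file" option, under the assumption that you are allowed to do a one-time conversion. Both function names are hypothetical, and the resulting float file is only readable on machines with the same float format and byte order.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical one-time conversion: copies every short in `in` to `out`
 * as a float. Returns the number of values written. */
size_t convert_shorts_to_floats(FILE *in, FILE *out)
{
    short s;
    size_t n = 0;

    while (fread(&s, sizeof s, 1, in) == 1) {
        float f = s;                       /* value conversion */
        if (fwrite(&f, sizeof f, 1, out) != 1)
            break;                         /* stop on a write error */
        n++;
    }
    return n;
}

/* Once the file holds floats, a single fread fills the array directly. */
size_t read_floats(FILE *fp, float *dst, size_t count)
{
    return fread(dst, sizeof dst[0], count, fp);
}
```

    The trade-off is disk space: with 16-bit shorts and 32-bit floats, the converted file is twice the size of the original.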

  7. #7
    Registered User
    Join Date
    Oct 2001
    Posts
    2,934
    Quote Originally Posted by Marv View Post
    The problem is that the arrays can be rather large and I would like to be memory efficient by avoiding allocating memory for the both an int and a float array.
    What you could do is read a block of ints at a time. That way your int array doesn't have to be so large.

  8. #8
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    #define SIZE ( BUFSIZ / sizeof(short int) )

    short int inputBuffer[ SIZE ];
    size_t n;

    while ( (n=fread(inputBuffer, sizeof(short int), SIZE, fp )) > 0 )

    Now convert 'n' shorts into floats
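
    Filling in the conversion step, the whole loop might look like the sketch below. The function name, the CHUNK macro and the max_count parameter are assumptions added for illustration; as before, the file is assumed to match the host's short width and byte order.

```c
#include <stdio.h>
#include <stddef.h>

/* Number of shorts that fit in one stdio-sized buffer. */
#define CHUNK (BUFSIZ / sizeof(short))

/* Hypothetical helper: reads up to max_count shorts from fp in blocks of
 * CHUNK and stores them as floats in dst. Returns how many were stored. */
size_t read_shorts_as_floats(FILE *fp, float *dst, size_t max_count)
{
    short buf[CHUNK];     /* small, fixed-size staging buffer */
    size_t total = 0;

    while (total < max_count) {
        size_t want = max_count - total;
        if (want > CHUNK)
            want = CHUNK;

        size_t n = fread(buf, sizeof buf[0], want, fp);
        if (n == 0)
            break;        /* end of file or read error */

        /* convert the n shorts just read into floats */
        for (size_t i = 0; i < n; i++)
            dst[total + i] = buf[i];
        total += n;
    }
    return total;
}
```

    Only the float array grows with the file; the short buffer stays at BUFSIZ bytes no matter how large the input is.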
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  9. #9
    Registered User
    Join Date
    Apr 2007
    Posts
    17
    Those have all been good suggestions. Unfortunately I do not have control (or little control) over how the file is initially written.

    I think I will follow swoopy and Salem's idea of reading a good-sized block of ints with fread and then assigning them to the large float array (rinse and repeat). That is, unless there is some
    other clever way one of you might introduce me to.

  10. #10
    Deathray Engineer MacGyver's Avatar
    Join Date
    Mar 2007
    Posts
    3,210
    What they suggested is the correct way of doing it unless you are under such severe memory restrictions that you can't afford to make the other array (which seems unlikely).
    Last edited by MacGyver; 04-12-2007 at 12:13 AM.

  11. #11
    Registered User
    Join Date
    Apr 2007
    Posts
    17
    Thanks for the help everyone.

  12. #12
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by swoopy View Post
    What you could do is read a block of ints at a time. That way your int array doesn't have to be so large.
    Wow, that's almost like... what stdio does!

  13. #13
    Registered User
    Join Date
    Oct 2007
    Posts
    9
    Quote Originally Posted by MacGyver View Post
    You cannot read a 16-bit int directly as a float. You can't even read a 32-bit int directly as a 32-bit float. Floats in binary are usually represented in some very different fashion than ints.
    Hi, I'd like to ask how floats are represented differently. I've experimented a little, and the resulting binary file looks like this:
    (Opened with binary editor)

    1F 85 AB 3F AE 47 2D 42 CD 0C A6 43 D9 2A 13 45 ...?.G-B...C.*.E

    That was the first line. I want to understand how this works. I more or less understood what happens with ints, but this is somewhat peculiar.

  14. #14
    Chinese pâté foxman's Avatar
    Join Date
    Jul 2007
    Location
    Canada
    Posts
    404
    You might want to take a look at this:
    http://en.wikipedia.org/wiki/IEEE_75...ecision_32_bit

  15. #15
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    There is obviously no way to know whether those numbers are integers, floats or some other data type(s). If we knew that they MAY be floats, it would be possible to search for certain patterns and say "no, this is unlikely to be a float" or "it's likely that this is a sequence of floats", but you can't know for certain.

    I don't see anything indicating your numbers CAN'T be float - but they could just as well be anything else.

    The difference between float and integer types is that floating-point numbers are "split into portions": a standard 32-bit float has 1 sign bit (bit 31), followed by 8 bits of exponent, and the remaining 23 bits represent the mantissa.

    The number is essentially s*1.m*2^e (s = sign = 1 or -1, and e = the stored exponent minus a bias of 127).
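
    As a sketch of that layout, the code below splits a 32-bit pattern into its three fields and rebuilds the value from them. All names are hypothetical. It assumes IEEE 754 single precision and handles only normal numbers (stored exponent 1 to 254), and le_bytes_to_u32 assumes the four bytes were stored least significant byte first.

```c
#include <assert.h>
#include <stdint.h>

struct float_fields {
    unsigned sign;      /* bit 31 */
    unsigned exponent;  /* bits 30..23, stored with a bias of 127 */
    uint32_t mantissa;  /* bits 22..0 */
};

/* Assemble four bytes, assumed to be in little-endian order. */
uint32_t le_bytes_to_u32(const unsigned char b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

struct float_fields split_float_bits(uint32_t bits)
{
    struct float_fields f;
    f.sign = (unsigned)(bits >> 31);
    f.exponent = (unsigned)((bits >> 23) & 0xFFu);
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}

/* s * 1.m * 2^(exponent - 127), for normal numbers only. */
double fields_to_value(struct float_fields f)
{
    assert(f.exponent != 0 && f.exponent != 255);

    double s = f.sign ? -1.0 : 1.0;
    double one_point_m = 1.0 + (double)f.mantissa / 8388608.0; /* 2^23 */
    double scale = 1.0;
    int e = (int)f.exponent - 127;

    for (; e > 0; e--) scale *= 2.0;
    for (; e < 0; e++) scale /= 2.0;
    return s * one_point_m * scale;
}
```

    If the dump in post #13 really is a run of little-endian floats, the first four bytes (1F 85 AB 3F) would give a stored exponent of 127 and a value of about 1.34, but as noted above, nothing in the bytes themselves proves that interpretation.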

    More info here:
    http://en.wikipedia.org/wiki/IEEE_754

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.
