Thread: Storing a float in 16 bits

  1. #1
    Registered User
    Join Date
    Jul 2007
    Posts
    3

    Storing a float in 16 bits

    I need to store 10 million+ floats in memory for a program I'm writing, so I wanted to be able to store them in 2 bytes instead of 4. I was wondering if there is a good way to truncate a float down to 2 bytes and if needed go up from the 2 byte representation to a float.

    Thanks!

  2. #2
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    It boils down to range and precision.
    Can you maintain the range and precision you want in 16 bits?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Jul 2007
    Posts
    3
    Quote Originally Posted by Salem
    It boils down to range and precision.
    Can you maintain the range and precision you want in 16 bits?
    Yeah, I'm willing to sacrifice precision, and in terms of range, the values I want to store only vary from -100 to +100, so range is not a big issue

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    There are some standard 16-bit floating point formats, but they are pretty limited in usability - graphics processors sometimes use them to store "floating point pixels".

    What range are your numbers? Would it be suitable to store them in a 12-bit signed integer and 4-bit fraction part?

    The following does that. Note that you lose quite a bit of precision this way, but you can't have both a compact format and a lot of precision.

    Code:
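    #include <stdio.h>
    #include <math.h>
    
    /* Convert a float to 12.4 fixed point (12 integer bits, 4 fraction bits). */
    short ftofix16(float num) {
      if (fabs(num) > 2047.999f) {
        printf("Error: number out of range (num=%f)\n", num);
        return 0;  /* avoid converting an out-of-range value */
      }
      /* Scale the whole value by 16 and truncate, so that negative
         numbers end up in ordinary two's complement form. */
      return (short)(num * 16);
    }
    
    /* Convert 12.4 fixed point back to a float. */
    float fix16tof(int n)
    {
      /* Assumes >> on a negative int is an arithmetic shift (true on
         common compilers); n & 15 is then the non-negative fraction. */
      return (float)(n >> 4) + ((n & 15) / 16.0f);
    }
    
    int main(void) {
      float f, g, h;
      short a, b, c;
      /* Stop on end of input or on malformed input. */
      while (scanf("%f %f %f", &f, &g, &h) == 3) {
        a = ftofix16(f);
        b = ftofix16(g);
        c = ftofix16(h);
        printf("%04hx, %04hx, %04hx\n",
               (unsigned short)a, (unsigned short)b, (unsigned short)c);
        printf("%f, %f, %f\n", fix16tof(a), fix16tof(b), fix16tof(c));
      }
      return 0;
    }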

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I wrote up the code shown above whilst replying, so I didn't know the range. With values only between -100 and +100, you could go for "8.8" (8 integer bits, 8 fraction bits). The code would need to change from shifting by 4 to shifting by 8, and from & 15 to & 255. The multiplication by 16 should become multiplication by 256. Otherwise it's the same idea.
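
    For reference, the 8.8 variant described above might look roughly like the sketch below. The names float_to_fix88 and fix88_to_float are made up for illustration, and clamping out-of-range input (rather than reporting an error) is an assumption of this sketch, not something the thread specifies. NaN input is not handled.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical helper: float -> 8.8 fixed point (value stored as x * 256). */
    int16_t float_to_fix88(float x)
    {
        /* Clamp to the representable range [-128, 127 + 255/256]. */
        if (x < -128.0f) x = -128.0f;
        if (x > 127.99609375f) x = 127.99609375f;
        /* Scale by 256 and truncate toward zero. */
        return (int16_t)(x * 256.0f);
    }

    /* Hypothetical helper: 8.8 fixed point -> float. */
    float fix88_to_float(int16_t n)
    {
        return n / 256.0f;
    }

    int main(void)
    {
        float samples[] = { -100.0f, -0.5f, 0.0f, 3.14159f, 100.0f };
        for (int k = 0; k < 5; k++) {
            int16_t q = float_to_fix88(samples[k]);
            float back = fix88_to_float(q);
            float err = back - samples[k];
            if (err < 0) err = -err;
            /* Truncation loses less than one step of 1/256. */
            assert(err < 1.0f / 256.0f);
            printf("%10.5f -> 0x%04x -> %10.5f\n",
                   samples[k], (unsigned)(uint16_t)q, back);
        }
        return 0;
    }
    ```

    Scaling the whole value, instead of packing the integer and fraction parts separately, means negative numbers need no special handling.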

    --
    Mats

  6. #6
    Registered User
    Join Date
    Jul 2007
    Posts
    3
    Thanks mats! That was really helpful.

  7. #7
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by kara3434
    Yeah, I'm willing to sacrifice precision, and in terms of range, the values I want to store only vary from -100 to +100, so range is not a big issue
    Then use an 8.8 fixed point representation. That gives the whole part a range of -128 to 127, with steps of 1/256 in the fractional part.

  8. #8
    Algorithm Dissector iMalc
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    If you would rather have a 16-bit float than a 16-bit fixed point value, then check out my website (link in sig). Go to the Useful classes page. You should find Shortfloat there.
    It's C++, though you can break it up into lots of little C functions instead.
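
    To give a rough idea of what a 16-bit float conversion involves in plain C, here is a minimal sketch. It is not the Shortfloat class mentioned above. It assumes the IEEE 754 half-precision layout (1 sign bit, 5 exponent bits, 10 mantissa bits) and a 32-bit IEEE 754 float. The names float_to_half and half_to_float are made up for illustration. To stay short, it truncates instead of rounding and flushes very small values (half-precision subnormals) to zero.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper: 32-bit float -> 16-bit half (truncating). */
    uint16_t float_to_half(float f)
    {
        uint32_t x;
        memcpy(&x, &f, sizeof x);              /* read the float's bits */
        uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
        uint32_t fexp = (x >> 23) & 0xFFu;     /* float exponent, bias 127 */
        uint32_t mant = x & 0x7FFFFFu;
        int32_t  hexp = (int32_t)fexp - 127 + 15;  /* rebias to 15 */

        if (fexp == 0xFFu)                     /* infinity or NaN */
            return (uint16_t)(sign | 0x7C00u | (mant ? 0x200u : 0u));
        if (hexp >= 31)                        /* too large: infinity */
            return (uint16_t)(sign | 0x7C00u);
        if (hexp <= 0)                         /* too small: signed zero */
            return sign;
        /* Keep the top 10 mantissa bits, dropping the low 13. */
        return (uint16_t)(sign | ((uint32_t)hexp << 10) | (mant >> 13));
    }

    /* Hypothetical helper: 16-bit half -> 32-bit float. */
    float half_to_float(uint16_t h)
    {
        uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
        uint32_t hexp = (h >> 10) & 0x1Fu;
        uint32_t mant = h & 0x3FFu;
        uint32_t x;
        float f;

        if (hexp == 0)                         /* zero (subnormals ignored) */
            x = sign;
        else if (hexp == 31)                   /* infinity or NaN */
            x = sign | 0x7F800000u | (mant << 13);
        else
            x = sign | ((hexp - 15 + 127) << 23) | (mant << 13);
        memcpy(&f, &x, sizeof f);
        return f;
    }

    int main(void)
    {
        float samples[] = { -100.0f, -0.1f, 0.0f, 1.0f, 99.97f };
        for (int k = 0; k < 5; k++) {
            uint16_t h = float_to_half(samples[k]);
            printf("%9.4f -> 0x%04x -> %9.4f\n",
                   samples[k], (unsigned)h, half_to_float(h));
        }
        assert(half_to_float(float_to_half(100.0f)) == 100.0f);
        return 0;
    }
    ```

    With only 10 mantissa bits, values between 64 and 128 are spaced 1/16 apart in this format, which is coarser than the 1/256 steps of 8.8 fixed point. Values near zero, on the other hand, get much finer steps.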
    Last edited by iMalc; 07-31-2007 at 01:37 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"
