Thread: Floating Point Underflow

  1. #1
    Registered User
    Join Date
    Oct 2019
    Posts
    1

    Floating Point Underflow

    Greetings.

    I am trying to write a little program to demonstrate floating point underflow, and things aren't going as expected. First of all, my understanding of floating point underflow is as follows...

    Start with the smallest possible floating point number, and divide it by 10. The exponent is already as small as possible, so the result should move the decimal point over one place.

    Here is the program I wrote.

    Code:
    #include <stdio.h>
    #include <float.h>
    
    int main(void)
    {
        double x, y;
        
        x = DBL_MIN;
        y = x / 10;
        
        printf("%e %e\n", x, y);
        
        return 0;
    }
    The output is 2.225074e-308 2.225074e-309

    Obviously, the result is not what I expected. I don't understand how a double can have -309 as its exponent. I thought the minimum exponent was -308.

  2. #2
    Registered User
    Join Date
    Feb 2019
    Posts
    591
    Because DBL_MIN isn't the smallest positive value a double can hold. DBL_MIN is the smallest positive normalized value.

    2.225074e-309 is indeed an underflow (what IEEE 754 calls "gradual underflow"), because it is a subnormal double value (biased exponent E == 0 with a nonzero fraction). A small change to your program shows this:
    Code:
    #include <stdio.h>
    #include <float.h>
    
    // Bit layout of an IEEE 754 double. This assumes a little-endian target
    // and GCC/Clang bit-field ordering (lowest bits first); it is not portable.
    struct fp_s {
      unsigned long long m:52;   // fraction (mantissa) bits
      unsigned int e:11;         // biased exponent
      unsigned int s:1;          // sign
    } __attribute__((packed));
    
    // Reading the bits through a union avoids the strict-aliasing violation
    // of casting a double * to a struct fp_s *.
    union fp_u {
      double d;
      struct fp_s f;
    };
    
    int main( void )
    {
      union fp_u x, y;
    
      x.d = DBL_MIN;
      y.d = x.d / 10;
    
      // Remember: the structure for normalized values (0 < E < 2047) is:
      //    v = (-1)^S * ( 1 + M/2^52 ) * 2^(E-1023)
      // but for subnormal values (E=0) it is:
      //    v = (-1)^S * ( M/2^52 ) * 2^-1022
      //
      printf( "%e (M=%#llx, E=%u, S=%u) %e (M=%#llx, E=%u, S=%u)\n",
         x.d, (unsigned long long)x.f.m, (unsigned int)x.f.e, (unsigned int)x.f.s,
         y.d, (unsigned long long)y.f.m, (unsigned int)y.f.e, (unsigned int)y.f.s );
    
      // THIS is the smallest positive (nonzero) subnormal value for a double.
      // 0.0 uses the same E=0 encoding with M=0, although IEEE 754 classifies
      // it as zero rather than as a subnormal.
      x.f.m=1;
      x.f.e=0;
      x.f.s=0;
      printf( "%e\n", x.d );
    
      return 0;
    }
    Compiling and running:
    Code:
    $ cc -O2 -o test test.c
    $ ./test
    2.225074e-308 (M=0, E=1, S=0) 2.225074e-309 (M=0x199999999999a, E=0, S=0)
    4.940656e-324
    Last edited by flp1969; 10-30-2019 at 06:50 PM.
