Floating Point Underflow

**fredh** · 10-30-2019

Greetings.

I am trying to write a little program to demonstrate floating point underflow, and things aren't going as expected. First of all, my understanding of floating point underflow is as follows...

Start with the smallest possible floating point number, and divide it by 10. The exponent is already as small as possible, so the result should move the decimal point over one place.

Here is the program I wrote.

Code:

#include <stdio.h>
#include <float.h>

int main(void)
{
    double x, y;
    
    x = DBL_MIN;
    y = x / 10;
    
    printf("%e %e\n", x, y);
    
    return 0;
}

The output is 2.225074e-308 2.225074e-309

Obviously the result is not what I expected. I don't understand how you can have a double with -309 as the exponent. I thought the minimum exponent was -308.

**flp1969** · 10-30-2019

Because DBL_MIN isn't the smallest positive value possible for double. DBL_MIN is the smallest positive normalized value.

2.225074e-309 is indeed an underflow (what IEEE 754 calls "gradual underflow"), because it is a subnormal double value (one where E == 0). A little change in your program can show this:

Code:

#include <stdio.h>
#include <float.h>

// Field layout of an IEEE 754 double. Bit-field order is implementation-defined;
// this matches GCC on little-endian targets such as x86-64.
struct fp_s {
  unsigned long long m:52;
  unsigned int e:11;
  unsigned int s:1;
} __attribute__((packed));

// Reading the fields through a union is allowed in C. Casting a double *
// to a struct pointer would break the strict aliasing rule instead.
union fp_u {
  double d;
  struct fp_s f;
};

int main( void )
{
  union fp_u x, y;

  x.d = DBL_MIN;
  y.d = x.d / 10;

  // Remember: the structure for normalized values (0 < E < 2047) is:
  //    v = (-1)^S * ( 1 + M/2^52 ) * 2^(E-1023)
  // but for subnormal values (E=0) is:
  //    v = (-1)^S * ( M/2^52 ) * 2^-1022
  //
  printf( "%e (M=%#llx, E=%u, S=%u) %e (M=%#llx, E=%u, S=%u)\n",
     x.d, (unsigned long long)x.f.m, (unsigned int)x.f.e, (unsigned int)x.f.s,
     y.d, (unsigned long long)y.f.m, (unsigned int)y.f.e, (unsigned int)y.f.s );

  // THIS is the smallest positive subnormal value possible for a double (M=1, E=0).
  // 0.0 uses the same E=0 encoding with M=0, but it is classified as zero, not subnormal.
  x.f.m=1;
  x.f.e=0;
  x.f.s=0;
  printf( "%e\n", x.d );

  return 0;
}

Compiling and running:

Code:

$ cc -O2 -o test test.c
$ ./test
2.225074e-308 (M=0, E=1, S=0) 2.225074e-309 (M=0x199999999999a, E=0, S=0)
4.940656e-324
