-
Floating Point Underflow
Greetings.
I am trying to write a little program to demonstrate floating point underflow, and things aren't going as expected. First of all, my understanding of floating point underflow is as follows...
Start with the smallest possible floating point number, and divide it by 10. The exponent is already as small as possible, so the result should move the decimal point over one place.
Here is the program I wrote.
Code:
#include <stdio.h>
#include <float.h>
int main(void)
{
  double x, y;

  x = DBL_MIN;
  y = x / 10;
  printf("%e %e\n", x, y);
  return 0;
}
The output is 2.225074e-308 2.225074e-309
Obviously the result is not what I expected. I don't understand how a double can have -309 as the exponent; I thought the minimum exponent was -308.
-
Because DBL_MIN isn't the smallest possible value for a double. DBL_MIN is the smallest normalized value.
2.225074e-309 is indeed an underflow (what IEEE 754 calls "gradual underflow"), because it is a subnormal double value (where the exponent field E == 0). A small change to your program shows this:
Code:
#include <stdio.h>
#include <float.h>
// Bit layout of an IEEE 754 binary64 value. The field order relies on
// GCC/Clang bit-field allocation on a little-endian target such as x86-64.
struct fp_s {
  unsigned long long m:52;  // fraction (mantissa) field
  unsigned int e:11;        // biased exponent field
  unsigned int s:1;         // sign bit
} __attribute__((packed));

// A union gives defined access to the bits; casting a double * to a
// struct fp_s * breaks the strict-aliasing rule and can misbehave at -O2.
union fp_u {
  double d;
  struct fp_s f;
};

int main( void )
{
  union fp_u x, y;

  x.d = DBL_MIN;
  y.d = x.d / 10;

  // Remember: the structure for normalized values (0 < E < 2047) is:
  //   v = (-1)^S * ( 1 + M/2^52 ) * 2^(E-1023)
  // but for subnormal values (E=0) it is:
  //   v = (-1)^S * ( M/2^52 ) * 2^-1022
  printf( "%e (M=%#llx, E=%u, S=%u) %e (M=%#llx, E=%u, S=%u)\n",
          x.d, (unsigned long long)x.f.m, (unsigned int)x.f.e, (unsigned int)x.f.s,
          y.d, (unsigned long long)y.f.m, (unsigned int)y.f.e, (unsigned int)y.f.s );

  // THIS is the smallest subnormal value (M=1, E=0) possible for a double.
  // 0.0 also has E=0, but with M=0 it is a zero, not a subnormal.
  x.f.m = 1;
  x.f.e = 0;
  x.f.s = 0;
  printf( "%e\n", x.d );

  return 0;
}
Compiling and running:
Code:
$ cc -O2 -o test test.c
$ ./test
2.225074e-308 (M=0, E=1, S=0) 2.225074e-309 (M=0x199999999999a, E=0, S=0)
4.940656e-324