# Thread: Help Dealing with Numbers in Double Type

1. ## Help Dealing with Numbers in Double Type

Hi,

For some reason I am having trouble manipulating large double numbers in C. For example, with the following code:

Code:
```c
#include <stdio.h>
#include <math.h>

int main(void) {
    double x;

    x = pow(2, 100);            /* 2^100 */
    printf("%f\n", x);

    x = x + 111111111111.00;    /* try to add a comparatively small value */
    printf("%f\n", x);

    return 0;
}
```
The output is:

1267650600228229401496703205376.000000
1267650600228229401496703205376.000000

Which means "x" is not changing at all. A similar problem happens when I try to divide "x" by 10. The division is performed (and the result is one digit shorter), but the remaining digits change as well, so I lose the digits of the original number.

Does anyone know what I am doing wrong?

Thanks

2. You are incorrectly assuming that the double type can hold arbitrarily large values with infinite precision.

Imagine what would happen if you tried to represent a value of 15.5 in an integer. The value you would get would be 15 (assuming rounding down) or 16 (rounding up or nearest). Either way, adding 0.2 to that would give the same result (15 or 16) - i.e. adding 0.2 has the same effect as adding zero. If you multiply by 10, you will get 150 or 160 (not 155).

It is roughly the same phenomenon with a double, except that the difference between two adjacent large values that can both be represented in a double is also quite large. For large values like you are playing with, the difference between two consecutive values that can be represented in a double (for your compiler) comfortably exceeds 111111111111.00.

3. Ahh, I think I get it now. In fact I used the frexp function to see how the computer was storing that number. Here's the code:

Code:
```c
#include <stdio.h>
#include <math.h>

int main(void) {
    double mantX;
    int expX;

    double x = 1267650600228229401496703205376.000000;

    /* frexp splits x into a mantissa in [0.5, 1) and a power-of-two exponent */
    mantX = frexp(x, &expX);

    printf("Number %f is represented as %f x 2 ^ %d\n", x, mantX, expX);

    return 0;
}
```
It turns out that the computer stores that number as 0.500000 x 2 ^ 101. So I calculated the next possible number, which is 0.500001 x 2 ^ 101, and it is:

1267653135529429857955506198782.410752

So making the subtraction:

1267653135529429857955506198782.410752 (next number)

-

1267650600228229401496703205376.000000 (original number)

=

2535301200456458802993406.410752 (minimum difference to produce a change)

Which means that the difference between two consecutive values at that scale is that last, pretty big number.

Am I correct?

4. Actually I think my reasoning was right, but I am not sure about the numbers.

Testing manually I found that the minimum number that will produce a change on the original double (1267650600228229401496703205376.000000) is 1111111111111111.000000.

So the next number possible for the compiler to represent is 1267650600228230527396610048000.000000.

BUT, when I use the frexp function on both of these numbers it turns out that their representations are the same: 0.500000 x 2 ^ 101, which puzzles me (i.e., it would be easier to understand if the numbers I stated in my previous post were the right ones).

5. If all you're displaying is "0.500000" and from that you are inferring the next number is "0.500001", then you are mistakenly assuming six decimal places is the resolution limit. In fact the type double is accurate to nearly 16 decimal digits.
See the Wikipedia article on IEEE 754-2008.

There will also be round-off error when converting the internal binary value to the displayed base-10 representation and back again. If you are playing at the limits of floating point resolution, some of what you discover will be subject to those very same errors. It is like trying to view molecule-sized objects with a microscope that uses light whose wavelength is an elephant by comparison.

To start with, you should display a lot more than six decimal places.

6. Yeah I had forgotten about the 15 decimal digits of precision on doubles.

So I guess the numbers are these:

Original number is 1267650600228229401496703205376.000000, which is represented as 0.500000 x 2 ^ 101.

Next number then should be 0.5000000000000001 x 2 ^ 101, which is 1267650600228229655026823251021.880299341.

So

1267650600228229655026823251021.880299341

-

1267650600228229401496703205376.000000

=

253530120045645.880299341

That last number should now be very close to the gap between two consecutive doubles of that size.

Am I closer now?

7. And another way to calculate the gap between the numbers would be:

0.0000000000000001 (the minimum variation on the mantissa) * 2 ^ 101 = 253530120045645.880299341

Then we are able to calculate the minimum gap for numbers of different sizes. For example, if a given number is using an exponent of 50, then the minimum gap on that range would be:

0.0000000000000001 * 2 ^ 50 = 0.112589991

Please correct me if I am wrong.

8. I would say so. But the number of decimal digits is approximately 15.95. So try 0.50000000000000005. But as I said, we are relying on a slick decimal-to-binary converter... hopefully it will still detect such a small change. It is difficult to make a change in the 15.95th place when the resolution offered by the base-10 representation is in whole digits.

It would make more sense to review the floating point format and mathematically infer the minimum gap using sound math rather than relying on binary-to-decimal and decimal-to-binary algorithms. Though I am fairly sure those have been perfected to the best of their ability.

9. Agreed. I'll work on the mathematical approach now.

Thanks a lot.