# Thread: Difference between float and double?

1. ## Difference between float and double?

I have an assignment to build a quadratic equation roots calculator, and I was wondering what the difference is between float and double.

For example

int a = 1, b = 1, c = 1;
float x1, x2, d;

d = (float)b*b - 4*a*c;

Or:

int a = 1, b = 1, c = 1;
double x1, x2, d;
d = b*b - 4*a*c;

What is the difference between them?

2. A double is typically double the size of a float.

A float is a "floating point number" (vs. fixed point, e.g. decimal or long long): a variable used to represent decimals/fractions.

A double is sort of like long long vs. long, although, being floating point, its extra bits also give it more precision.

EDIT:
That's double the size in terms of bit representation, not double in the sense of holding values twice as large.

So if a float is 32 bits (I think it is, off-hand, but I'm not positive), a double would be 64 bits.
It is larger regardless of the processor bus size (you can use 64-bit doubles with a 32-bit OS/compiler). Use a float unless you need extra precision or a very large decimal value.

3. Welcome to the forum, Winten!

4. Only guaranteed difference is the precision: float is accurate to at least 6 significant digits, and double is accurate to at least 10. That's it.

Everything else is up to the implementation.

5. Originally Posted by msh
Only guaranteed difference is the precision: float is accurate to at least 6 significant digits, and double is accurate to at least 10. That's it.

Everything else is up to the implementation.
Ah, I didn't know that was all that was guaranteed. Is this from the C99 spec?

6. C Primer Plus, 5th ed. claims that this is according to C99. I assume it's correct, and that someone smarter than me will correct it if it's not.

7. Originally Posted by msh
C Primer Plus, 5th ed. claims that this is according to C99. I assume it's correct, and that someone smarter than me will correct it if it's not.
Yeah, I doubt that it's wrong, although I've seen typos in books, especially C and C++ books where the implementation is pretty rigid (as opposed to, say, Perl).

I looked at the C99 draft spec; it says that
- Both float and double must have a maximum finite value greater than or equal to 1E+37 (FLT_MAX, DBL_MAX).
- Both float and double must have a minimum normalized positive value less than or equal to 1E-37 (FLT_MIN, DBL_MIN).
- A float must provide 6 or more decimal digits (FLT_DIG).
- A double must provide 10 or more decimal digits (DBL_DIG).

That's the way I'm reading the C99 draft spec, I could be reading it wrong. Though the way I read it, it's speaking of overall digits, not necessarily just significant digits. As it's a floating point, I believe those can be 1dig.9dig or 9dig.1dig, or anywhere in between.

That's just the way I'm reading it, though I could be wrong.

8. Originally Posted by Syndacate
That's the way I'm reading the C99 draft spec, I could be reading it wrong. Though the way I read it, it's speaking of overall digits, not necessarily just significant digits. As it's a floating point, I believe those can be 1dig.9dig or 9dig.1dig, or anywhere in between.
That's exactly what's meant by significant digits, at least from my experience.

9. Originally Posted by msh
That's exactly what's meant by significant digits, at least from my experience.
Ah, yes, that didn't click earlier when I read it.

The max/min limits are still in the spec, though that's probably just a side effect of taking the number of significant digits the spec requires and setting them all to their highest value.

10. I would go by the IEEE standard for floating point representation.
Hardware has implemented floating-point math support for decades now... and it's unlikely there'd be oddball compilers and platforms that would deviate from these standards.

IEEE 754-2008 - Wikipedia, the free encyclopedia

Single precision gives about 7 decimal digits of accuracy.
Double gives almost 16. ("At least 10" is true by definition.)

11. Originally Posted by msh
Only guaranteed difference is the precision: float is accurate to at least 6 significant digits, and double is accurate to at least 10. That's it.

Everything else is up to the implementation.
Is there a way to make it go out to 20 instead of just 10?

12. One ten-billionth is not enough accuracy for you? (That's the minimum the Standard specifies for doubles.) I'd use doubles, not floats.

Never heard of any way to increase that accuracy.

13. No, I found it, just using %.20f in the printf.

14. Originally Posted by shouse
No, I found it, just using %.20f in the printf.
That has nothing to do with the accuracy of your floating point calculations. Try this:
Code:
`printf("%.20f\n", 1.0 / 3.0);`
If your hypothesis is correct (i.e. that %.20f gives 20 digits precision), you would get 0.333 repeating. Odds are you won't (although it does depend on the accuracy of your doubles).

You can try using a long double, which has been part of standard C since C89; it's not guaranteed to have greater accuracy than double, but it well might.

15. Originally Posted by shouse
Is there a way to make it go out to 20 instead of just 10?
Sure, I have a library that makes that possible, in C++ that is.
I call my 128-bit version a 'quad'.