# Thread: floating point number comparison

1. ## floating point number comparison

For the case of yshow = 0.2000000000, when the code goes through the snippet below, the code inside the if statement is executed, which shouldn't happen in this case. Do you happen to know why this may occur? Thanks.

Code:
```
if (yshow < 0.0 || yshow > 0.2)
{
    content.Format("Please enter a number within limit of %3.1f to %3.1f cm",
                   0.0, 5.0);
    MessageBox(content, "Superior/Inferior Edit Box");
    overlimit = true;
}
```

2. If you search here, on Google, or elsewhere on the general subject of floating point precision, you'll find out why this happens and what to do about it.

Floating point values aren't always represented exactly. A decimal fraction such as 0.2 has no finite binary expansion, so what gets stored is only the nearest representable value. Some people seem to find it hard to get this firmly in their head. I see resistance to it rather frequently; a kind of frustration as if it simply can't be, but it's true.

Some values are exact, like zero itself, and so are values such as 1.0 or 0.5 that are small sums of powers of two. Your code may also behave differently depending on the settings of the floating point unit (look into _control87 and the related C runtime functions).

Most floating point comparisons should be made with a range of inexactness in mind. You'll find that many professional programmers start with a global constant, named something along the lines of Zero Range. Typical values might be 0.000006 or less, the idea being that after certain operations, a value that ought to be zero might not quite be zero, but would be less than zero range. Further, either that or another value, sometimes a constant named tolerance, is used to indicate the 'fudge' factor - the difference between your expected value and the actual value.

3. Thanks. I added the fudge factor and it works fine now.

4. I would not use a fudge factor when doing inequality comparisons (as you're doing here). There really isn't a problem. What you have is a value that is slightly larger than 0.2, but not by enough to show when you print it. However, the value really is greater, and should be treated as greater.

The fudge factor is a hack to let you perform somewhat reasonable equality comparisons on floating point values. Like any other hack, the real solution is to redesign the code so that it does not depend on comparing floats for equality.

5. Would rounding yshow to the nearest integer and then using the integer to compare in the if statement be a better design?

6. Originally Posted by stanlvw
Would rounding yshow to the nearest integer and then using the integer to compare in the if statement be a better design?
I think the design is fine as-is. This is just a matter of precision when displaying the value. It really IS larger than 0.2.

Inequality comparisons on floats are generally fine.

7. Edit:

I personally think that, in terms of the usability of your program, using the small epsilon value comparison is appropriate. If the user is testing the program and wants to pass 0.2 to the program, they should be able to do that without having to think to themselves 'oh yeah, floating point numbers, blah blah blah'.

It's good to educate yourself, because there's lots of documentation. I liked learning about the representation of floating point numbers recently: at large values, the representation can't hold very accurate values. For example, for values around 2^23 the smallest step between adjacent float values is 1.0.

Code:
```
#include <stdio.h>

// Assumes a 32-bit IEEE 754 float and bit-fields allocated from the
// least significant bit upward.
union number {

    // bitwise representation
    struct float_field
    {
        unsigned int mantissa : 23;
        unsigned int exponent : 8;
        unsigned int sign : 1;
    } field;

    float float_rep;
    int int_rep;
};

int main()
{
    printf("%zu\n", sizeof(number));

    number n;
    n.float_rep = 1.0f;

    printf("%f : sign %x exp %x mantissa %x\n\n",
           n.float_rep, n.field.sign, n.field.exponent, n.field.mantissa);

    // Build 2^23 + 1: stored exponent 0x7f + 23, mantissa 1
    n.field.sign = 0;
    n.field.mantissa = 1;
    n.field.exponent = 0x7f+23;

    // Excess-127 notation on the exponent:
    // stored 0x7f = 2^0, stored 0xfe = 2^127, stored 0x01 = 2^-126
    // (stored 0x00 and 0xff are reserved for zero/denormals and inf/NaN)

    for(int i = 0; i < 10; i++)
    {
        printf("\t%f : sign %x exp %x mantissa %x\n",
               n.float_rep, n.field.sign, n.field.exponent, n.field.mantissa);

        // Each mantissa step changes the value by exactly 1.0 at this magnitude
        n.field.mantissa++;
    }

    return 0;
}
```
Code:
```4
1.000000 : sign 0 exp 7f mantissa 0

8388609.000000 : sign 0 exp 96 mantissa 1
8388610.000000 : sign 0 exp 96 mantissa 2
8388611.000000 : sign 0 exp 96 mantissa 3
8388612.000000 : sign 0 exp 96 mantissa 4
8388613.000000 : sign 0 exp 96 mantissa 5
8388614.000000 : sign 0 exp 96 mantissa 6
8388615.000000 : sign 0 exp 96 mantissa 7
8388616.000000 : sign 0 exp 96 mantissa 8
8388617.000000 : sign 0 exp 96 mantissa 9
8388618.000000 : sign 0 exp 96 mantissa a```

8. Originally Posted by Tonto
I personally think that, in terms of the usability of your program, using the small epsilon value comparison is appropriate. If the user is testing the program and wants to pass 0.2 to the program, they should be able to do that without having to think to themselves 'oh yeah, floating point numbers, blah blah blah'.
I disagree. Using the "small epsilon value" comparison (essentially adding a fudge factor) simply changes the set of values that exhibit behaviours the user might not expect. It will pass for test cases you anticipate, but fail for others. Adding a small fudge factor can also cause values to pass (eg 0.2001 will pass if the fudge factor exceeds 10^-4) and such passes would be equally annoying to a user who doesn't expect them to happen.

The point is to implement a program that functions correctly within limits of how it's built. If your users care so much about such things, then the real solution is to avoid using floating point at all rather than fudging comparison operations or the method of printing them out. For example, don't use a floating point value to represent dollars and cents: use two variables of suitable integral type(s).

Also, if the program uses floating point for good reasons (eg it specifically performs some form of numerical analysis) then there is a case to argue that the user should be aware of the implications associated with floating point.

Originally Posted by Tonto
It's good to educate yourself, because there's lots of documentation. I liked learning about the representation of floating point numbers recently: at large values, the representation can't hold very accurate values. For example, for values around 2^23 the smallest step between adjacent float values is 1.0.
You need to widen your self-education slightly. Floating point representations are not required to exhibit that property, and in the real world not all float representations do.

The code you gave (which I haven't quoted) exhibits implementation-defined behaviour - precisely because it relies on a specific method of representing the float type. It is not guaranteed to work with compilers, operating systems, or hardware different from yours.

9. One fairly common solution used in C++ is to deploy a class representing a fixed point data type, often made of two short integers (sometimes known as 16.16) - a quantity made from an integer and a fractional part, rounded to a fixed point.

Classes representing money are typical, too.

There are speed/accuracy trade-offs; the theme of several posts here indicates it's a substantial branch of study, and application-level considerations must be part of your choice.

Kudos to grumpy for pointing out that this example highlights an implementation-specific characteristic, which programmers should learn to avoid.

10. Originally Posted by brewbuck
I think the design is fine as-is. This is just a matter of precision when displaying the value. It really IS larger than 0.2.

Inequality comparisons on floats are generally fine.
I agree with brewbuck; the comparison is fine as-is. If you go adding epsilons in there, then you just end up mishandling certain values that lie even further outside the limit.
You could swap > for >= for example, but other than that it's fine.