Thread: Bumping against the limits of precision

  1. #1
    Registered User
    Join Date
    Jun 2009
    Posts
    486

    Bumping against the limits of precision

    I have a complex little formula in my code in which variables are subtracted that are very close together. Their difference is many orders of magnitude smaller than the numbers being subtracted. C is evaluating the expression incorrectly.

    The formula is as follows (all variables are doubles)

    Code:
    operand = a*a*(108.0*a*c*c*c + 108.0*b*b*b*d - 27.0*b*b*c*c - 486.0*a*b*c*d + 729.0*a*a*d*d);
    The variables have the following values as calculated by the code

    Code:
    a: 2.0287864134649038e+07
    b: 1.9026877897384940e+04
    c: 4.4610669723446748e+00
    d: -3.7933051584501112e-29
    
    108.0*a*c*c*c: 1.9452539813523886e+11
    108.0*b*b*b*d: -2.8219163158723021e-14
    27.0*b*b*c*c: 1.9452539813523883e+11
    486.0*a*b*c*d: -3.1746558553563405e-14
    729.0*a*a*d*d: 4.3175338098787823e-40
    compared to composite values calculated by hand using a calculator with better precision:

    Code:
    108.0*a*c*c*c: 1.94525398135238912527e11
    108.0*b*b*b*d: -2.82191631587230194283e-14
    27.0*b*b*c*c: 1.94525398135238861191e11
    486.0*a*b*c*d: -3.17465585535634052349e-14
    729.0*a*a*d*d: 4.31753380987878243389e-40
    You can see that there are subtle differences between the values in the code and there, and that the first and third term almost cancel each other out, except for in the last few decimal places, which are not especially reliable.

    The values calculated by the code give a value for operand of 1.2560956774113516e+10

    The correct answer is 2.11297657006282327491e10.


    So: I am bumping up against the limits of hardware precision here, I think. I need a way to reliable subtract numbers that only differ in maybe the 14th decimal place. Is such a things possible?

    I have been playing around with order of operations in the formula, factoring out various terms. The value of operand can be changed dramatically just by changing the order in which things are done.

    I would like to avoid using a higher precision library if at all possible - this will be executed several billion times in the course of a run and using software precision is going to slow things down a lot. Do I have alternatives?

    Thanks
    C is fun

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    What variable type are you using? If you're using float, you can upgrade to double. It still suffers similar precision issues, but with 64-bits, you get a lot better resolution. If doubles don't work, then you'll have to either use a 3rd party library, like GMP, or roll your own. Note that libraries like GMP are usually heavily optimized, so the performance hit might not be as bad as you expect.
    Last edited by anduril462; 04-28-2011 at 10:00 AM.

  3. #3
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    I am using doubles, so the next step would be a 3rd party library I suppose. What a headache floating point subtraction is ^_^
    C is fun

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,665
    Perhaps you could consider rearranging the expression into some equivalent form which avoids the issue?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Instead of absolute comparison between A and B, you need to consider a comparison tolerance or epsilon.
    The error should be some amount less than the magnitude of the numbers.
    If using 'double' don't expect a precision greater than 15.9 decimal places. Due to round-off, shave off a few from that.
    So perhaps 14 place accuracy.
    Pseudo code:
    if min(log10(A), log10(B)) - log10(abs(A - B)) >= 14 then they are "equal"

  6. #6
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    @Salem: I have been trying. As I said, the value of operand can be drastically changed just by reordering the equation and factoring things out. But so far, nothing is particularly reliable.

    @nonoob: That is precisely the problem. Your test will show that the numbers are equal, but they are not, and that tiny (relative to the numbers themselves) difference is all-important to my calculation.
    C is fun

  7. #7
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    The problem you are having is with Numerical Stability.

    In short, adding or subtracting values of sufficiently differing magnitude loses precision.
    It can be solved by putting all the values to add up into an array, sorting the values in the array ascending by their absolute value, and then adding them up starting from the smallest.

    In C++ you could so it like so:
    Code:
    double vals[] = {108.0*a*c*c*c,  108.0*b*b*b*d, -27.0*b*b*c*c, -486.0*a*b*c*d, 729.0*a*a*d*d};
    std::sort(vals, vals+sizeof(vals)/sizeof(vals[0]), myMagnitudeComparator()); //"batteries not included", so to speak
    operand = a*a*std::accumulate(vals, vals+sizeof(vals)/sizeof(vals[0]), 0.0);
    (Note that I'm assuming that std::accumulate adds the values up from first to last)
    If you have trouble translating that to C, let me know.
    Last edited by iMalc; 04-29-2011 at 12:21 AM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  8. #8
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    Thanks iMalc, but that is not quite the problem I am having. My problem is that I have two numbers which are almost equal and differ only in the 14th (say) decimal place, and I need to extract that difference.

    eg:

    a = A
    b = A - e

    where e/A ~1e-14

    what is a-b? C says 0, or possible something along the lines of e rounded to one decimal place.
    C is fun

  9. #9
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,665
    Well there in lies the problem.
    If the two numbers are identical in 14 decimal digits, then all you're going to get is 1 or 2 digits of precision when you subtract them. It can't "magic" another 14 bits of precision out of nothing, so you just get a whole load of noise bits instead.

    Similarly, if one number is 15 decimal digits larger than another (say subtracting 1 from 1E15, then nothing will happen there either).

    This long read explains what you're up against.
    What Every Computer Scientist Should Know About Floating-Point Arithmetic
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  10. #10
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    Thanks for the link.

    I understand the problem, I was just hoping that this was a problem that had a standard approach to solving. I'll comb through the article and see what I can come up with.
    C is fun

  11. #11
    Registered User
    Join Date
    Oct 2008
    Location
    TX
    Posts
    2,059
    Quote Originally Posted by KBriggs View Post
    Code:
    operand = a*a*(108.0*a*c*c*c + 108.0*b*b*b*d - 27.0*b*b*c*c - 486.0*a*b*c*d + 729.0*a*a*d*d);
    The variables have the following values as calculated by the code

    Code:
    a: 2.0287864134649038e+07
    b: 1.9026877897384940e+04
    c: 4.4610669723446748e+00
    d: -3.7933051584501112e-29
    
    108.0*a*c*c*c: 1.9452539813523886e+11
    108.0*b*b*b*d: -2.8219163158723021e-14
    27.0*b*b*c*c: 1.9452539813523883e+11
    486.0*a*b*c*d: -3.1746558553563405e-14
    729.0*a*a*d*d: 4.3175338098787823e-40
    compared to composite values calculated by hand using a calculator with better precision:

    Code:
    108.0*a*c*c*c: 1.94525398135238912527e11
    108.0*b*b*b*d: -2.82191631587230194283e-14
    27.0*b*b*c*c: 1.94525398135238861191e11
    486.0*a*b*c*d: -3.17465585535634052349e-14
    729.0*a*a*d*d: 4.31753380987878243389e-40
    You can see that there are subtle differences between the values in the code and there, and that the first and third term almost cancel each other out, except for in the last few decimal places, which are not especially reliable.

    The values calculated by the code give a value for operand of 1.2560956774113516e+10

    The correct answer is 2.11297657006282327491e10.
    Not sure if above answer is correct. Because putting those numbers and equations into bc (the *nix calculator with 99 digits of precision) I get a different answer. Or maybe I made an error somewhere. Lemme know if you have access to bc, so I can give you the bc script to figure out if the output you're getting with the high precision calculator is correct or not.

  12. #12
    Registered User
    Join Date
    Jun 2009
    Posts
    486
    Actually I think I made a mistake. Redoing it I get operand = 9.6474321900824630266e9. Is that what you got?

    The code is still wrong, though ^_^

    I am using speedcunch
    C is fun

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. bumping threads
    By creeping_death in forum Windows Programming
    Replies: 1
    Last Post: 08-27-2003, 02:50 AM
  2. limits.h
    By kostas in forum C Programming
    Replies: 10
    Last Post: 03-19-2002, 06:36 PM