
How to determine the minimum value of floating-point types using direct computation?


  1. #1
    Registered User
    Join Date
    Apr 2011
    Posts
    11

    How to determine the minimum value of floating-point types using direct computation?

    I'm trying to determine the minimum range of floating-point types.

    It's easy when using values from standard headers:

    Code:
    #include <stdio.h>
    #include <float.h>
    
    int main(void)
    {
        printf("Minimum range of float variable: %e\n", FLT_MIN);
        printf("Minimum range of double variable: %e\n", DBL_MIN);
        return 0;
    }
    This code gives the following output:

    Minimum range of float variable: 1.175494e-038
    Minimum range of double variable: 2.225074e-308

    How do I get to these same values using only direct computation?

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,487
    What do you mean by "direct computation"? Why would you want to do that? I'm not sure it's possible, since you would have to basically start at zero and go up by FLT_MIN or DBL_MIN, but that involves using values from the standard headers.

  3. #3
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,487
    Actually, you could try dividing by two repeatedly, but I don't know the floating point section of the C standard well enough to know if that will provide a well defined, portable solution.

  4. #4
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    I've at best a vague grasp of how floats are stored, but I don't see why it isn't possible -- it's going to be a consequence of the number of bits. Just you have to figure out what all dem bits mean

    Floating point - Wikipedia, the free encyclopedia
    IEEE 754-2008 - Wikipedia, the free encyclopedia
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  5. #5
    Registered User
    Join Date
    Apr 2011
    Posts
    11
    Quote Originally Posted by anduril462 View Post
    What do you mean by "direct computation"? Why would you want to do that? I'm not sure it's possible, since you would have to basically start at zero and go up by FLT_MIN or DBL_MIN, but that involves using values from the standard headers.
    Hi,

    I'm trying to solve exercise 2-1 from "The C Programming Language", 2nd edition, which asks to:

    "Write a program to determine the ranges of char, short, int, and long variables, both signed and unsigned,
    by printing appropriate values from standard headers and by direct computation. Harder if you compute them:
    determine the ranges of the various floating-point types."

    I need to determine the minimum range of floating-point types using
    basic mathematical operations like addition, subtraction, multiplication, division.

  6. #6
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by ak22 View Post
    I need to determine the minimum range of floating-point types using
    basic mathematical operations like addition, subtraction, multiplication, division.
    If this is any help, a float that keeps shrinking eventually underflows to zero:

    Code:
    #include <stdio.h>
    
    int main(int argc, const char *argv[]) {
    	float n = 0.01f, x = n;
    	/* keep the last nonzero value; the loop ends when n underflows to 0 */
    	while (n > 0.0f) {
    		x = n;
    		n *= 0.1f;
    	}
    	printf("%e\n", x);
    	return 0;
    }
    Will give you (roughly) the smallest positive value of a float. Note that floats do not wrap around the way unsigned integers do: a value that gets too small underflows to zero, and a value that gets too large overflows to infinity. So the same idea should work for the upper limit, by growing a number until it stops being finite -- I'm not going to bother, and I can't say it is feasible or worthwhile.

    BTW & honestly, there are more significant details to understand about floating point numbers than this. Eg, they cannot exactly represent 0.1, because 0.1 cannot be written as a finite sum of inverse powers of 2. Try this:

    Code:
    #include <stdio.h>
    
    int main() {
            float i;
            for (i=0.0f; i<20; i+=0.1f) {
                    printf("%f\n",i);
            }
            return 0;
    }
    The output is probably not what you would expect.
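A minimal sketch of the same effect in a form that is easy to check, assuming float is IEEE 754 single precision. The helper names sum_tenths and show_drift are made up for illustration. Each addition rounds, so ten additions of 0.1f do not land exactly on 1.0f:

```c
#include <stdio.h>

/* Hypothetical helper: add 0.1f to zero n times and return the rounded sum. */
float sum_tenths(int n)
{
    volatile float sum = 0.0f;   /* volatile keeps every partial sum rounded to float */
    for (int i = 0; i < n; i++) {
        sum += 0.1f;
    }
    return sum;
}

/* Print the accumulated sum next to the value one would naively expect. */
void show_drift(int n)
{
    printf("after %d additions: %.9f (expected %.1f)\n",
           n, sum_tenths(n), n / 10.0);
}
```

Calling show_drift(10) from main should print a sum that differs slightly from 1.0 on typical systems; the exact digits depend on the platform's float format.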

  7. #7
    Registered User
    Join Date
    Apr 2011
    Posts
    11
    I just realized that the following two minimum values:

    Minimum range of float variable: 1.175494e-038 and
    Minimum range of double variable: 2.225074e-308

    are actually 0. (I've put the %e specifier instead of %f).

    So, I just had to find the maximum value for float and double, which was easy.

    There's only one thing left that bothers me: I can't find the maximum value of a long double variable.

    The following code:

    Code:
    #include <stdio.h>
    #include <float.h>
    
    int main(void)
    {
        printf("Minimum range of long double variable: %f\n", LDBL_MIN);
        printf("Maximum range of long double variable: %f\n", LDBL_MAX);
        return 0;
    }
    prints the following:

    Minimum range of long double variable: -0.000000
    Maximum range of long double variable: -1.#QNAN0

    Why does it print QNAN0 instead of max value?

  8. #8
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Because printf by default uses 6 decimal places. Try this:

    Code:
    	printf("%f\n", 0.000000000001);
    	printf("%.20f\n", 0.000000000001);
    So...

    I just realized that the following two minimum values:

    Minimum range of float variable: 1.175494e-038 and
    Minimum range of double variable: 2.225074e-308

    are actually 0. (I've put the %e specifier instead of %f).
    No they aren't. %e is accurate, and printf is not a valid test (tho, if you try %.50f, you will see it). This is the real test:

    Code:
    if (x == 0.0f) ...
    However, it is generally considered bad practice to use == with floating point numbers, because most values are stored only as rounded approximations (did you try the example at the end of post #6? That reveals a very important fact about floating point). Instead, in practice compare against a small range, eg:

    Code:
    // instead of == 1
    if (x < 1.0001 && x > 0.9999)
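The range test above can be wrapped in a small function. This is only a sketch: the name nearly_equal and the idea of passing the tolerance as a parameter are assumptions for illustration, and a fixed absolute tolerance is not suitable for every range of values.

```c
#include <stdbool.h>

/* Hypothetical helper: true when a and b differ by at most tol.
   If either argument is NaN, the comparison fails and the result is false. */
bool nearly_equal(double a, double b, double tol)
{
    double diff = a - b;
    if (diff < 0.0) {
        diff = -diff;            /* absolute value without needing <math.h> */
    }
    return diff <= tol;
}
```

With this, the check from the post becomes nearly_equal(x, 1.0, 0.0001).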
    Last edited by MK27; 12-01-2011 at 10:11 PM.

  9. #9
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    I use %Le as the format string for printf() with LDBL_MAX. %Lf works also, but the number of zeroes makes it quite a task to read!
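A short sketch of that format string in use, assuming a C library whose printf family supports the L length modifier as the C standard describes. The helper names are made up for illustration. Without the L, %f and %e expect a double, and passing a long double instead is undefined behaviour, which may be what produced the odd output in post #7.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: write v in exponent notation into buf. The L length
   modifier tells snprintf that the argument is a long double. */
int format_long_double(char *buf, size_t size, long double v)
{
    return snprintf(buf, size, "%Le", v);
}

/* Print the long double limits taken from <float.h>. */
void print_long_double_range(void)
{
    char lo[64], hi[64];
    format_long_double(lo, sizeof lo, LDBL_MIN);
    format_long_double(hi, sizeof hi, LDBL_MAX);
    printf("Minimum normalized long double: %s\n", lo);
    printf("Maximum long double: %s\n", hi);
}
```

Calling print_long_double_range() from main prints both limits in exponent notation; the actual values depend on how the platform implements long double.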

  10. #10
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,304
    This is a perfectly reasonable thing to do. In fact to test my own megafloat class that implemented 128-bit or greater floats, to confirm that the value returned as being its minimum is correct, I essentially compare it against the result of repeated multiplication by 0.5 until just before it becomes zero.

    MK27: It must be 0.5f though that you multiply it by because 0.5 can be represented exactly in floating point, but 0.1 cannot. Besides, multiplying by any value less than 0.5f e.g. 0.125f would potentially miss the smallest value.

    MK27 is very much right though that the float min value is not equal to zero. Depending on your format string it may be displayed as merely rounded to zero.

    Note that the same technique can be used to find the largest value. Simply multiply by 2.0f until the answer is infinity, which IIRC can be detected by (x-x) != 0
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  11. #11
    Registered User
    Join Date
    Apr 2011
    Posts
    11
    I think I figured out how to get the minimum values of floating-points.

    The following code:

    Code:
    #include <stdio.h>
     
    int main(void)
    {
        float fl, last;
        fl = 0.1;
        while (fl > 0.0) {
            last = fl;
            fl = fl * 0.2;
        }
        printf("\nMinimum value of float variable is: %e\n", last);
    
        double db, dblast;
        db = 0.1;
        while (db > 0.0) {
            dblast = db;
            db = db * 0.2;
        }
        printf("Minimum value of double variable is: %e\n", dblast);
        return 0;
    }
    prints:

    Minimum value of float variable is: 1.401298e-045
    Minimum value of double variable is: 4.940656e-324

    Are these values correct?

    Values from float.h are different:

    Minimum range of float variable: 1.175494e-038
    Minimum range of double variable: 2.225074e-308

  12. #12
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,304
    Okay two things:

    One, you've completely ignored my post. To ensure that you get the correct answer, you need to use values that are exactly representable. So, with these changes it will correctly produce the minimum denormalised value.
    Code:
    #include <stdio.h>
    int main(){
        float fl, last;
        fl = 1.f;
        while (fl > 0.f) {
            last = fl;
            fl = fl * 0.5f;
        }
        printf("\nMinimum value of float variable is: %e\n", last);
        double db, dblast;
        db = 1.0;
        while (db > 0.0) {
            dblast = db;
            db = db * 0.5;
        }
        printf("Minimum value of double variable is: %e\n", dblast);
        return 0;
    }
    FTFY

    Secondly, I now recall that FLT_MIN and DBL_MIN give the minimum normalised value, rather than the minimum denormalised value.

    One way to obtain the minimum normalised value is that you could start with a float that uses all bits of the significand as well as one which uses all but the last bit of the significand. This can be generated by adding successively smaller powers of two until the value does not change. The value before it did not change and the value before that are the ones with full significands and almost full significands.
    Then repeatedly multiply both of these by 0.5 until they become equal.
    Then multiply one of them by 2.f and you have the answer.
    This process takes advantage of the fact that once the value becomes denormalised, the number of bits in the significand decreases, and in fact as I experienced, rounding also occurs.

    Edit: It was slightly easier than I thought due to rounding. Corrected description above.
    Last edited by iMalc; 12-03-2011 at 12:59 PM.

  13. #13
    Registered User
    Join Date
    Apr 2011
    Posts
    11
    Thanks a lot, guys. I'm beginning to grasp how floating-point numbers work.

    One way to obtain the minimum normalised value is that you could start with a float that uses all bits of the significand as well as one which uses all but the last bit of the significand. This can be generated by adding successively smaller powers of two until the value does not change. The value before it did not change and the value before that are the ones with full significands and almost full significands.
    Then repeatedly multiply both of these by 0.5 until they become equal.
    Then multiply one of them by 2.f and you have the answer.
    This process takes advantage of the fact that once the value becomes denormalised, the number of bits in the significand decreases, and in fact as I experienced, rounding also occurs.
    I will try to write code for this. This looks a little harder.

  14. #14
    Registered User
    Join Date
    Apr 2011
    Posts
    11
    I solved it!

    Code:
    #include <stdio.h>
    
    int main(void)
    {
        float fl, last, last_b, power;
        double db, lastdb, lastdb_b, power_b;
        fl = 0;
        power = 1;
        last = last_b = -1;   /* a value fl never takes, so the first comparison is well defined */
        /* add 1/2, 1/4, 1/8, ... until the sum stops changing */
        while (fl != last) {
            last_b = last;
            last = fl;
            fl = fl + (power / 2);
            power = power / 2;
        }
        /* last is where the sum stopped changing, last_b is the sum just before it;
           halve both until rounding makes them equal */
        while (last != last_b) {
            last = last * 0.5;
            last_b = last_b * 0.5;
        }
        printf("\nMinimum value of normalized float variable: %e", last_b);
    
        db = 0;
        power_b = 1;
        lastdb = lastdb_b = -1;   /* same initialization as for the float case */
        while (db != lastdb) {
            lastdb_b = lastdb;
            lastdb = db;
            db = db + (power_b / 2);
            power_b = power_b / 2;
        }
        while (lastdb != lastdb_b) {
            lastdb = lastdb * 0.5;
            lastdb_b = lastdb_b * 0.5;
        }
        printf("\nMinimum value of normalized double variable: %e\n", lastdb_b);
        return 0;
    }
    this prints:

    Minimum value of normalized float variable: 1.175494e-038
    Minimum value of normalized double variable: 2.225074e-308

    (I didn't multiply with 2.f at the end, because somehow the compiler already produced the correct result)

    Thanks to everyone who helped me.
    This exercise was awesome!
    Last edited by ak22; 12-04-2011 at 04:47 PM.
