How accurate is the following...

**emeyer** · 12-05-2005

Hi all,

I'm working on the next piece of example code for my 'Learn C By Example Page' on my website.

How accurate is the following statement...

"The difference between a double and a float is that doubles are designed to be bigger than floats (basically twice (double) as big). Therefore a double is capable of storing a decimal value that has a significantly higher level of precision than a float."

What do you think?

Eddie

**bithub** · 12-06-2005

Pretty accurate. The only misleading part is that doubles are not necessarily twice as big as floats. The only thing the standard specifies is that:

The set of values of the type float is a subset of the
set of values of the type double

**emeyer** · 12-06-2005

Hmmmm.

As I mentioned, I was planning to add to my 'Learn C By Example' page on my website. I wanted to provide a very basic introduction to some of C's basic variable types.

Here is what I had (see example program below)... what distinction would you recommend I make between a float and a double? Would it be safe to say that on most systems a double would be bigger than a float and could therefore provide a higher level of precision? Anyway, tell me what you think... here is the example I put together. I know the program doesn't do anything useful... it is only meant as an introduction to some of C's basic variable types.

Code:

/*      Variables Program
 *      ====================
 *      Purpose: Demonstrating the creation and assignment of some simple variables.
 *      Author: Eddie Meyer
 *      Date: 5 DEC 2005
 */
#include <stdio.h>

int main(void)
{
	/* ============================================================
	 * Let's create (declare) some variables of different types.
	 * I've tried to give an indication of what each variable
	 * will hold by choosing variable names that essentially
	 * describe what they will be used for.  You are always
	 * recommended to try and do this to help others understand
	 * your code.  Oftentimes, it is you that will need reminding
	 * of what each variable in your program does.  Choosing
	 * descriptive variable names will prove extremely valuable
	 * to you when you come to review your code a year or two
	 * down the line.
	 * ============================================================
	 */


	/* Create a string (a sequence of characters).  Technically,
	 * there is no such thing as a string variable in C.  A string
	 * is really just a sequence (array) of characters.
	 *
	 * In this program, we will use a string variable to store my
	 * last name.
	 */
	char* last_name;


	/* Create a char (a single character value). For example,
	 * this could be used to store any single character in the
	 * alphabet.
	 *
	 * In this program, we will use a char variable to store my
	 * middle initial.
	 */
	char middle_initial;


	/* Create an integer (a whole number). Example values for
	 * an integer could be 0, 5, 384, -25 etc.  Note, that there
	 * is no decimal part to an integer.
	 *
	 * In this program, we will use an integer variable to store
	 * my age.
	 */
	int age;


	/* Create a float (a number with a fractional part). Example
	 * values could be 0.0, -5.6, 312.666 etc.
	 *
	 * In this program, we will use a float variable to store my
	 * hourly wage (in dollars and cents).
	 */
	float hourly_wage;


	/* Create a double (another number type with a fractional part).
	 * The difference between a double and a float is that doubles
	 * are designed to be bigger than floats (basically twice (double)
	 * as big).  Therefore a double is capable of storing a decimal
	 * value that has a significantly higher level of precision
	 * than a float.
	 *
	 * In this program, we will use a double variable to provide a
	 * measure of how much (compared to others) my intelligence
	 * may be of benefit to humanity.  We will want to use a double
	 * for this, because we know the value will be small.
	 */
	double effect_of_my_intelligence_on_humanity;


	/* ================================================================
	 * Now let's assign some values to the variables that we created
	 * ================================================================
	 */

	/* Use the 'last_name' variable to store my last name.
	 * Note the use of double quotes when dealing with strings.
	 */
	last_name = "Meyer";


	/* Use the 'middle_initial' variable to store my middle initial.
	 * Note the use of single quotes when dealing with chars.
	 */
	middle_initial = 'J';	/* For Jonathan	*/


	/* Use the 'age' variable to store my age */
	age = 30;


	/* Use the 'hourly_wage' variable to store my hourly wage
	 * (in dollars and cents).
	 */
	hourly_wage = 250.75;	/* Yah, I wish it were this much. */


	/* Use the 'effect_of_my_intelligence_on_humanity' variable to
	 * store a measure of how much my intelligence may be of benefit
	 * to humanity.  There is no scientific basis for the value I
	 * chose to use.  I do hope that my impact on humanity will
	 * indeed be positive however, regardless of how small it may
	 * turn out to be.
	 */
	effect_of_my_intelligence_on_humanity = 0.00000000000152;


    return 0;
}

Thanks for your input.

Eddie

**PING** · 12-06-2005

char* last_name

This should be an array. Something like char last_name[80];

**Thantos** · 12-06-2005

Well, the way last_name is being used at this point, having it as a pointer is fine as long as you aren't going to change it. Of course, I would change it to a const char * in that case.

**filker0** · 12-06-2005

doubles are not twice the decimal range of floats; they extend the precision (that is, the number of significant digits that can be represented) of the real number representation. The absolute maximum and minimum values of doubles are different, but the intent is also to increase the precision with which those values can be expressed.

Here's an example program that demonstrates this:

Code:

#include <stdio.h>
int main(void)
{
  float fl = 2.0 / 3.0;
  double db = 2.0 / 3.0;

  printf("2/3 as float: %1.15f, double: %1.15f\n", fl, db);
  return 0;
}

When I compile and run this (on a mac-mini, though that doesn't really matter) I get the following:

Code:

MiniMac:~ filker0$ cc -o float float.c
MiniMac:~ filker0$ ./float 
2/3 as float: 0.666666686534882, double: 0.666666666666667

The header file /usr/include/float.h (or wherever your compiler puts its standard header files) contains the various limits for the various floating point representations supported by your compiler.

**quzah** · 12-06-2005

Originally Posted by filker0

When I compile and run this (on a mac-mini, though that doesn't really matter) I get the following:

Code:

MiniMac:~ filker0$ cc -o float float.c
MiniMac:~ filker0$ ./float 
2/3 as float: 0.666666686534882, double: 0.666666666666667

The header file /usr/include/float.h (or wherever your compiler puts its standard header files) contains the various limits for the various floating point representations supported by your compiler.

Although amusingly enough, you have just proved their statement to be true. The double in this case gives you just about twice as many points of accurate precision. Watch:
0.666666686534882 .. seven accurate positions.
0.666666666666667 .. fifteen accurate positions.

Funny, that.

[edit]
If you want to include the preceding zero, you end up with exactly twice as many points of accuracy. 8 and 16 respectively. So for teaching purposes, yeah, it's a pretty accurate statement.
[/edit]

Quzah.

**filker0** · 12-06-2005

Originally Posted by quzah

Although amusingly enough, you have just proved their statement to be true. The double in this case gives you just about twice as many points of accurate precision.

Actually, I guess I didn't make my point clearly enough -- the example that he gave in his code sample didn't actually show a difference between float and double, as they can both represent 0.00000000000152 without loss of precision. If you update my format string to use "%1.30f" instead of "%1.15f" for both values, you get:

Code:

2/3 as float: 0.666666686534881591796875000000, double: 0.666666666666666629659232512495

which, as you can see, is actually more than twice the number of digits of accuracy. My choice of 15 fractional part digits was arbitrary.

The statement made by emeyer was not actually correct in its detail, nor was his code sample demonstrative of the actual difference between float and double. A Google search on floating point representation should locate some good descriptions of how it all works. I know the internal workings of floating point representations, having written the compiler support code for several CPUs that lacked hardware floating point. (M68000, PDP-11, 6502) A float has 24 bits of mantissa, which is where the precision comes from. A double has 53 bits of mantissa. The double does take up twice the storage of a float, but the size of the exponent is not doubled, nor is there an extra sign bit.

He asked if his statement was accurate. It was not.

**quzah** · 12-06-2005

Originally Posted by filker0

Actually, I guess I didn't make my point clearly enough -- the example that he gave in his code sample didn't actually show a difference between float and double, as they can both represent 0.00000000000152 without loss of precision. If you update my format string to use "%1.30f" instead of "%1.15f" for both values, you get:

Code:

2/3 as float: 0.666666686534881591796875000000, double: 0.666666666666666629659232512495

which, as you can see, is actually more than twice the number of digits of accuracy.

Both of your examples produce "basically" the same result. The float remains accurate to 7 points in both; the double is 15, and 16 parts accurate in the second. Therefore the statement: "basically twice as big" IS in fact accurate. Unless you're debating now that 15 is not in fact "basically twice as big" as 7. In which case, I kindly direct your attention to a bit of integer math:

Code:

#include <stdio.h>
int main( void )
{
    int example1 = 15 / 7;
    int example2 = 16 / 7;

    printf("example1, 15 / 7 is %d\n", example1 );
    printf("example2, 16 / 7 is %d\n", example2 );

    return 0;
}
/*
example1, 15 / 7 is 2
example2, 16 / 7 is 2
*/

Oh, and actually, the whole statement is this:

"The difference between a double and a float is that doubles are designed to be bigger than floats (basically twice (double) as big). Therefore a double is capable of storing a decimal value that has a significantly higher level of precision than a float."

That statement is in fact true. doubles are designed to be bigger than floats. "Therefore a double is capable of storing a decimal value that has a significantly higher level of precision than a float." See, that is in fact true. As you suggested, consult float.h for details. Or the standard. See, the minimum significant digits FLT_DIG and DBL_DIG are close to twice the size. 6 and 10 respectively. I'd say that is both significant, and "basically double".

Quzah.

**PING** · 12-06-2005

Well, the way last_name is being used at this point, having it as a pointer is fine as long as you aren't going to change it. Of course, I would change it to a const char * in that case.

Exactly. An array head is nothing but a const type *. He asked how accurate his info was. Practically, I could change the value in last_name and make it point to something else. So, what he was going to put up on his site wouldn't be proper.
-Ping.

**Thantos** · 12-06-2005

Originally Posted by PING

Exactly. An array head is nothing but a const type *. He asked how accurate his info was. Practically, I could change the value in last_name and make it point to something else. So, what he was going to put up on his site wouldn't be proper.
-Ping.

There was nothing "wrong" with it; there are just "better" ways. You are mistaken about the array head. First, it's not a pointer at all; it's an array. Second, if you want to think of it as a pointer, it would be a char * const pointer, not a const char *.

**Thantos** · 12-06-2005

filker0,
The majority of the members that come to this site for help are beginners. We tailor our answers to that level of understanding. While something may or may not be perfectly accurate, we temper our answers so as to assist the poster along the road of learning. If we gave the 100% exact answer we'd probably scare away the majority of the people. Of course there are some issues that are too important to curve the answer for.

**filker0** · 12-06-2005

quzah seems to take my posts as a personal affront. They are not.

I was making a clarification on the

(basically twice (double) as big)

part of the statement that emeyer made. Because it was neither entirely correct nor entirely descriptive (since the range of the exponent is also larger in an IEEE double vs. an IEEE float), I decided to clarify some things. I don't want to get into a full dissertation on the various floating point formats, nor the details of floating point manipulation. There have been, in the past, many different floating point representations. I'm sticking with IEEE for this discussion.

As for reading the standards -- I participated in the creation of the ANSI C89 standard while working at a C compiler company. I have written IEEE, DEC VAX, and Motorola "fast" floating point code, in assembler and C. I have copies of the publications in which the format is formalized. I've written the documentation on these floating point representations that shipped with the programmer's guide for a commercial compiler.

And you get a little more than double the precision (53 vs. 24 bits of mantissa), and (because it's exponential) much more than double the range (11 bits vs. 8 bits of exponent) with an IEEE double vs. IEEE float representation.

In making my posting that quzah finds so absurd, I simply wanted to give emeyer a better example of the difference between float and double, and maybe encourage him to be more precise in his language. As a professional programmer, I know how important this sort of knowledge is in avoiding difficult-to-locate coding errors in scientific and financial programs. (Financial programs do not, as a rule, use floating point at all for money calculations or storage; they tend to use fixed point to avoid rounding problems and other boundary conditions.)

I wrote my post to help emeyer, not pretend to have a point where I have none. quzah may choose to take issue with what I say; that's his right -- but that does not make my statements wrong.

**emeyer** · 12-06-2005

Originally Posted by filker0

(Financial programs do not, as a rule, use floating point at all for money calculations or storage; they tend to use fixed point to avoid rounding problems and other boundary conditions.)

What exactly is meant by fixed point? How do you ensure fixed point calculations in programs?

Thanks

Eddie

**filker0** · 12-06-2005

Originally Posted by emeyer

What exactly is meant by fixed point? How do you ensure fixed point calculations in programs?

Fixed point is where you have a representation of non-integer values that has a fixed precision. Other than the trivial case of integer math (which is, in a sense, fixed point, since there is no fractional part), fixed point is not directly supported by the standard C language. You can do fixed point many ways, and there are a number of libraries out there that do it different ways. Depending on the range you require (maximum/minimum values) and precision desired, different strategies are used.

In some cases, a fixed point value is represented by a structure containing a whole and fractional part. In these cases, all arithmetic on fixed point values is done by passing pointers to fixed point objects (structs) (or the structs themselves) to functions that have been defined by the fixed point library you're using -- these libraries enforce specific rounding/truncation rules on division and multiplication. (In C++, I've seen this done with classes and operator overloading, as in this link, but we're talking about C here.)

Another technique used for fixed point is BCD (Binary Coded Decimal), which predates digital computers. In this representation, four bits are used for each decimal digit. To use this in C, you need a BCD library.

Another common technique used for fixed point is to use an integer type, such as a long or a long long (64 bits), and have an implied decimal point. Here, you can use the built-in arithmetic in C, but multiplication and division require some fix-up afterwards to keep the decimal point in the right place. There are also issues about how wide the result of a multiply is. For example, to avoid overflow, multiplying two 32-bit fixed point numbers requires 64 bits for the intermediate result. You then adjust the value to move the decimal point to the right place and check whether the result still fits in your normal fixed point representation. If it does, you just take the lower 32 bits; what you do on overflow is application or library specific. Your range will also be somewhat limited. An example of this strategy can be found at this Stanford University web page.

Fixed point is sometimes used for high performance signal processing and graphics engines, since their coordinate space is limited and fixed, "enough precision" is well defined, and floating point operations are expensive; eg: this example from Intel.

Some special and general purpose processors support native fixed point formats. C compilers for those processors sometimes include extensions to allow C programs to declare and use fixed point variables, but these are non-portable.

Some programming languages do support built-in fixed point representations, such as PL/I and COBOL.

A very good overview and source of fixed-point (and arbitrary precision) information and libraries is at this IBM site.
