Does anyone know if there is any difference in execution time for calculations using 32-bit float vs 64-bit double? Intuition says double should be slower on a 32-bit processor and float slower on a 64-bit processor.
Well your average PC desktop comes with a numeric co-processor which handles all floating point calculations in an 80-bit extended-double format.
If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
If at first you don't succeed, try writing your phone number on the exam paper.
The answer depends on the machine you're running on, particularly the CPU architecture. There are some high end machines that only support 64 (or more) bit floating point types [which correspond to the double type in C/C++ on those machines], and smaller floating point types [which correspond to the float type] are emulated. This means that any operation (eg addition, multiplication, etc) on a lower precision type is implemented by converting the values to the higher precision, doing the operation in that higher precision, and then truncating the result to write it back to the lower precision variable. On those machines, operations on higher precision floating point types can be faster than on lower precision floating point types.
The conversion opcodes for 32 to 64 and 64 to 32 consume the exact same number of CPU cycles. The opcodes to load a 32 bit real or a 64 bit real also consume the same number of CPU cycles, and in the Intel tech ref they are listed on the same page as one another.
This leads me to believe there is no apparent performance gain in choosing doubles over floats, or performance hit for using floats instead of doubles.
The actual FPU real number format is larger than anything C/C++ can represent with its built-in data types. For more information consult the Intel tech ref, book 2 (I believe) under Numeric Applications. For more information concerning opcodes consult book 3, Instruction Set Reference.
Back in the old days, when the 8086 and 8087 were separate chips, and expensive ones at that (comparatively speaking), many people got by without an 8087 and did all their floating point in software.
In those circumstances, there was a real choice to be made as to whether you used floating point at all, and whether to use float or double.
I suppose having access to things like the various versions of SSE would possibly speed up one over the other depending on the version, but that'd be about it.

Originally Posted by Bubba
This discussion is specific to Intel CPUs and FPUs. It is not universally true for other system architectures.

Originally Posted by Bubba
Does anyone know if there is any difference in execution time for calculations using 32-bit float vs 64-bit double? Intuition says double should be slower on a 32-bit processor and float slower on a 64-bit processor.
This does not specify a CPU or an architecture.

Well your average PC desktop comes with a numeric co-processor which handles all floating point calculations in an 80-bit extended-double format.
This is specific to Intel/AMD x86-based CPUs. Therefore my discussion was also related specifically to x86.
When you're working on lots of them in memory (e.g. a heightmap), the bottleneck will be the memory IO, so the smaller size is faster.
For anything else, it probably doesn't matter, so just use whichever has the accuracy level you require, or requires the least conversions to interface with existing code or libraries.
"When you're working on lots of them in memory (e.g. a heightmap), the bottleneck will be the memory IO, so the smaller size is faster."
Internal bus data widths have long surpassed the widths of the default data types. Like in, um... the 80286.