a few opengl(or general programming) questions

**Perspective** · 06-13-2004

Alright, here's my 2 cents, just for the sake of argument.

While at the CPU level floats and doubles may have (essentially) equal performance, doubles still take twice the memory space. Thus, on each memory access only half as many doubles as floats make it over the bus to cache. In short, to read the same number of doubles as floats you will have to access memory more times (assuming you read enough to need more than one bus transfer).

Here's a simple example to clarify what I'm saying.
Let's say the bus is 16 bytes wide. One memory access will get 2 doubles or 4 floats. If you access 4 doubles from an array and 4 floats from an array, the float access will cause one bus transaction while the double access will cause 2. More trips to memory amount to slower performance.

**~~bludstayne~~** · 06-13-2004

I thought that if a function's parameters were floats and you passed it a double, the double would be converted to a float.

**linuxdude** · 06-13-2004

that's what I assume

**Perspective** · 06-13-2004

Originally Posted by bludstayne

I thought that if a function's parameters were floats and you passed it a double, the double would be converted to a float.

It will, though you might get a compiler warning.

**~~bludstayne~~** · 06-13-2004

Then why does it matter if you pass the parameters with the suffix or not?

**VirtualAce** · 06-14-2004

while at the CPU level floats and doubles may have (essentially) equal performance,

There is nothing "essentially" about it: they have equal performance. The Intel docs specify that an m32real and an m64real being loaded from memory perform at exactly the same speed. Regardless of the bus, the CPU has been designed so that a float fetch and a double fetch, as far as pushing values onto the FPU stack is concerned (keep in mind this only applies to pushing values onto ST(0)), operate at the same speed. If this were not true then the clocks would differ.

I'm quite sure that motherboard manufacturers have taken into consideration the design of the CPU and the FPU so that what Intel says...happens on the board. The CPU can fetch a double and a float at the same speed. Perhaps this is accomplished by actually slowing down the float fetch to equal the double fetch thereby making them look like they perform equivalently...but I doubt it. Perhaps there is some SIMD taking place under the hood of the modern FPU which allows a single instruction but multiple data in one fetch over the bus. MMX performs somewhat of the same operations via the FPU as well.

This would lend credence to the idea that perhaps our CPUs ARE currently capable of performing a 64-bit fetch...but Intel has only enabled it for the FPU. This would follow with the Intel design philosophy since most, if not all, of their CPUs in the past have been able to do much more than was ever supported. For instance early CPUs like the 8088 had a 16 bit bus...but internally actually had a 32-bit bus and could fetch 32-bits at a time...but the 32-bit registers were not exposed to the programmer nor were the 32-bit fetch opcodes. It is quite possible, and even probable, given the design of MMX/SSE/SSE2 and the FPU, that the CPU is performing a 64-bit fetch, but only when using these opcodes. The 64-bit fetch is not supported on 32-bit CPUs and is not apparent to the programmer...but I bet that is exactly what is happening internally to the CPU.

Why did they not enable 64-bit fetches for memory....I have no idea. My guess would be marketing strategy. Also it is quite possible that the new 64-bit CPUs actually have 128-bit fetches. Whatever the case...there is more going on internally than Intel is telling me in the book. Otherwise MMX/SSE/SSE2 and FPU opcodes would pose little to no advantage over the standard 32-bit x86 opcodes. Is it not peculiar how suddenly with MMX we can operate on 64 bits at a time and yet do so at blazing speeds over a 32-bit bus??? Something's fishy there.

Also, this float and double data type stuff is compiler-side stuff. To load either a 32-bit or 64-bit floating point value you must use FLD. Your compiler is using this somewhere, so it doesn't matter assembler-side whether it is 32 bits or 64...it's going to use the FLD instruction and the FPU will zero fill the insignificant bits in your value. I think that it might zero extend the value, but I'm not sure, and I'm not sure if you can zero extend a floating point value.

If the bus were a factor then the clock speeds would be different for loading from memory and they are not.
