| | #31 |
| Kernel hacker Join Date: Jul 2007 Location: Farncombe, Surrey, England
Posts: 15,686
| Carmack's approximation seems interesting, as it's basically 2 iterations of the customary loop. I wonder how well it performs on a larger range of numbers. It also "messes up" the floating point & integer units, as it overlays FPU data with integer data in order to do an integer subtraction on it. It's a bad idea to do that unless absolutely necessary, since it forces the processor to sync the FPU with the integer unit. Normally the integer unit operates independently of the FPU, and both units "prefer" to work independently. In general, SIMD operations are only "meaningful" if there is a complete set of data. -- Mats
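To make the question about a larger range of numbers concrete, here is a minimal sketch of the well-known fast inverse square root. It is not necessarily the exact code discussed earlier in the thread. Assumptions on my part: a single Newton-Raphson step, `memcpy` in place of the pointer cast (so the float-to-integer overlay is well defined in standard C), and the names `fast_rsqrt` and `worst_residual`, which are hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Approximates 1/sqrt(x) for positive, normal floats. */
float fast_rsqrt(float x)
{
    assert(x > 0.0f);                     /* only meaningful for positive input */
    uint32_t bits;
    float y;
    memcpy(&bits, &x, sizeof bits);       /* reinterpret the float as an integer */
    bits = 0x5f3759dfu - (bits >> 1);     /* integer subtraction on the float's bits */
    memcpy(&y, &bits, sizeof y);          /* and back to a float: the first guess */
    return y * (1.5f - 0.5f * x * y * y); /* one Newton-Raphson refinement */
}

/* Sweeps [lo, hi] linearly and returns the largest |x*y*y - 1|,
   which would be 0 if y were exactly 1/sqrt(x). Requires lo > 0, steps > 0. */
float worst_residual(float lo, float hi, int steps)
{
    float worst = 0.0f;
    for (int i = 0; i <= steps; ++i) {
        float x = lo + (hi - lo) * (float)i / (float)steps;
        float y = fast_rsqrt(x);
        float r = x * y * y - 1.0f;
        if (r < 0.0f) r = -r;
        if (r > worst) worst = r;
    }
    return worst;
}
```

Calling something like `worst_residual(0.01f, 1000000.0f, 100000)` gives a rough feel for how the error behaves across several orders of magnitude. Note that the `memcpy` version still moves the value between the floating point and integer units, so the syncing cost described above applies to it as well.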
__________________ Compilers can produce warnings - make the compiler programmers happy: Use them! Please don't PM me for help - and no, I don't do help over instant messengers. |
| | #32 | ||
| Registered User Join Date: Mar 2005 Location: Mountaintop, Pa
Posts: 1,059
| Quote:
Quote:
For example, using Code: float fInput1[3] = {30.3F, 100.0F, 140.1F};
So, to eliminate this spike, you have to submit the following for square root calculation of three floats: Code: float fInput2[4] = {30.3F, 100.0F, 140.1F, 0.0F};
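As a rough illustration of the padded case, the sketch below takes the square root of a four-element array in one step. It assumes the routine under discussion uses SSE packed operations, which is my reading of the thread and not something stated in this post. `sqrt_padded` is a hypothetical name, and a scalar fallback is included for compilers that do not define `__SSE__`.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#else
#include <math.h>
#endif

/* Square roots of n floats, where n has been padded to a multiple of 4
   (for example with trailing 0.0F values, as in fInput2 above). */
void sqrt_padded(const float *in, float *out, size_t n)
{
    assert(n % 4 == 0);  /* caller must pad the input to full groups of four */
    for (size_t i = 0; i < n; i += 4) {
#if defined(__SSE__)
        /* Unaligned load/store, so the arrays need no special alignment. */
        _mm_storeu_ps(out + i, _mm_sqrt_ps(_mm_loadu_ps(in + i)));
#else
        for (size_t k = 0; k < 4; ++k)
            out[i + k] = sqrtf(in[i + k]);
#endif
    }
}
```

With the four-element `fInput2` and a separate four-element output array, every lane that is read and written belongs to the caller, and the padding lane simply produces `sqrt(0.0F)`, which is 0.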
| ||
| | #33 | ||
| Kernel hacker Join Date: Jul 2007 Location: Farncombe, Surrey, England
Posts: 15,686
| Quote:
Code: lTemp = * ( long * ) &fY;
lTemp = 0x5f3759df - ( lTemp >> 1 );
On older processors, things didn't happen much in parallel, so there was less of a problem with this style of code. Modern processors definitely execute a lot of instructions in parallel, and there's some pretty complicated logic to prevent one unit or the other from getting it wrong when work overlaps between the two. One possible outcome is "speculative execution and throwing away the results". Did I mention that modern processors are quite complicated? Quote:
1. The [3] array is misaligned, which causes the processor to use an unaligned version of the "load" instructions. 2. The [3] array overlaps the result array by 1 element, which means that the processor gets confused as to the content (and must wait for other operations before it can continue, to make sure it doesn't "get it wrong"). -- Mats