Seems much too slow for what it is (using GCC 3.2 -O3).
Code:
int dx, dy;
...
return ((dx << 4) + (dx << 2) + (dy << 4) - (dy << 2)) >> 4;
How did you measure how long it takes?
It generates all of 11 instructions here; it's difficult to see how it could be any fewer, since you start out with 8 operators in your expression.
If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
If at first you don't succeed, try writing your phone number on the exam paper.
Could simplify to:
(5 * dx + 3 * dy) / 4
though since you like using the shifts:
(dx + (dx << 2) + dy + (dy << 1)) >> 2
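For anyone who wants to convince themselves the rewrite is safe, here is a minimal sketch that compares the three forms. The function names (scaled_original, scaled_shift, scaled_muldiv, count_mismatches) are made up for illustration; the original post doesn't name the function. It only checks non-negative inputs, on purpose: left-shifting a negative int is undefined behaviour in C, right-shifting one is implementation-defined, and `/ 4` truncates toward zero, so the forms are not guaranteed to agree for negative dx or dy.

```c
/* Expression as originally posted: (20*dx + 12*dy) >> 4. */
static int scaled_original(int dx, int dy)
{
    return ((dx << 4) + (dx << 2) + (dy << 4) - (dy << 2)) >> 4;
}

/* Shift form of the simplification: (5*dx + 3*dy) >> 2. */
static int scaled_shift(int dx, int dy)
{
    return (dx + (dx << 2) + dy + (dy << 1)) >> 2;
}

/* Multiply/divide form: (5*dx + 3*dy) / 4. */
static int scaled_muldiv(int dx, int dy)
{
    return (5 * dx + 3 * dy) / 4;
}

/* Counts the (dx, dy) pairs in [0, limit) x [0, limit) for which either
   simplified form disagrees with the original expression. */
static int count_mismatches(int limit)
{
    int dx, dy;
    int mismatches = 0;

    for (dx = 0; dx < limit; ++dx) {
        for (dy = 0; dy < limit; ++dy) {
            int expected = scaled_original(dx, dy);
            if (scaled_shift(dx, dy) != expected ||
                scaled_muldiv(dx, dy) != expected) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Calling count_mismatches(1000) from a test program should return 0. Keep limit small enough that 20 * limit + 12 * limit stays well inside the range of int.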
Look up a C++ Reference and learn How To Ask Questions The Smart WayOriginally Posted by Bjarne Stroustrup (2000-10-14)
Ah, it's not premature optimisation at work is it?
Yeah, the bottleneck probably lies elsewhere; it's not like the formula is particularly complex.
It's part of a slowish rendering thing, and runs this function hundreds of times per pixel. There are a couple of oddities in the generated code, like jumping to a subroutine just to negate ecx, then jumping back. I'll try rewriting it in asm, thanks for the help.
Get rid of your function. Make it a macro.
Quzah.
Hope is the first step on the road to disappointment.
Exactly, it's probably the overhead of calling the function more than anything.
Alternatively, you could put "inline" before the function definition, but making it a macro would be better practice.
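To make the two options concrete, here is a rough sketch of both. SCALED_MACRO and scaled_inline are hypothetical names, since the thread never gives the real one, and the body uses the simplified shift form from earlier in the thread. One trade-off worth knowing before picking: the macro expands each argument more than once, so something like SCALED_MACRO(x++, y) would increment x twice, while the inline function evaluates each argument exactly once. Whether the compiler actually inlines the function is up to the compiler and the optimisation level.

```c
/* Macro form: no function call, but dx and dy each appear twice in the
   expansion, so arguments with side effects are evaluated twice. */
#define SCALED_MACRO(dx, dy) \
    (((dx) + ((dx) << 2) + (dy) + ((dy) << 1)) >> 2)

/* Inline-function form: arguments are evaluated once and type-checked.
   "static inline" keeps the definition local to this translation unit. */
static inline int scaled_inline(int dx, int dy)
{
    return (dx + (dx << 2) + dy + (dy << 1)) >> 2;
}
```

Both give the same result for plain values, e.g. SCALED_MACRO(4, 4) and scaled_inline(4, 4) are both 8, so the choice is mostly about argument safety and how the generated code looks on your compiler.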