No, you're quite correct in that case. If you only use one of the possible unswitched loops then it most certainly would be faster.
The possible slowdown would be if you unswitched the loops and then executed all of them one after the other. The significantly smaller code version that wasn't loop-unswitched would be in the instruction cache almost every time, unlike the unswitched version. Anyway I'm sure you understand, and no doubt it might not be an issue for you anyway.
Quote: I doubt it - I have spent a while Just kidding. In any case, I still have some optimizations to make myself, so I'd do those first anyway. Other than that, it's all quite messy at the moment.
Yeah you need to be kidding. Anyone can think they are really good at optimisation, only to be blown away by someone else's orthogonal thinking.
For example do you know about the crazy trick for speeding up an if-statement such as this (assume x is unsigned):
Code:
if (x == 2 || x == 3 || x == 5 || x == 7 || x == 11 || x == 13 || x == 17 || x == 19 || x == 23 || x == 29 || x == 31)
It's this:
Code:
if (x < 32 && ((1u<<x) & ((1u<<2)|(1u<<3)|(1u<<5)|(1u<<7)|(1u<<11)|(1u<<13)|(1u<<17)|(1u<<19)|(1u<<23)|(1u<<29)|(1u<<31))) != 0)
which boils down to this:
Code:
if (x < 32 && ((1u<<x) & 0xA08A28AC) != 0)
So now you do.
If you didn't know about that trick, then case in point, what other awesome tricks might you not be aware of!
Anyone who truly wants well optimised code posts the actual code on here for others to do their insane magic on.
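For anyone who wants to check the bitmask trick above rather than take it on trust, here is a minimal sketch that compares the long-hand comparison chain with the mask test. The function names are made up for illustration, and it assumes a 32-bit unsigned x as in the post.

```cpp
#include <cstdint>

// Long-hand version: one comparison per accepted value.
bool is_small_prime_chain(std::uint32_t x) {
    return x == 2 || x == 3 || x == 5 || x == 7 || x == 11 || x == 13 ||
           x == 17 || x == 19 || x == 23 || x == 29 || x == 31;
}

// Bitmask version: bit n of the mask is set when n is an accepted value.
bool is_small_prime_mask(std::uint32_t x) {
    const std::uint32_t kMask = 0xA08A28ACu;
    // The range check comes first: shifting a 32-bit value by 32 or more is undefined.
    return x < 32 && ((std::uint32_t(1) << x) & kMask) != 0;
}

// Returns true if both versions agree for every x in [0, limit).
bool versions_agree(std::uint32_t limit) {
    for (std::uint32_t x = 0; x < limit; ++x) {
        if (is_small_prime_chain(x) != is_small_prime_mask(x)) return false;
    }
    return true;
}
```

Whether the mask version is actually faster depends on the compiler and target, so it is worth timing both in the real code before committing to it.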
My homepage
Advice: Take only as directed - If symptoms persist, please see your debugger
Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"
Quote: For example do you know about the crazy trick for speeding up an if-statement such as this (assume x is unsigned):
That old cookie? Can't you do better than that?
Just kidding (sorry). That is pretty advanced.
I'm kind of at the stage where I can start to do tricks like using pure malloced arrays instead of an array of structs, or using 1D contiguous arrays instead of 2D pointers to pointers kinda thing. And maybe using a relatively small sin/cos table (with linear interpolation for extra accuracy). A lot of tricks may however be a waste of time once the inevitable parallel processor paradigm comes into force within 5-10 years.
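As a rough illustration of the sin table idea mentioned above, here is a minimal sketch of a lookup table with linear interpolation between neighbouring entries. The table size (256) and the names table_sin and sine_table are assumptions picked for the example, not anything from the original code.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

const std::size_t kTableSize = 256;          // assumed size; larger tables trade memory for accuracy
const double kTwoPi = 6.283185307179586;

// Built once on first use: kTableSize samples over one period, plus one
// extra entry so that interpolation never reads past the end.
const std::array<double, kTableSize + 1>& sine_table() {
    static const std::array<double, kTableSize + 1> table = [] {
        std::array<double, kTableSize + 1> t{};
        for (std::size_t i = 0; i <= kTableSize; ++i) {
            t[i] = std::sin(kTwoPi * static_cast<double>(i) / kTableSize);
        }
        return t;
    }();
    return table;
}

double table_sin(double radians) {
    // Convert the angle to a fractional table position in [0, kTableSize).
    double pos = radians * (kTableSize / kTwoPi);
    pos -= std::floor(pos / kTableSize) * kTableSize;
    std::size_t i = static_cast<std::size_t>(pos);
    if (i >= kTableSize) i = kTableSize - 1;   // guard against rounding at the upper edge
    const double frac = pos - static_cast<double>(i);
    const std::array<double, kTableSize + 1>& t = sine_table();
    // Linear interpolation between the two nearest samples.
    return t[i] + frac * (t[i + 1] - t[i]);
}
```

A cosine can reuse the same table by adding a quarter period to the angle. Whether this beats the library sin depends heavily on the hardware and on cache behaviour, so it needs measuring.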
Last edited by twinbee; 05-21-2010 at 02:02 PM.
Just to update the situation of this problem: I have since upgraded from VS 2008 express to VS 2010 Professional and the problem still exists. Loop unswitching still isn't supported, at least judging by the simple example in my first post in this thread.
Last edited by twinbee; 04-05-2011 at 12:14 PM.
You still haven't shown that you proved it was a loop-unswitching issue by showing the generated assembly, nor have you shown that you manually implemented the loop unswitching in the C++ source code and got a performance increase.
Until that happens you can't make any statement, and certainly not a general one, about whether the compiler did or did not perform loop-unswitching, or whether or not it is even advantageous to do so.
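For readers following along, this is the kind of manual loop unswitching being asked for, as a minimal sketch. It is not the code from the first post in this thread; the functions and the negate flag are hypothetical and only show the shape of the transformation.

```cpp
#include <cstddef>
#include <vector>

// Original form: the loop-invariant flag is tested on every iteration.
long long sum_switched(const std::vector<int>& data, bool negate) {
    long long total = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (negate) total -= data[i];
        else        total += data[i];
    }
    return total;
}

// Manually unswitched form: the flag is tested once, outside the loop,
// and each branch gets its own copy of the loop.
long long sum_unswitched(const std::vector<int>& data, bool negate) {
    long long total = 0;
    if (negate) {
        for (std::size_t i = 0; i < data.size(); ++i) total -= data[i];
    } else {
        for (std::size_t i = 0; i < data.size(); ++i) total += data[i];
    }
    return total;
}
```

Timing the two forms with the same optimisation settings, and comparing the generated assembly, would show whether the compiler already does this transformation on its own.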
Not trying to troll, but ditch that MS piece of junk and use the Intel C++ compiler if you want performance.
Huh, I thought we established that VS was ignoring the loop-unswitching optimization. Simply removing the if statement altogether shows that it does indeed run faster. I haven't gotten round to learning assembly (CUDA's on my agenda at the moment), but if you must prove it, it would only take you (or someone else) a moment, as the code I supplied is very short. I'm 95% sure (maybe down to 80% since you're expressing doubts).
Quote: nor have you shown that you manually implemented the loop unswitching
Yes I did. See my earlier post here in this very thread.
I would count that as 'proof', unless there's something so deviously sneaky going on that I wouldn't have the words to express.
Quote: Not trying to troll, but ditch that MS piece of junk and use the Intel C++ compiler if you want performance.
That's all well and good, but it's a pretty simple feature, and considering that VS 2010 Pro is so expensive, you'd think they'd have gotten their act together by now. More to the point, word spreads, and hopefully that will nudge Microsoft towards actually fixing it. Something is broken or lacking, and it's probably just loop unswitching, but if it's not that, then something screwy is definitely going on.
Code:
//try
//{
    if (a) do { f( b); } while(1);
    else   do { f(!b); } while(1);
//}
Well, "Space expensive" in that it can double, quadruple, or octuple a function's size which in many cases would still be absolutely fine.
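To make the size cost concrete: every loop-invariant flag that gets hoisted doubles the number of loop copies, so two flags give four copies and three give eight. One way to get that effect by hand in C++ is to turn the flags into template parameters, sketched below. The names transform and transform_impl and the two flags are made up for the example.

```cpp
#include <cstddef>
#include <vector>

// Each <Scale, Offset> combination is a separate instantiation,
// so two flags produce four copies of the loop in the binary.
template <bool Scale, bool Offset>
void transform_impl(std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        // These conditions are compile-time constants within each
        // instantiation, so the compiler can fold the branches away.
        if (Scale)  v[i] *= 2;
        if (Offset) v[i] += 1;
    }
}

// The runtime flags are tested once, outside the loop, to pick a copy.
void transform(std::vector<int>& v, bool scale, bool offset) {
    if (scale) {
        if (offset) transform_impl<true, true>(v);
        else        transform_impl<true, false>(v);
    } else {
        if (offset) transform_impl<false, true>(v);
        else        transform_impl<false, false>(v);
    }
}
```

For a short loop body like this the extra copies are negligible, but with a large body or many flags the growth is what makes a compiler cautious about unswitching automatically.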
Yes, I tried setting the "Favor Size or Speed" setting to "Favour fast code" with no luck unfortunately.
Last edited by twinbee; 04-05-2011 at 09:01 PM.