Lack of compiler loop optimization (Loop unswitching) ?


  1. #16
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,411
    Quote Originally Posted by twinbee View Post
    ...When he said "Ugly, and a pain in the ass in the debugger", did he mean the C++ template solution as well as the C macro solution, or just the C macro solution?
    Only the macro solution.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  2. #17
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,288
    Quote Originally Posted by twinbee View Post
    Correct me if I'm wrong, but the 'chosen' loop out of the say, 32 'mirrors' would be the one that's being constantly accessed. Thus, I believe the CPU would intelligently keep these in the fastest level cache possible, by finding out that these instructions are being used the most. In other words, even if there 100 meg of program instructions, as long as only 1k of them being accessed during the program's run, then these will be prioritized massively in the CPU's instruction cache.
    No, you're quite correct in that case. If you only use one of the possible unswitched loops then it most certainly would be faster.
    The possible slowdown would be if you unswitched the loops and then executed all of them one after the other. The significantly smaller code version that wasn't loop-unswitched would be in the instruction cache almost every time, unlike the unswitched version. Anyway I'm sure you understand, and no doubt it might not be an issue for you anyway.
    I doubt it - I have spent a while Just kidding. In any case, I still have some optimizations to make myself, so I'd do those first anyway. Other than that, it's all quite messy at the moment.
    Yeah you need to be kidding. Anyone can think they are really good at optimisation, only to be blown away by someone else's orthogonal thinking.
    For example do you know about the crazy trick for speeding up an if-statement such as this (assume x is unsigned):
    Code:
    if (x == 2 || x == 3 || x == 5 || x == 7 || x == 11 || x == 13 || x == 17 || x == 19 || x == 23 || x == 29 || x == 31)
    It's this:
    Code:
    if (x < 32 && ((1u<<x) & ((1u<<2)|(1u<<3)|(1u<<5)|(1u<<7)|(1u<<11)|(1u<<13)|(1u<<17)|(1u<<19)|(1u<<23)|(1u<<29)|(1u<<31))) != 0)
    which boils down to this:
    Code:
    if (x < 32 && ((1u<<x) & 0xA08A28ACu) != 0)
    So now you do.
    If you didn't know about that trick, then case in point, what other awesome tricks might you not be aware of!
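A minimal sketch of how the bitmask trick could be checked against the plain comparison chain. The function names are illustrative and not from the post, and the sketch assumes a 32-bit mask (hence uint32_t). Unsigned constants are used so that setting bit 31 never shifts into a sign bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Plain chain of comparisons (hypothetical helper name). */
bool is_small_prime_chain(uint32_t x)
{
    return x == 2 || x == 3 || x == 5 || x == 7 || x == 11 || x == 13 ||
           x == 17 || x == 19 || x == 23 || x == 29 || x == 31;
}

/* Bitmask form: bit n of the constant is set when n is in the list.
 * The x < 32 test runs first, so the shift count is always in range. */
bool is_small_prime_mask(uint32_t x)
{
    return x < 32 && ((UINT32_C(1) << x) & UINT32_C(0xA08A28AC)) != 0;
}
```

Comparing the two functions over a range of inputs is a cheap way to confirm the constant before relying on it. Whether the mask form is actually faster depends on the compiler, which may already turn the comparison chain into something similar.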

    Anyone who truly wants well optimised code posts the actual code on here for others to do their insane magic on.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  3. #18
    Registered User
    Join Date
    Apr 2008
    Posts
    19
    For example do you know about the crazy trick for speeding up an if-statement such as this (assume x is unsigned):
    That old cookie? Can't you do better than that?

    Just kidding (sorry). That is pretty advanced.

    I'm kind of at the stage where I can start to do tricks like using pure malloced arrays instead of an array of structs, or using 1D contiguous arrays instead of 2D pointers to pointers kinda thing. And maybe using a relatively small sin/cos table (with linear interpolation for extra accuracy). A lot of tricks may however be a waste of time once the inevitable parallel processor paradigm comes into force within 5-10 years.
    Last edited by twinbee; 05-21-2010 at 02:02 PM.
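One of the tricks mentioned above, a single contiguous 1D allocation in place of a 2D pointer-to-pointer array, might look roughly like this. The Grid type and the grid_* helpers are hypothetical names chosen for the sketch; row-major layout is an assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* One block of rows * cols doubles, stored row-major. */
typedef struct {
    size_t rows, cols;
    double *data;
} Grid;

/* Returns 1 on success, 0 on overflow or allocation failure.
 * A zero-sized grid may also report failure, since calloc may return NULL. */
int grid_init(Grid *g, size_t rows, size_t cols)
{
    g->rows = rows;
    g->cols = cols;
    g->data = NULL;
    if (cols != 0 && rows > SIZE_MAX / cols)
        return 0;                       /* rows * cols would overflow */
    g->data = calloc(rows * cols, sizeof *g->data);
    return g->data != NULL;
}

/* Address of element (r, c): one multiply and add, no second pointer load. */
double *grid_at(Grid *g, size_t r, size_t c)
{
    return &g->data[r * g->cols + c];
}

void grid_free(Grid *g)
{
    free(g->data);
    g->data = NULL;
}
```

Compared with an array of row pointers, this needs one allocation instead of rows + 1, and neighbouring rows sit next to each other in memory, which tends to help the cache.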

  4. #19
    Registered User
    Join Date
    Apr 2008
    Posts
    19
    Just to update the situation of this problem: I have since upgraded from VS 2008 express to VS 2010 Professional and the problem still exists. Loop unswitching still isn't supported, at least judging by the simple example in my first post in this thread.
    Last edited by twinbee; 04-05-2011 at 12:14 PM.

  5. #20
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,288
    You still haven't shown that you proved it was a loop-unswitching issue by showing the generated assembly, nor have you shown that you manually implemented the loop unswitching in the C++ source code and got a performance increase.
    Until that happens you can't make any statement, and certainly not a general one, about whether the compiler did or did not perform loop-unswitching, or whether or not it is even advantageous to do so.
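A rough sketch of the kind of evidence being asked for: the same loop written with the invariant test inside and with the test hoisted by hand, plus a small timing helper. The kernels here are stand-ins, not the code from the first post, and all names are hypothetical. Timing uses the standard clock() function, which is coarse, so the repetition count needs to be large for the numbers to mean anything.

```c
#include <stddef.h>
#include <time.h>

typedef long (*kernel_fn)(const int *v, size_t n, int negate);

/* Invariant flag tested on every iteration. */
long kernel_switched(const int *v, size_t n, int negate)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (negate)
            total -= v[i];
        else
            total += v[i];
    }
    return total;
}

/* Manually unswitched: the test is hoisted and the loop duplicated. */
long kernel_unswitched(const int *v, size_t n, int negate)
{
    long total = 0;
    if (negate) {
        for (size_t i = 0; i < n; i++)
            total -= v[i];
    } else {
        for (size_t i = 0; i < n; i++)
            total += v[i];
    }
    return total;
}

/* Calls fn reps times and returns the CPU seconds used. The accumulated
 * result goes through a volatile so the calls are not optimized away. */
double time_kernel(kernel_fn fn, const int *v, size_t n, int negate,
                   int reps, long *result)
{
    volatile long sink = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        sink += fn(v, n, negate);
    clock_t stop = clock();
    *result = sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

If both kernels time the same, the compiler may already have unswitched the first one, or the branch may simply be cheap. The generated assembly settles which: an assembly listing can be requested with /FA in MSVC or -S in GCC.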

  6. #21
    Epy
    Fortran lover Epy's Avatar
    Join Date
    Sep 2009
    Location
    California, USA
    Posts
    943
    Not trying to troll, but ditch that MS piece of junk and use the Intel C++ compiler if you want performance.

  7. #22
    Registered User
    Join Date
    Apr 2008
    Posts
    19
    Huh, I thought we established that VS was ignoring the loop-unswitching optimization. Simply removing the if statement altogether shows that it does indeed run faster. I haven't gotten round to learning assembly (CUDA's on my agenda at the moment), but if you must prove it, it would only take you (or someone else) a moment, as the code I supplied is very short. I'm 95% sure (maybe down to 80% since you're expressing doubts).

    nor have you shown that you manually implemented the loop unswitching
    Yes I did. See my earlier post here in this very thread.

    I would count that as 'proof', unless there's something so deviously sneaky going on that I wouldn't have the words to express.

    Not trying to troll, but ditch that MS piece of junk and use the Intel C++ compiler if you want performance.
    That's all well and good, but it's a pretty simple feature, and considering that VS 2010 pro is so expensive, you'd think they'd have gotten their act together by now. More to the point, word spreads, and it'll hopefully slightly veer Microsoft to actually fixing it. Something is broken or lacking, and it's probably just loop unswitching, but if it's not that, then something screwy is definitely going on.

  8. #23
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,230
    Quote Originally Posted by twinbee View Post
    That's all well and good, but it's a pretty simple feature, and considering that VS 2010 pro is so expensive, you'd think they'd have gotten their act together by now. More to the point, word spreads, and it'll hopefully slightly veer Microsoft to actually fixing it. Something is broken or lacking, and it's probably just loop unswitching, but if it's not that, then something screwy is definitely going on.
    Loop unswitching is a very space-expensive optimization. Have you tried fiddling with the "Favor Size or Speed" setting in the C++ optimization settings? Make sure it's set to "Speed", not "Size".
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  9. #24
    Registered User
    Join Date
    Apr 2008
    Posts
    19
    Well, "Space expensive" in that it can double, quadruple, or octuple a function's size which in many cases would still be absolutely fine.

    Yes, I tried setting the "Favor Size or Speed" setting to "Favour fast code" with no luck unfortunately.
    Last edited by twinbee; 04-05-2011 at 09:01 PM.
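To illustrate the size growth discussed above: with k independent loop-invariant conditions, full unswitching yields up to 2^k copies of the loop body. A sketch with two flags and therefore four copies follows; the function and its flags are made up for the example.

```c
#include <stddef.h>

/* Reference form: two invariant flags tested on every iteration. */
void transform(int *v, size_t n, int negate, int dbl)
{
    for (size_t i = 0; i < n; i++) {
        int x = v[i];
        if (negate) x = -x;
        if (dbl)    x *= 2;
        v[i] = x + 1;
    }
}

/* Fully unswitched form: 2 flags give 2^2 = 4 specialised loops. */
void transform_unswitched(int *v, size_t n, int negate, int dbl)
{
    if (negate && dbl) {
        for (size_t i = 0; i < n; i++) v[i] = -v[i] * 2 + 1;
    } else if (negate) {
        for (size_t i = 0; i < n; i++) v[i] = -v[i] + 1;
    } else if (dbl) {
        for (size_t i = 0; i < n; i++) v[i] = v[i] * 2 + 1;
    } else {
        for (size_t i = 0; i < n; i++) v[i] = v[i] + 1;
    }
}
```

Each extra flag doubles the number of loops again, which is why a compiler tuned for size may decline to do this even when each individual copy would run faster.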
