Thread: Various optimization questions

  1. #1
    </3 Segfaults
    Join Date
    Jul 2007
    Posts
    27

    Various optimization questions

    First of all, I'm using GCC. So all the questions I'm asking are GCC specific. Please don't complain that you can't ++ with bools, because GCC has that extension.

    1. How many bits (read, sizeof() is useless here) does a bool take up?

    2. Will x'oring a number with itself or setting it to zero be faster?

    3. With bool types, will "++" or "= 1" work faster? Or, is there an even quicker way to set a bool to TRUE?

    4. Will

    Code:
    if (imn == 4)
    imn = 2;
    else imn = 4;
    or

    Code:
    imn -= 2;
    imn = !imn;
    imn = 2 * imn + 2;
    run faster? Thanks.

  2. #2
    Kiss the monkey. CodeMonkey's Avatar
    Join Date
    Sep 2001
    Posts
    937
    1. sizeof(bool) * std::numeric_limits<char>::digits

    2. Depends.

    3. As you implied, "=1" is portable. "=true" is better. Might as well use it. If you're that worried about speed, test it.

    4. That second snippet looks scary. I'd go with
    Code:
    imn = (imn == 4) ? 2 : 4;
    Last edited by CodeMonkey; 07-29-2007 at 10:10 PM.
    "If you tell the truth, you don't have to remember anything"
    -Mark Twain

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    1) The number of bits in a bool is sizeof(bool)*CHAR_BIT (from <climits>, or sizeof(bool)*std::numeric_limits<unsigned char>::digits (from <limits>). Note that the only guarantee on sizeof(bool) is that 1 <= sizeof(bool) <= sizeof(long).
    2) I would be shocked if xor-ing a number by itself was faster than just setting it to zero. It's certainly far more obscure.
    3) See 2).
    4) I'd be shocked, again, if the second version was faster than the far simpler and more straightforward first version.

    Edit: In support of 2), note that setting a variable to zero is common enough that the compiler should know the fastest way of doing it, if you just ask it to. And in support of 3), your second way of doing it involves subtraction, addition, and multiplication, and the first way involves simple comparisons and assignments. Although modern CPUs tend to be able to do some complex operations just as fast as simple ones, it's very unlikely that they would be faster.
    Last edited by robatino; 07-29-2007 at 10:21 PM.

  4. #4
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    > 1. sizeof(bool) * std::numeric_limits<char>::digits
    Signed char has one less digit (value bit) than unsigned char, and it's compiler-dependent whether char means signed char or unsigned char, so it's necessary to explicitly specify unsigned char.

    Edit: To qualify what I just said, I know that it's guaranteed that all the bits of an unsigned char are value bits (digits), but I think it's possible that a signed char could have padding bits, so the number of digits in a signed char could be less than the number of digits in an unsigned char, minus 1. Can someone confirm or deny this?
    Last edited by robatino; 07-29-2007 at 10:28 PM.

  5. #5
    Kiss the monkey. CodeMonkey's Avatar
    Join Date
    Sep 2001
    Posts
    937
    Good point. Whoops.
    "If you tell the truth, you don't have to remember anything"
    -Mark Twain

  6. #6
    </3 Segfaults
    Join Date
    Jul 2007
    Posts
    27
    Thanks for the input.

    1. Turns out it's 8 bits. Is there a way to make it just 1?

    2. I set it to "= FALSE" now.

    3. Now is "= TRUE".

    4. Now is "imn = 2 + (imn == 2) * 2", since I'm not quite sure what the ? and : operators do in CodeMonkey's snippet.

  7. #7
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    The first thing you need to do is to learn that C++ is a compiled language, and not some oversized macro assembler that you need to micro manage.

    1. See answers

    2. The xor trick from asm only ever saved 1 byte of instruction space, and was never any quicker.
    Writing x = 0 is obvious to everyone (including the compiler).
    Writing x ^= x is undefined behaviour if x hasn't been initialised yet, and its type is anything other than unsigned char.

    3. ++ and +1 yield the same code for all practical purposes. If you're trying to go from false to true, then assignment works. If you're trying to implement a boolean toggle, then perhaps x = !x; would be better.

    4. Depends on your chosen architecture. I'd say case 1.
    But if your machine has excellent multiply and useless branch prediction then perhaps case 2.

    Very often, mistaken "attempts" to optimise things generally result in obfuscated code which confuse everyone and every thing. Clear simple code which does exactly what it looks like it should will give the compiler a much better chance at optimisation. And it will contain all the optimisation tricks that all the programmers who worked on the compiler know about, not just the ones you can dream up.

    Before you decide what to hack about, study what
    g++ -S prog.cpp
    generates for you.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  8. #8
    Registered User
    Join Date
    Sep 2006
    Posts
    835
    > 2. I set it to "= FALSE" now.

    > 3. Now is "= TRUE".

    The two possible values of a bool are true and false (lower case). You don't need to define a macro to use those, in case that's what you did.

  9. #9
    </3 Segfaults
    Join Date
    Jul 2007
    Posts
    27
    Quote Originally Posted by Salem View Post
    The first thing you need to do is to learn that C++ is a compiled language, and not some oversized macro assembler that you need to micro manage.
    It's a competition to see who can generate the hugest list of primes in a given time period. It's gotten so bad that people are trying to optimize their programs just for the 20 second interval.

    Before you decide what to hack about, study what
    g++ -S prog.cpp
    generates for you.
    Thanks for the asm trick. =)

  10. #10
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > It's a competition to see who can generate the hugest list of primes in a given time period
    Then hit the web and try and find the best algorithm.

    Even the most ineptly implemented quicksort will eventually beat the most expertly crafted bubble sort for some given large data set.

    Or post your code, and let a wider audience pass judgement on what you have so far.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  11. #11
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    The two biggest factors in calculating prime-numbers are:
    1. The time it takes to "test" if a particular number is a prime - unless you use "sieve of erastotenes(sp?)". [1]
    2. When your number of primes get large (some hundred thousands or so) you're looking at big cache-miss ratio. Sieve has a similar problem, as you need to walk through the bitmap many, many times, so if you get to larger numbers, you get very few writes per cache-load [thre may be a bit of optimization potential for using Non-temporal moves here - those don't affect the cache, and are probably more effective for this type of operation].

    As others have said, until you have your basic algorithm sorted out (as suggested, try finding algorithms on the web), you probably shouldn't worry about what the compiler does with various things - it will do whatever it thinks is best, and that's probably not going to change with "clever tricks", unless you KNOW what the compiler is going to do in the first place - and that takes experience in analyzing the compiler output from a particular compiler (and of course changes from one version to the next).

    I think your biggest problem (once you have some 100K or so of primes) is going to be cache misses, which tends to slow things down by a factor of 10 or so. If you can find a clever way to avoid those, then that's a benefit.

    [1] This method uses a bitmap/array of boolean to find primes by setting (or clearing) bits for numbers that are multiples of primes. It is mainly limited by the amount of memory your computer has - as once you go over that limit, the machine starts swapping, and as it's doing that, it takes 10000x (or so, give or take a few 10x) longer for each 4K segment of memory you need to swap in from disk. So the best you can get here is about 8*amount_of_memory_in_machine as the largest prime-number you can find. And of course, you need to leave some memory for your operating system and such.

    --
    Mats

  12. #12
    The larch
    Join Date
    May 2006
    Posts
    3,573
    unless you use "sieve of erastotenes(sp?)
    Atkins sieve (?) is said to be more efficient, but that is also mathematically more complex.

    A problem with a sieve could be that it is hard to have a time limit and then have the maximum list to show ready because you'd be aiming for a particular size from the beginning. Or may-be you can find a solution to that problem?
    I might be wrong.

    Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
    Quoted more than 1000 times (I hope).

  13. #13
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by anon View Post
    Atkins sieve (?) is said to be more efficient, but that is also mathematically more complex.

    A problem with a sieve could be that it is hard to have a time limit and then have the maximum list to show ready because you'd be aiming for a particular size from the beginning. Or may-be you can find a solution to that problem?
    No, I don't know a solution to that problem, other than to "guestimate the time needed and do the size necessary to complete in time" - it can be hard to do if you don't know how long you're supposed to run for before the actual competition, of course.

    --
    Mats

  14. #14
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    A hybrid approach is possible: use a sieve to its maximum, then use trial division with the primes already "generated".
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  15. #15
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by laserlight View Post
    A hybrid approach is possible: use a sieve to its maximum, then use trial division with the primes already "generated".
    Good point, a small sieve to find the few thousand first primes and then division next. Since most non-primes are either divisible by 3 (33%), 5 (20%), or 7(14%) - that removes not quite 60% (because some are divisible by both 3 and 5 or 7 or all three, so you can't just add the percentages up) of all candidates in 3 divides (assuming candidates are all odd numbers, of course). Unfortunately, it goes downhill from there, as the next prime is 11, so less than 10% of the numbers go away here, and worse yet for 13, 17, and so on.

    The biggest problem is still that once the prime-array is large, all the numbers that have "big" factors, will take quite some beating to the caches in the processor.

    --
    Mats

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. questions....so many questions about random numbers....
    By face_master in forum C++ Programming
    Replies: 2
    Last Post: 07-30-2009, 08:47 AM
  2. Turn Off Optimization?
    By danlee58 in forum C Programming
    Replies: 6
    Last Post: 12-10-2008, 03:52 AM
  3. A very long list of questions... maybe to long...
    By Ravens'sWrath in forum C Programming
    Replies: 16
    Last Post: 05-16-2007, 05:36 AM
  4. Trivial questions - what to do?
    By Aerie in forum A Brief History of Cprogramming.com
    Replies: 23
    Last Post: 12-26-2004, 09:44 AM
  5. questions questions questions.....
    By mfc2themax in forum A Brief History of Cprogramming.com
    Replies: 1
    Last Post: 08-14-2001, 07:22 AM