Thread: Different ways to add

  1. #1
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218

    Different ways to add

    I made a function to add two numbers together using bit manipulation. I started off writing the function add_a, then made a modified version add_b.

    add_a seems to be 10-15% faster than add_b. The thing is is that in add_b I removed a variable, and a reset and a copy from in the loop. So why does it appear slower?
    Code:
    #include <stdio.h>
    #include <limits.h>
    #include <stdlib.h>
    #include <time.h>
    #define Uint8 unsigned char
    
    Uint8 add_a(Uint8 a, Uint8 b)
    {
        static Uint8 i, num = 0;
        static Uint8 result = 0;
        static Uint8 carry = 0;
        static Uint8 bit = 1;
    
        for(i=0; i<CHAR_BIT; i++)
        {        
            num = 0;
            if(a & bit) num++;
            if(b & bit) num++;
            if(carry)   num++;    
            if(num & 1) result |= bit;
            carry = (num & 2);
            bit <<= 1;
        }  
    
        return result;  
    }
    
    Uint8 add_b(Uint8 a, Uint8 b)
    {
        static Uint8 i, num = 0;
        static Uint8 result = 0;
        static Uint8 bit = 1;
    
        for(i=0; i<CHAR_BIT; i++)
        {
            if(a & bit)num++;
            if(b & bit)num++;
            if(num & 1) result |= bit;
            num = (num & 2) ? 1: 0;
            bit <<= 1;
        }  
        return result;  
    }
    
    Uint8 add_c(Uint8 a, Uint8 b)
    {
        static Uint8 i, num = 0;
        static Uint8 result = 0;
        static Uint8 bit = 1;
    
        for(i=0; i<CHAR_BIT; i++)
        {
            if(a & bit)
                if(b & bit)
                    num += 2;
                else
                    num += 1;
            else
                if(b & bit)
                    num += 1;
                    
            if(num & 1) result |= bit;
            num = (num & 2) ? 1: 0;
            bit <<= 1;
        }  
        return result;  
    }
    
    void test(int x, int y)
    {
        unsigned start = clock(), i;
        for(i=0; i<50000000; i++) add_a(x, y);
        printf("A: <Time %u> <Result %u>\n", clock()-start, add_a(x, y));
        start = clock();
        for(i=0; i<50000000; i++) add_b(x, y);
        printf("B: <Time %u> <Result %u>\n", clock()-start, add_b(x, y));
        start = clock();
        for(i=0; i<50000000; i++) add_c(x, y);
        printf("C: <Time %u> <Result %u>\n", clock()-start, add_c(x, y));
    }
    
    int main()
    {
        Uint8 a = 102;
        Uint8 b = 57;
        test(a, b);
        getchar();
    }

  2. #2
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    probably branch prediction on if(carry) was better when on (num & 2) ? 1: 0;

    to make more exsact decision you need to use tool like VTune of Intel
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  3. #3
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Try removing those "static" and see if that makes any difference. It should help the compiler optimize those loops a bit further because there is no need to STORE the old value. [If you want carry to carry through to the next level, then you need that static].

    And try this
    Code:
             result |= (num >> i) & 1;
    This may be better too:
    Code:
    // instead of:  num = (num & 2) ? 1: 0;
            num = (num >> 1) & 1;

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  4. #4
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Just out of curiosity, what's wrong with plain old: a + b?

  5. #5
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
    vart: Yeah I guess my testing method dosent say a whole lot. I had a look at VTune. I got an AMD tho.

    matsp: Removing the static declaration dosent seem to make much difference. In fact I only added the 'static' bit at the end as it made the function much faster when repeatedly called. I don't know if thats a good thing tho. I don't want to store the value; it just saves time allocating space for the variables (or thats the impression I get). Also, shifting then &ing dident seem to make much difference either, but thanks for the suggestion I hadent thought of that.

    cpjust: I was planning on using it (or something like it) for adding big numbers which would be made out of a string of bytes.

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    On my AMD processor, the B version was the fastest by about 10% - but it probably depends on which compiler and which exact model of processor you have. This was using Visual Studio .Net.

    I noticed when I was trying it out, that the static seems to help performance. I don't really understand why tho'. But I didn't spend much time studying the generated assembly code, just trying some random changes out. [I managed to get it to run in "zero" time, but that was just a case of the optimizer completely removing the function].

    I would just go with one of your versions for now, if you are planning to make it a bigger thing - you may find that the best optimization is to do something "further up the chain" to make bigger benefits, rather than micro-optimizing this function "to death".

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  7. #7
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
    Yeah, I guess its hardware/compiler specific. I'm using GCC. Also I had a go at using declaring variables as 'register' instead of 'static'. (Something I found in the bowels of the GNU C language extensions). Bizzarely it makes it a whole lot slower than normal variables even when I am only requesting one register.

    I think you are probably right in that theres little point worrying about small speed differences here. I think i may go with version a as returning the carry would be handy, although if its static its not like it really matters too much anyway.

    Cheers.

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Also be aware that using static may prevent the function from inlining, so if that's a goal later on, using the "slower" version with non-static variables may turn out to be faster in the long term.

    Using register is likely to "confuse" the compilers optimizer. It was a good thing in the past when the compilers weren't very good at figuring out which register can do what. Nowadays, the optimizer is most likely better than us to figure out which register to use for what and when [also, you assigning a variable register may actually prevent ANOTHER variable or temporary result that is more frequently used from using a register, which of course is not what you wanted].

    By the way, which verison of gcc is this?

    For an alternative to VTune, try AMD's CodeAnalyst, or oprofile if you are on Linux. It's a while since I last used it, but CA is able to tell you exactly how long each instruction is taking.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #9
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
    Yeah, it seems better to forget about the register thing. Tbh I'm not sure exactly which version gcc i'm using. I think its an old one, like 3.4.2 or something. IE: the version that came with Dev C++. Never bothered updating it.

    Just tried downloading CodeAnalyst from AMD's site, but it gave me some XML parsing error thing. Still might see if I can get hold of it elsewhere. Might be fun to have a play around with.

  10. #10
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I can't download it either - I'm reporting to the support page on the CA page.

    gcc 3.4.2 may not have the optimization to replace branches with conditional execution code, which could slow things down a bit in this type of code.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  11. #11
    Dr Dipshi++ mike_g's Avatar
    Join Date
    Oct 2006
    Location
    On me hyperplane
    Posts
    1,218
    Hmm. Maybe I should update it. I have gcc 4.2.2 on my linux computer, I'll test it on that later see if it makes a difference.

    Cheers.

  12. #12
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I really wouldn't worry about it right now - I'm not sure there is a mingw version that works well from gcc 4.x anyways. But if it's easy to check, why not...

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. can not add member error
    By WaterNut in forum Windows Programming
    Replies: 4
    Last Post: 12-02-2004, 01:18 PM
  2. Binary Search Trees Part III
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 16
    Last Post: 10-02-2004, 03:00 PM
  3. VC++ Resources - Add Existing Item
    By EliMcGowan in forum Windows Programming
    Replies: 1
    Last Post: 08-30-2004, 02:51 PM
  4. Add and delete functions
    By Ana Val sazi in forum C++ Programming
    Replies: 5
    Last Post: 06-18-2002, 09:59 PM