Thread: Optimizing hot section of code.

  1. #16
    awsdert (Registered User)
    Join Date: Jan 2015
    Posts: 615
    Maybe use a lea instruction? I'm still working on my analytical code before diving into converting it to assembly, so I'm not sure. Also, I noticed you used a jz instead of a jmp; why was that?

  2. #17
    Registered User
    Join Date: May 2016
    Posts: 104
    I managed to "optimize" it even further into the stripped-down version below.
    The problem is that performance did not change at all, so I can only assume this is
    what the compiler was already generating. flp1969 is right: compilers have gotten
    really, really good at optimizing code. Really.

    The only way I can see to make this faster is to eliminate the right-shift operation,
    but then I would have to make the array humongous to accommodate the 64-bit indices.
    I assume that alone would hurt the performance of other sections of the program
    that work just fine now because they fit nicely in the cache.
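    To put rough numbers on that trade-off, here is a minimal sketch. It assumes the
    eight 6-bit groups sit at 6-bit strides from bit 63 downward and that each table
    entry is one byte; the helper names are made up for illustration and are not part
    of my actual code.

```c
#include <stdint.h>

/* Assumption: eight tables of 64 one-byte entries, as implied by the
   6-bit index. The real element type of g_sboxes may differ. */
enum { SHIFTED_TABLE_BYTES = 8 * 64 };

/* Hypothetical helper: highest index reached if the 6-bit field is only
   masked in place (block & (0x3f << shift)) instead of being shifted
   down to bits 0..5 first. */
static uint64_t max_unshifted_index(unsigned shift)
{
    return (uint64_t)0x3f << shift;
}

/* Number of table entries needed to cover that index range. */
static uint64_t unshifted_table_entries(unsigned shift)
{
    return max_unshifted_index(shift) + 1;
}
```

    With the shift kept, all eight tables together take 512 bytes. Without it, even the
    lowest group (shift 16) would need a table of roughly 4 million entries, and the
    highest group (shift 58) would need indices close to 2^64, which no real array can cover.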

    TBH, I didn't bother to test it. As it is now, I'm quite close to OpenSSL's DES
    performance, and my base64 encoding is over 100% faster. Considering that my code,
    although by no means perfect, is actually readable, and what an unholy mess the
    OpenSSL code base is, I'd say I'm pretty happy with the result.
    Code:
    /* Run eight 6-bit groups of block (packed from bit 63 downward) through
       their S-boxes and pack the eight 4-bit results into 32 bits. */
    static inline uint32_t    compress(const uint64_t block)
    {
        register uint32_t    compressed;
    
        compressed = 0;
        compressed |= (uint32_t)g_sboxes[0][block >> 58] << 28;
        compressed |= (uint32_t)g_sboxes[1][(block >> 52) & 0x3f] << 24;
        compressed |= (uint32_t)g_sboxes[2][(block >> 46) & 0x3f] << 20;
        compressed |= (uint32_t)g_sboxes[3][(block >> 40) & 0x3f] << 16;
        compressed |= (uint32_t)g_sboxes[4][(block >> 34) & 0x3f] << 12;
        compressed |= (uint32_t)g_sboxes[5][(block >> 28) & 0x3f] << 8;
        compressed |= (uint32_t)g_sboxes[6][(block >> 22) & 0x3f] << 4;
        compressed |= (uint32_t)g_sboxes[7][(block >> 16) & 0x3f];
        return (compressed);
    }
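    For anyone who wants to check the bit packing by hand, the same lookup can be
    sketched as a loop. This is only an illustration: demo_sboxes is a placeholder
    table (not the real DES S-boxes), and the loop assumes the groups sit at 6-bit
    strides starting from bit 58. Both names below are hypothetical.

```c
#include <stdint.h>

/* Placeholder tables, NOT the real DES S-boxes: entry v maps to v & 0xf
   so the position of each output nibble is easy to verify. */
static uint8_t demo_sboxes[8][64];

static void init_demo_sboxes(void)
{
    for (int box = 0; box < 8; box++)
        for (int v = 0; v < 64; v++)
            demo_sboxes[box][v] = (uint8_t)(v & 0xf);
}

/* Loop form of the lookup: group i is assumed to start at bit 58 - 6*i,
   and its 4-bit result is placed at bit 28 - 4*i of the output. */
static uint32_t compress_loop(uint64_t block)
{
    uint32_t compressed = 0;

    for (int i = 0; i < 8; i++) {
        unsigned in_shift = 58u - 6u * (unsigned)i;
        unsigned out_shift = 28u - 4u * (unsigned)i;

        compressed |= (uint32_t)demo_sboxes[i][(block >> in_shift) & 0x3f]
                      << out_shift;
    }
    return compressed;
}
```

    After calling init_demo_sboxes(), feeding a block with only one group set shows
    where its nibble lands in the result. A compiler will typically unroll a fixed
    eight-iteration loop like this, so I would not expect it to perform differently
    from the hand-unrolled version, though I have not measured it.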
    Next I have to implement the RSA section, which entails recreating the genrsa,
    rsa and rsautl OpenSSL commands with all their options, as well as creating my own
    random number generator. I fear I'm not quite up to the task.
    printf("I'm a %s.\n", strrev("Dren"));
