Thread: Can I decrease run time / initialization time with threads?

  1. #1
    Registered User
    Join Date
    May 2015
    Posts
    65

    Can I decrease run time / initialization time with threads?

    I wrote this small piece of code yesterday for the fun of it, after which I set my self a challenge, mainly to do everything in my power to bring down the initialization time, what was initially 40 seconds.

    This I have done, and it's now around 18 seconds, but I think in theory it can be done even faster using threads, though I'm somewhat doubtful when it comes to reality.

    Anyway, first the code:

    Code:
    #include <stdint.h>
    
    // For convenience, the width of the desired type.
    #define WORD_BITS ( 8 * sizeof( uint_fast32_t ) )
    #define ARRAY_SIZE ( 0x10000000 / sizeof( uint_fast32_t ) )
    #define ARRAY_SIZE_BITS ( 0x80000000 )
    
    // allocate 2^31 bits of memory, to represent all uneven numbers 
    // from 0 -> 2^32-1
    static uint_fast32_t b[ ARRAY_SIZE ];
    
    // should be selfexplanatory. Flip bits, and and assign.
    static inline void resetBit( uint_fast32_t idx )
    {
        b[ idx / WORD_BITS ] &= ~( ( uint_fast32_t ) 1 << ( idx % WORD_BITS ) );
    }
    
    // This is a bit more complicated, as I don't really need to return 0 or 1,
    // but will be perfectly happy with 0 and !0. Thus there are two ways:
    // 1&(x>>n) or x&(1<<n) and I would very much like to know which is more
    // efficient.
    // edit: after testing seems there's no difference.
    static inline uint_fast32_t getBit( uint_fast32_t idx )
    {
        return ( ( uint_fast32_t ) 1 & ( b[ idx / WORD_BITS ] >> ( idx % WORD_BITS ) ) );
    }
    
    void initialize()
    {
        // set all bits to 1.
        for( uint_fast32_t n = 0; n < ARRAY_SIZE; n++ )
            b[ n ] = UINT_FAST32_MAX;
        uint_fast32_t p;
        // loop through all numbers from 0 -> 2^16.
        for( uint_fast32_t n = 0; n < 0x8000; n++ )
        {
            // If prime...
            if( getBit( n ) )
            {
                // calculate the actual value.
                // Also, would the compiler automatically optimize, if p
                // were to be eliminated in the for loop?
                p = 2 * n + 3;
                 // loop through the bitset and reset required bits.
                for( uint_fast32_t i = p+n; i < ARRAY_SIZE_BITS; i += p )
                    resetBit( i );
            }
        }
    }
    So, we basically have a rather large array of numbers, which is iterated through a large amount of times, and for each element we manipulate some data. The higher our prime or 'n', the quicker the iterations get.
    I suspect there wouldn't be too much to be gained, but if possible, and depending how the OS is handing the workload, having 2 or more threads working in parallel, should still make a difference.
    The only problem I see really, is working on the same data at the same time, which I see ways around. Also, you have to make sure the threads don't end up iterating using non primes, but that shouldn't be hard.

    However, as I've never worked with threads before and it is a complicated subject, I thought it best to ask before I start pulling my hair out, due to an impossible task I've set my self.

    There is also a bonus question or two, which relates to the LLP64, that is windows 64 bit architecture and types.
    On my system I can count on a long being 64 bit, and in fact int_fast32_t is synonymous with long, or long int if you wish:

    From stdint.h
    Code:
    typedef unsigned long int    uint_fast32_t;
    typedef unsigned int        uint_fast32_t;
    but as I understand it, things works a bit different on windows, though I've been unable to find definitive. So basically, if someone could do a sizeof(int_fast32_t) and sizeof(long) on a win64 system for me, I should get my answer.
    I don't mind using stdint.h, but it seems rather silly if long does the same job.

    Also, the cast I have to do int getBit() and ResetBit() is not exactly pretty. I could simply do 1LL, or 1ULL I suppose, but how would that work out on non 64 bit architectures? Would it simply be ignored, in cases where uint_fast32_t is in fact 32_bit?

    Other than that, ideas are welcome and I bid you farewell for now.

  2. #2
    Registered User
    Join Date
    May 2015
    Posts
    65
    To those who might be interested, here's the primarity checker function in it's current form. Seems to work.

    Code:
    uint_fast32_t isPrime(unsigned long long n)
    {
        unsigned long long p = 3;
        if( !( n % 2ull ) ) return 0;
        for( uint_fast32_t i = 0; p * p < n; p = 2 * ++i + 3 )
            if( getBit( i ) && !( n % p ) ) return 0;
        return 1;
    }

  3. #3
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    what about case when p*p == n? is it handled somehow?
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  4. #4
    Registered User
    Join Date
    May 2015
    Posts
    65
    Quote Originally Posted by vart View Post
    what about case when p*p == n? is it handled somehow?
    One of several mistakes already corrected.

    Of course p*p <= n

    I've also inverted the initialization process, thus setting bits instead of resetting them, thus saving, I thing, one instruction per call. at the end of the initialization, obviously the array will be flipped, but that's hardly demanding.

    Code:
    #include <stdint.h>
    
    // For convinience, the width of the desired type.
    #define WORD_BITS ( 8 * sizeof( uint_fast32_t ) )
    #define ARRAY_SIZE ( 0x10000000 / sizeof( uint_fast32_t ) )
    
    
    // allocate 2^31 bits of memory, to represent all uneven numbers from 0 -> 2^32-1
    static uint_fast32_t b[ ARRAY_SIZE ] = { 1 };
    
    
    // should be selfexplanatory. shift left and or to assign.
    static inline void setBit( uint_fast32_t idx )
    {
        b[ idx / WORD_BITS ] |= ( uint_fast32_t ) 1 << ( idx % WORD_BITS );
    }
    // Retrieve bit by right ........f and.
    static inline uint_fast32_t getBit( uint_fast32_t idx )
    {
        return ( ( uint_fast32_t ) 1 & ( b[ idx / WORD_BITS ] >> ( idx % WORD_BITS ) ) );
    }
    
    
    void initialize()
    {
        uint_fast32_t p;
        // loop through all numbers from 0 -> 2^16.
        for( uint_fast32_t n = 1; n < 0x8000; n++ )
        {
            // If not prime...
            if( getBit( n ) ) continue;
    
    
            // calculate the actual value
            p = 2 * n + 1;
            // loop through the bitset and reset required bits.
            for( uint_fast32_t i = p + n; i < 0x80000000; i += p )
                setBit( i );
        }
        // flip all bits
        for( uint_fast32_t n = 0; n < ARRAY_SIZE; n++ )
            b[ n ] ^= UINT_FAST32_MAX;
    }
    
    
    uint_fast32_t isPrime(unsigned long long n)
    {
        if( n == 2 ) return 1;
        else if( !( n % 2ull ) || n < 2 ) return 0;
        unsigned long long p = 3;
        for( uint_fast32_t i = 1; p * p <= n; p = 2 * ++i + 1 )
            if( getBit( i ) && !( n % p ) ) return 0;
        return 1;
    }
    Of course much of what I've done amkes no difference, as the compiler is rather good at optimizing, but I am down to 16-17 sec

  5. #5
    Registered User
    Join Date
    May 2015
    Posts
    65
    sqrt(n) is need for large 64 bit primes:

    Code:
    uint_fast32_t isPrime(unsigned long long n){
        if( n == 2 ) return 1;
        else if( !( n % 2ull ) || n < 2 ) return 0;
        uint_fast32_t sqp = sqrt( n );
        unsigned long long p = 3;
        for( uint_fast32_t i = 1; p <= sqp; p = 2 * ++i + 1 )
            if( getBit( i ) && !( n % p ) ) return 0;
        return 1;
    }
    other than that, everything seems to work.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 5
    Last Post: 04-17-2013, 11:32 PM
  2. Replies: 2
    Last Post: 04-17-2013, 12:25 AM
  3. help with time.h and threads
    By thewhisper in forum C Programming
    Replies: 8
    Last Post: 02-24-2013, 04:02 PM
  4. run-time var initialization
    By Kempelen in forum C Programming
    Replies: 4
    Last Post: 06-23-2010, 04:40 AM
  5. How to decrease a time string by a given value
    By Ruchikar in forum C Programming
    Replies: 3
    Last Post: 04-11-2002, 12:08 PM