Prime Sieve Optimization

This is a discussion on Prime Sieve Optimization within the C++ Programming forums, part of the General Programming Boards category; I'm working on some old acm programming problems and one of them requires you to find all the primes upto ...

  1. #1
    Registered User
    Join Date
    May 2008
    Posts
    15

    Prime Sieve Optimization

    I'm working on some old acm programming problems and one of them requires you to find all the primes upto 2^31 - 1. I've coded a pretty literal interpretation of the sieve of atkin and it takes about 41 seconds on my computer, however the time limit the acm site gives when you submit your code is 10 seconds so I must be doing something wrong. I was wondering if some of you could look at it and offer some tips on how to speed it up.

    Code:
    #include <iostream>
    #include <vector>
    #include <cmath>
    #include <limits>
    #include <sys/time.h>
    using namespace std;
    
    void primes(int limit)
    {
        vector<bool> bits(limit + 1, false);
        int sqrt_limit = sqrt(static_cast<double>(limit));
    
        for(int x = 1; x <= sqrt_limit; ++x)
        {
            for(int y = 1; y <= sqrt_limit; ++y)
            {
                int x_squared = x * x, y_squared = y * y;
                int r = 4 * x_squared + y_squared;
    
                if(r <= limit && r % 12 == 1 && r % 12 == 5)
                    bits[r].flip();
    
                int x_squared_times_3 = x_squared * 3;
                int s = x_squared_times_3 + y_squared;
    
                if(s <= limit && (s % 12) == 7)
                    bits[s].flip();
    
                int t = x_squared_times_3 - y_squared;
                if(x > y && t <= limit && (t % 12) == 11)
                    bits[t].flip();
    
            }
        }
    
        for(int n = 5; n <= sqrt_limit; ++n)
        {
            if(bits[n])
            {
                int n_squared = n * n;
                int m_max = limit / n_squared;
                for(int m = 1; m <= m_max; ++m)
                    bits[n_squared * m] = false;
            }
        }
    
        bits[2] = true;
        bits[3] = true;
    }
    
    int main()
    {
        struct timeval tv1, tv2;
        gettimeofday(&tv1, NULL);
    
        primes(numeric_limits<int>::max());
    
        gettimeofday(&tv2, NULL);
        long seconds  = tv2.tv_sec  - tv1.tv_sec;
        long useconds = tv2.tv_usec - tv1.tv_usec;
    
        cout << "Elapsed time: " << seconds << " seconds + " <<  useconds << " microseconds" << endl;
    
        return 0;
    }
    Thanks

  2. #2
    Amazingly beautiful user.
    Join Date
    Jul 2005
    Location
    If you knew I'd have to kill you
    Posts
    254
    Your code is a near replica of the wikipedia pseudocode. Note the following from wikipedia:
    This pseudocode is written for clarity. Repeated and wasteful calculations mean that it would run slower than the sieve of Eratosthenes. To improve its efficiency, faster methods must be used to find solutions to the three quadratics. At the least, separate loops could have tighter limits than [1, √limit].
    Try writing your own version of the Seive of Eratosthenes, as it is simpler to implement. A well written implementation of it will run circles around the above "clarity targeted" code, while the above code requires some major changes to effectively implement the more complex Sieve of Atkin speed-wise.
    Programming Your Mom. http://www.dandongs.com/

  3. #3
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Katy, Texas
    Posts
    2,309
    x_squared and x_squared_times_3 do not change inside the inner y loop, so there's no sense in recalculating them again and again and again in the y loop. Do them once in the outer x loop.

    Start with the above and see what difference that makes. I suspect it will make a some difference, but probably not enough.

    I wouldn't use a vector for keeping track of 2^31 bits. Just get a hunk of heap and index into it.

    But, as with any performance problem, running a profiler is always a better tactic than eyeballs or guessing.

    Todd
    Mac and Windows cross platform programmer. Ruby lover.

    Quote of the Day
    12/20: Mario F.:I never was, am not, and never will be, one to shut up in the face of something I think is fundamentally wrong.

    Amen brother!

  4. #4
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,295
    Code:
                if(r <= limit && r % 12 == 1 && r % 12 == 5)
    How can r % 12 ever equal both 1 and 5 simultaneously?
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  5. #5
    Registered User
    Join Date
    May 2008
    Posts
    15
    @CrazyNorman:
    Thanks for the idea, I tried that and I got the time down to 34 seconds.
    @Todd:
    The assembly output stayed the same when I removed them from the inner loop so it looks like gcc was optimizing it away. I also tried using bool *bits = new bool[limit + 1] and it was actually slower then the vector. Can you tell me how to analyze a function in gprof? I've tried searching google but I must be using the wrong terms because I come up with results about analyzing time spent in different functions.
    @iMalc:
    Your right of course, I misread the or as an and. Unfortunately, this more then doubled the time it takes.

    Thanks for all the suggestion so far.

  6. #6
    Amazingly beautiful user.
    Join Date
    Jul 2005
    Location
    If you knew I'd have to kill you
    Posts
    254
    When you reserve an array that large, if every "bool" takes up one byte, then the array of bools will take up 4 GB. Most machines don't have this much RAM, so you end up with swapping, and a general slow down. std::vector<bool> on the other hand implements an optimization, where eight bools are crammed into one byte (because you actually only need a bit for each one), thereby making the entire array only require 512 mb, which most computers will fit in RAM, or atleast have an easier time attempting to do. Chances are, std::vector<bool> will therefore provide a significant performance boost.
    Programming Your Mom. http://www.dandongs.com/

  7. #7
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    I think std::vector<bool> has some problems (I heard), so consider std::bitset too.

  8. #8
    Registered User
    Join Date
    Jan 2005
    Posts
    7,317
    >> I think std::vector<bool> has some problems (I heard)
    The problems are that it doesn't work like other vectors. If you don't try to use it like other vectors, then you'll be ok.

  9. #9
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,295
    Quote Originally Posted by rt454 View Post
    @iMalc:
    Your right of course, I misread the or as an and. Unfortunately, this more then doubled the time it takes.
    Just make sure when you fix it that you also put in parenthesis around the OR statement as AND has higher precedence.
    You could use the cool optimisation trick, that checks for being equal to 1 or 5 in one check:
    Code:
    if(r <= limit && 1<<(r % 12) & ((1<<1)|(1<<5)))
    I doubt it'll shave off much time, but it's the first time I've found an appropriate place for that trick, so I'd do it for the coolness factor.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  10. #10
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Katy, Texas
    Posts
    2,309
    Quote Originally Posted by rt454 View Post
    @Todd:
    ...Can you tell me how to analyze a function in gprof? I've tried searching google but I must be using the wrong terms because I come up with results about analyzing time spent in different functions.
    No, I can't. Not yet anyway. Still on my list to learn too.
    Mac and Windows cross platform programmer. Ruby lover.

    Quote of the Day
    12/20: Mario F.:I never was, am not, and never will be, one to shut up in the face of something I think is fundamentally wrong.

    Amen brother!

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Calculating prime factor of a huge number.
    By Bakster in forum C Programming
    Replies: 15
    Last Post: 02-20-2009, 11:06 AM
  2. prime number program with function
    By mackieinva in forum C Programming
    Replies: 17
    Last Post: 09-20-2007, 08:36 AM
  3. prime numbers, counters, help!
    By vege^ in forum C++ Programming
    Replies: 1
    Last Post: 03-10-2003, 03:32 PM
  4. Homework help
    By Jigsaw in forum C++ Programming
    Replies: 2
    Last Post: 03-06-2002, 04:56 PM
  5. sieve of Erotosthenes (HELP!!!)
    By JFK in forum C Programming
    Replies: 1
    Last Post: 02-19-2002, 11:53 AM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21