Thread: Sieve of Eratosthenes

  1. #1
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629

    Sieve of Eratosthenes

    Code:
    #include <iostream>
    #include <list>
    using namespace std ; 
    #define SIZE 10
    int main()
    {
        list<unsigned long int> sieve ;
        
        list<unsigned long int>::iterator it ; 
        long int i =2 ;
        
        while(i<SIZE)
        {
            sieve.push_back(i) ; 
            i++ ; 
        }
         
        it = sieve.begin() ;
        i=*it;
        
    
        unsigned long int sum =0;
        
        while(i*i < SIZE)
        {
            i = *it ;
            
            sieve.push_back(*it) ;
    	
    	it = sieve.erase(it) ;
            
    		
            while(it!=sieve.end())
            {
               if(i!=*it&&(*it)%i==0)
               {
                
                    it = sieve.erase(it) ; 
                   
                }
                else
                { 
    		 it++ ;
    		
                 }
          
            }
       
    	it = sieve.begin() ;
        } 
    	
        it = sieve.begin() ; 
      
    	
        while(it!=sieve.end())
        {
    	sum+=*it ;
           
            it++ ;
        }
    
        
        cout << "Sum of all prime is: " << sum ;
        cin.get() ; 
        return 0 ;
    }
    this is attempt for sieve of Eras algorithm...but it works freaking slow for numbers above 10000..i was wondering is there a way to make it faster (I am yet to try the Euler sieve algorithm..but I don't see how it is any different)? I don't think it would make any difference if i used vectors..and i don't know how to use vectors anyway.

    One more thing..when I remove the multiples from a list and i increment the iterator, the program crashes i had to use an else for that, to increment the iterator if an element is not erased.

    I don't understand why it crashes, i mean
    i would safely assume that when I removed from a list and incremented the list thereafter the iterator would point to valid address if there was an element after it. But since I can't be sure, it is better to use the else to increment the iterator if not erased... :S
    Last edited by Eman; 01-24-2011 at 08:59 AM.
    You ended that sentence with a preposition...Bastard!

  2. #2
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    is there really no way to optimize it?
    anyhow, can someone at least clear up the iterator problem.
    Thanks.
    You ended that sentence with a preposition...Bastard!

  3. #3
    Password:
    Join Date
    Dec 2009
    Location
    NC
    Posts
    587
    You may want to look at the method I talked about here: How to use large numbers It is similar to the sieve in that is eliminates numbers based on their prime divisors. For more info, read my posts there.

  4. #4
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Don't use a list for this.
    Lists have the overhead of two pointers per item, which is pretty large considering you really only need a single bit to indicate whether a number is marked off or not.

    With a list you obviously find yourself having to skip over many items to get to each multiple of i. With a vector you just go straight to that item and mark it off.

    Apart from 2 (special case), you only need consider odd numbers.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  5. #5
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    With a list you obviously find yourself having to skip over many items to get to each multiple of i. With a vector you just go straight to that item and mark it off.
    I don't understand what you mean "skip over many items" .

    @User Name, I will have a look. thanks
    You ended that sentence with a preposition...Bastard!

  6. #6
    Password:
    Join Date
    Dec 2009
    Location
    NC
    Posts
    587
    He's talking about access times. Lists have to be traversed to access elements, so to access the n element, you first have to access n - 1 elements to get the pointer to the nth element. But with a vector, access time is always 1, and is therefore the better choice since you will be accessing elements linearly. For example, here's a chart, n is the index of the element, l is the iterations it takes to access n with a list, and v is the iterations it takes to access with a vector.
    Code:
    n  l  v
    1  1  1
    2  2  1
    3  3  1
    4  4  1
    ...
    x  x  1
    The last being the generalization.

    Lists are better for insertion(O(1)), because with vectors, you have to shift everything to accommodate the new element. Vectors are better for access because all the elements are in order in memory like arrays.

    You should take some time to learn about the properties of different data structures, not necessarily now, but sometime.
    Last edited by User Name:; 01-24-2011 at 03:34 PM.

  7. #7
    -bleh-
    Join Date
    Aug 2010
    Location
    somewhere in this universe
    Posts
    463
    Quote Originally Posted by User Name: View Post

    Lists are better for insertion(O(1)), because with vectors, you have to shift everything to accommodate the new element. Vectors are better for access because all the elements are in order in memory like arrays.

    You should take some time to learn about the properties of different data structures, not necessarily now, but sometime.

    Technically, you still have O(1) if you use back_inserter for vector. Front_inserter you have deque. List offers the benefit of O(1) insert in the middle of the sequence.
    Last edited by nimitzhunter; 01-24-2011 at 03:59 PM.
    "All that we see or seem
    Is but a dream within a dream." - Poe

  8. #8
    Password:
    Join Date
    Dec 2009
    Location
    NC
    Posts
    587
    Quote Originally Posted by nimitzhunter View Post
    Technically, you still have O(1) if you use back_inserter for vector. Front_inserter you have deque. List offers the benefit of O(1) insert in the middle of the sequence.
    Yes... but the point was to give a simple overview, that ignores special cases for the sake of simplicity.

    I guess that is an important special case for this application. In order to take advantage of the O(1) efficiency of back insertion, you have to use push_back(), and you already are so there's nothing to worry about.

  9. #9
    -bleh-
    Join Date
    Aug 2010
    Location
    somewhere in this universe
    Posts
    463
    Quote Originally Posted by iMalc View Post
    Don't use a list for this.
    Lists have the overhead of two pointers per item, which is pretty large considering you really only need a single bit to indicate whether a number is marked off or not.

    With a list you obviously find yourself having to skip over many items to get to each multiple of i. With a vector you just go straight to that item and mark it off.

    Apart from 2 (special case), you only need consider odd numbers.
    Actually, the list of primes is better in this case. Since you dont' know how many primes will be in the range [2,max], so you have no way to reserve the size for the vector. And as the number of primes grow, the vector may have to be reallocated which is a waste of time. The seive can be implemented as an array to type bool instead of unsigned long.

    Code:
    list<unsigned long> prime_gen(unsigned long max)
    vector<bool> seive(max, true);
    list<unsigned long> primes; 
    
    // then iterate over the range [2,max]
    for ( unsigned long i = 2; i <= max; ++i)
    
        if( seive[i] )
        {
             primes.push_back(i)
              // then mark off the multiples; 
                 for ( unsigned long j = i; j <= max; j+=i)
                        seive[j]=0
         }
    }
    There may be a faster way.
    Last edited by nimitzhunter; 01-24-2011 at 06:36 PM.
    "All that we see or seem
    Is but a dream within a dream." - Poe

  10. #10
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    thanks for the reply guys..but I have to say I have no freaking idea what is going :O
    i actually got to 2,000,000 a bit slow but it works way better than i had it using nimitz idea of 1's.
    but i don't understand your recent post
    Actually, the list of primes is better in this case. Since you dont' know how many primes will be in the range [2,max], so you have no way to reserve the size for the vector. And as the number of primes grow, the vector may have to be reallocated which is a waste of time.
    doesn't a vector grow like a list does, apart from the fact that it is linear like an array. but it still resizes on a push.

    Code:
    list<unsigned long> prime_gen(unsigned long max)
    what is this for? and what does it mean.... especially
    prime_gen(unsigned long max)
    But i think I got the idea for the rest...
    thanks
    You ended that sentence with a preposition...Bastard!

  11. #11
    -bleh-
    Join Date
    Aug 2010
    Location
    somewhere in this universe
    Posts
    463
    Quote Originally Posted by Eman View Post
    but i don't understand your recent post

    doesn't a vector grow like a list does, apart from the fact that it is linear like an array. but it still resizes on a push.
    Yeah, it does resize. However, if its capacity is exceeded, it has to sometimes reallocate to another block of memory and copy everything form the old block to the new block.

    Code:
    list<unsigned long> prime_gen(unsigned long max)
    what is this for? and what does it mean.... especially
    prime_gen(unsigned long max)
    But i think I got the idea for the rest...
    thanks
    Oh, that's just my function call, take in max and return the list of of unsigned long. Just ignore that if you don't want to use function call.

    Edit:: I have to eat my own words. I tried both list and vector, it took about the 5 sec to generate a list of primes under 20 million for each container.
    Last edited by nimitzhunter; 01-24-2011 at 08:46 PM.
    "All that we see or seem
    Is but a dream within a dream." - Poe

  12. #12
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by nimitzhunter View Post
    Yeah, it does resize. However, if its capacity is exceeded, it has to sometimes reallocate to another block of memory and copy everything form the old block to the new block.
    Doesn't matter. It's still not worse than a list when you use it for simply an unsigned long. Each inserted item is only copied an average about two times.

    E.g. a likely implementation.
    initial size = say 8 items.
    push_back 9th item and 8 items are copied as the vector grows to 16 items.
    push_back 17th item and 16 items are copied as the vector grows to 32 items.
    push_back 33rd item and 32 items are copied as the vector grows to 64 items.
    push_back 65th item and 64 items are copied as the vector grows to 128 items.
    push_back up to say 100 items and now count how many copies occurred in total from the relocation.
    8+16+32+64 = 120. Hmm, not bad for pushing back only 100 items! Each item is copied once into the vector initially, and then another 120/100 = 1.2 times on average, giving 2.2 times in total.
    Anyone who thinks a vector is slower overall due to the resizing it does, simply doesn't understand amortised complexity.
    Copying an unsigned long twice is still better than inserting a node into a list which requires a memory allocation of no less than 6 bytes each time (on most OSes this is rounded up to a multiple of 8, and the CRT usually adds some as well at least in debug builds), not to mention the cache unfriendliness that follows.

    A vector is simply not as slow as a lot of people think it is.
    Last edited by iMalc; 01-25-2011 at 01:11 AM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  13. #13
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by Eman View Post
    I don't understand what you mean "skip over many items".
    I'm referring to this loop:
    Code:
            while(it!=sieve.end())
            {
               if(i!=*it&&(*it)%i==0)
               {
    
                    it = sieve.erase(it) ; 
    
                }
                else
                { 
    		 it++ ; //here
    
                 }
    
            }
    Notice how you skip over items that are not a multiple of i (marked with a comment above)? For a vector you don't have to skip anything! You index straight to and mark off exactly every multiple of i.
    The Sieve of Eratosthenes is not ever supposed to contain the slow % operator pictured in the loop above. It is supposed to have a fixed length array of flags, and the flags are simply marked off by repeatedly indexing into the vector. Nothing should be erased as you go. What you've written is effectively just a complicated version of the trial division method. That's half the reason why it's so slow.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  14. #14
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    If you'd like to see how to properly implement the algorithm, here are three algorithms for comparison from my website. The bottom one is the Sieve of Eratosthenes

    Ignore the templating and just pretend T is an unsigned long. The templating is just so that it can generate 16-bit only primes if desired, or primes large enough to require an __int64 etc.

    Sorry for making three posts. I didn't want to keep adding to the one post indefinitely when each post's content was fairly distinct anyway.
    Last edited by iMalc; 01-25-2011 at 01:12 AM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  15. #15
    The Dragon Reborn
    Join Date
    Nov 2009
    Location
    Dublin, Ireland
    Posts
    629
    thanks guys for replying..obviously I have to read up on the stl containers..
    and I also have to learn how to understand and implement efficient algorithms, that's what I want to know..how can I make a program, fast.

    if you know of any good books that may help me, that would great.
    Thanks nimitzhunter and iMalc.
    You ended that sentence with a preposition...Bastard!

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Sieve of Eratosthenes
    By darren78 in forum C++ Programming
    Replies: 0
    Last Post: 04-28-2010, 10:52 AM
  2. Sieve of Eratosthenes
    By jack_carver in forum C++ Programming
    Replies: 7
    Last Post: 04-13-2010, 10:33 PM
  3. Replies: 2
    Last Post: 12-24-2009, 02:41 AM
  4. Primes with Sieve of Eratosthenes.
    By opciok95 in forum C Programming
    Replies: 20
    Last Post: 11-15-2005, 11:03 AM
  5. sieve of Erotosthenes (HELP!!!)
    By JFK in forum C Programming
    Replies: 1
    Last Post: 02-19-2002, 12:53 PM