Thread: Prime numbers

Hybrid View

Previous Post Previous Post   Next Post Next Post
  1. #1
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by abachler View Post
    This uses several optimizations, namely it skips all even numbers, which cant be prime (except 2), and uses a fast method for finding the Root of the prime.
    The code will tend to be slower than the Sieve of Eratosthenes for large primes (as it checks if a value is divisible by every odd value less than its square root). The benefit is that it uses less memory (the SoE relies on keeping track of all primes found for subsequent testing, and they have to be stored somewhere).

    Modifying the value of Root through iteration (as the code does) is not necessarily fast: tracking of the value of Root from one iteration to the next is the optimisation. In comparison with the looping over all odd values, the benefit of that optimisation is minor.

    There is the inconvenient fact that the code is also windows specific (relies on compiler extensions).

  2. #2
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    int64 is an ansi standard datatype. __int64 is the windows specific implimentation of that datatype. The code is only trivially windows dependant, i.e. change the datatype from __int64 to int64 and it is ansi standard.

    The code gains a huge performance increase in teh root optimization versus calculating teh root every iteration. Converting the relatively slow sqrt() to a simple divide and compare. As for the value of using SoE versus checking all odd numbers it depends on several factors, but mostly the speed of performign the calculations entirely in registers and the L1 cache versus making frequent calls into L2 and eventually uncached memory. Once the primes you are testing exceed available L2 memory for storage of the divisors, you will essentialyl be runnign at teh speed fo primary memory, which is usually orders of magnitude slower than my code, which will always run at synchronous speeds. Aside from that this code is designed to illustrate a solution not to be entered into the hall of fame for most optimized code EVAR.

  3. #3
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by abachler View Post
    int64 is an ansi standard datatype. __int64 is the windows specific implimentation of that datatype. The code is only trivially windows dependant, i.e. change the datatype from __int64 to int64 and it is ansi standard.

    The code gains a huge performance increase in teh root optimization versus calculating teh root every iteration. Converting the relatively slow sqrt() to a simple divide and compare. As for the value of using SoE versus checking all odd numbers it depends on several factors, but mostly the speed of performign the calculations entirely in registers and the L1 cache versus making frequent calls into L2 and eventually uncached memory. Once the primes you are testing exceed available L2 memory for storage of the divisors, you will essentialyl be runnign at teh speed fo primary memory, which is usually orders of magnitude slower than my code, which will always run at synchronous speeds. Aside from that this code is designed to illustrate a solution not to be entered into the hall of fame for most optimized code EVAR.
    However, the many of the non-primes are divisible by small numbers, so it will be only be on a few of them that you go all the way to the square root of the number.

    A 512KB cache will hold 128K primes. If we roughly say that primes are 300 per 1000 numbers, then there is about 360K primes that fits in the cache.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  4. #4
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    there are approximately x/ln(x) prime numbers smaller than x. That is very innacurate at low values of X but gradually becomes more accurate and eventually underestimates by a small percentage.

    But as I stated, the code i posted was intended to illustrate a solution not as the most optimized solution. It is an EXAMPLE.

    If you want we can have a contest to see who can generate the fastest prime generator, Ill post it in teh contests channel.
    Last edited by abachler; 11-26-2008 at 02:37 PM.

  5. #5
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by abachler View Post
    int64 is an ansi standard datatype. __int64 is the windows specific implimentation of that datatype. The code is only trivially windows dependant, i.e. change the datatype from __int64 to int64 and it is ansi standard.
    Not true. int64 is not a standard type in either C or C++.
    Quote Originally Posted by abachler View Post
    The code gains a huge performance increase in teh root optimization versus calculating teh root every iteration. Converting the relatively slow sqrt() to a simple divide and compare.
    You have some measurements to back that up? Having a loop that does divide-and-compares on most iterations will fairly quickly exceed the time for computing a square root: division is also a relatively expensive operation. Of course, computing sqrt() has other tradeoffs (most notably that there is no guarantee that the double type will hold larger integral values) but that's another story - I wouldn't use sqrt() with the SoE either.
    Quote Originally Posted by abachler View Post
    As for the value of using SoE versus checking all odd numbers it depends on several factors, but mostly the speed of performign the calculations entirely in registers and the L1 cache versus making frequent calls into L2 and eventually uncached memory. Once the primes you are testing exceed available L2 memory for storage of the divisors, you will essentialyl be runnign at teh speed fo primary memory, which is usually orders of magnitude slower than my code, which will always run at synchronous speeds.
    I'm not saying you're wrong here, but I'd be interested in seeing measurements to back up your comments. Relying on L1 and L2 cache hits is highly system dependent, highly dependent on how your compiler and system optimises operations. However, increasing the rates of cache hits does not change the fact that your code inherently does more expensive operations (notably division) than alternate approaches like SoE. I'd be interested to know where the cross-over point is.

    Your comments on relative performance with SoE are easy to justify for small primes, but harder to justify for larger ones (as, when testing a large value, there are more divisors to be eliminated, and therefore more divisions to be performed).

  6. #6
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by grumpy View Post
    Not true. int64 is not a standard type in either C or C++.
    Quote Originally Posted by http://home.att.net/~jackklein/c/inttypes.html
    The Long Long Int Types (C Only)

    64 bit processors have been readily available in workstations for several years, and they will be coming to desktop computers soon. These processors can address enough memory to have arrays with more elements than a 32 bit long can specify. Inexpensive disk drives in use today can hold a file which contains more bytes than a 32 bit offset can reach. The requirement for an integer type required to hold more than 32 bits is obvious.

    The 1999 update to the ANSI/ISO C language standard has added a new integer type to C, one that is required to be at contain at least 64 bits. Included in this update are the new variable types signed and unsigned long long . The selected name comes from gcc and several other compilers which already provide this type as an extension. On 32 bit Windows compilers from Microsoft, Borland (and maybe others) this same extension has the name __int64.
    You have some measurements to back that up? Having a loop that does divide-and-compares on most iterations will fairly quickly exceed the time for computing a square root: division is also a relatively expensive operation. Of course, computing sqrt() has other tradeoffs (most notably that there is no guarantee that the double type will hold larger integral values) but that's another story - I wouldn't use sqrt() with the SoE either.
    it divide and compares once per loop versus sqrt() once per loop, therefore its a one to one comparison. sqrt() is slower than divide and compare.

    I'm not saying you're wrong here, but I'd be interested in seeing measurements to back up your comments. Relying on L1 and L2 cache hits is highly system dependent, highly dependent on how your compiler and system optimises operations. However, increasing the rates of cache hits does not change the fact that your code inherently does more expensive operations (notably division) than alternate approaches like SoE. I'd be interested to know where the cross-over point is.

    Your comments on relative performance with SoE are easy to justify for small primes, but harder to justify for larger ones (as, when testing a large value, there are more divisors to be eliminated, and therefore more divisions to be performed).
    By all means, post your SoE implimentation on the contest.

  7. #7
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    long long (in the 1999 C standard) is not required to be the same as a 64 bit int: in fact the standard goes out of its way to avoid mandating particular representations (eg integer types with a specified number of bits) of the "basic types" (int, long, long long, etc). There is no standard type named int64 or similar.
    Quote Originally Posted by abachler View Post
    it divide and compares once per loop versus sqrt() once per loop, therefore its a one to one comparison. sqrt() is slower than divide and compare.
    It is not necessarily slower than multiple divide and compares though, which your code can exhibit if there are multiple composite numbers between two consecutive primes. Hence my request that you back up your comments.
    Quote Originally Posted by abachler View Post
    By all means, post your SoE implimentation on the contest.
    I'm not the one who has made a claim of offering an optimised solution: you are. I am asking you to back up your assertions.

  8. #8
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by grumpy View Post
    It is not necessarily slower than multiple divide and compares though, which your code can exhibit if there are multiple composite numbers between two consecutive primes. Hence my request that you back up your comments.
    the divide and compare occurs no mor eoften than the sqrt(), if there are multiple consecutive composit numbers between 2 primes then the swrt() is ALSO executed multiple times, so again, you are wrong in thinking there are multiple div/comps per sqrt()

    I'm not the one who has made a claim of offering an optimised solution: you are. I am asking you to back up your assertions.
    You are claiming that the optimization I made for finding the root is slower than sqrt(). You are claiming that my optimized solution is not an optimization, or that you have a better optimization. In either case you are making a claim. Not to be blunt but put up or shut up
    Last edited by abachler; 11-26-2008 at 10:36 PM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Logical errors with seach function
    By Taka in forum C Programming
    Replies: 4
    Last Post: 09-18-2006, 05:20 AM
  2. Replies: 18
    Last Post: 11-04-2005, 02:41 PM
  3. prime numbers, counters, help!
    By vege^ in forum C++ Programming
    Replies: 1
    Last Post: 03-10-2003, 04:32 PM
  4. Prime numbers
    By Lazy Student in forum C Programming
    Replies: 12
    Last Post: 05-14-2002, 05:53 AM
  5. Homework help
    By Jigsaw in forum C++ Programming
    Replies: 2
    Last Post: 03-06-2002, 05:56 PM