:D never occurred to me. Thanks a bunch ;-) I'll get right on it.
Printable View
:D never occurred to me. Thanks a bunch ;-) I'll get right on it.
I'm still wondering why you are attempting prime number generation. Do you think this will make your algorithm faster?
The BEST prime number generators (and I'm talking segmented wheel sieve algorithms that are fairly complex to understand and implement) operate in O(n / (log log n)). Now consider that you are working with long long's. Your maximum n is 2^64 - 1. log log 2^64 = 6. So you're looking at a running time of (n/6). If you don't use a leading edge primes generator and you end up with something that runs in linear time or superlinear time, well I wouldn't even consider attempting to generate primes.
The factoring portion of the algorithm runs in approximately O(sqrt(n)). This is much better than linear. The overhead of prime number generation just isn't necessary.
Also, pulling 2 out of the factoring loop was a wise choice. Its an easy and simple way to increase your running time by a factor of 2. However, you aren't taking advantage of it. Once you factor out 2, you can start your factoring loop at 3 and increase the loop counter by 2 each iteration.
There isn't a huge advantage to pulling 3 out of the loop however, and I would suggest not doing it. While pulling 2 out gave you a 50% reduction in running time, pulling 3 out would only give you a 33% reduction, but its actually worse than that, since half of the numbers divisible by 3 are also divisible by 2. It would only provide somewhere around a 16% reduction in execution time (an overall reduction of 66%). I don't think this is worth cluttering up the code. Its going to make your code longer, plus it will require weird adjustments to your loop counter to take advantage of it.
I'm sorry arpsmack, but my understanding of mathematics is way bellow the point of what is needed to understand the first three paragraphs of what you wrote there :D After all, I just started High School half a year ago :D
I understood the second part, though. I pulled two and three out of the loop because the way I'm generating the prime numbers there is using the 6k+1/6k-1 rule. That doesn't apply to the first two primes, that is two and three.
Anyway, I finally finished. This version works fine, except it takes way to long if the prime numbers are high. For example it would take forever to factorize 11 111 111 111, because that is 21649 x 513239. It finds the first number almost immediately, but it takes forever to get up to five hundred thousand. 111 111 111 111 is a matter of seconds however ;-) I'm going to try and find a way around that, but I doubt I will. Still, a LOT better than the previous version, where it couldn't go above 32739 :D :D Here's the source. Thanks for your help guys !!
EDIT: Okay, ignore what I just wrote about the numbers. I forgot to add a square root, and that speeds it up a LOT. :D :D
The thing is, once you know 21649 is a factor, divide by 21649. Your task would now be to factorise 513239. But 513239 is greater than the square root of 11111111111, so you know you can stop the computations immediately. Hence, 111111111111 = 21649 x 513239.Quote:
For example it would take forever to factorize 11 111 111 111, because that is 21649 x 513239. It finds the first number almost immediately, but it takes forever to get up to five hundred thousand.
I know. That's what I do. See the edit and check the source ;-) But thanks :-)
I wrote my own version, it factorizes 11 111 111 111 into it's two components so quickly that it doesn't show any time when measuring using "clock()", which on my machine uses milliseconds.
Entering 1111111111111111111 takes about 28 seconds to find that it's a prime. That's another 4 ones on top of the number you posted.
I had a look at your code after I wrote mine, and on seeing your solution changed my code to print "n ^ m" instead of "n * " m times. However, I can't understand your algorithm at all - it seems very strange. And trying even trivial cases, your code doesn't seem to produce the expected output.
--
Mats
Is it possible that you tried this on Windows using MinGW or linking to a Microsoft runtime that doesn't support the format specifier "%llu" but uses "%I64u" instead? I made this same mistake actually. I recompiled the code with "%I64" and it worked fine.Quote:
Originally Posted by matsp
The basic gist of what I posted previously was: don't generate primes. It won't make your program any faster and only adds complexity and potential for error. A simple loop starting at 2 (or three if you pull 2 out of the loop) going up to the sqrt(n) will be much faster.
In the case matsp posted, my program factors (or actually determines the primality of) 1111111111111111111 in approximately 13 seconds. However, I pulled the 2 out of the loop, effectively doubling the speed. THAT is an effective optimization, generating primes is not. I stopped your (G4B3's) program after a minute of trying to factor that same number.
I was actually using Microsofts compiler, but yes, I didn't allow for the formatting difference.
Now fixed, and the output for trivial cases is now OK.
Your processor must be a fair bit faster than mine - but I have an old Athlon64, so no surprise there. [Or your compiler generates better code than mine]. Actually, in release mode, my code takes 25 seconds for the "lots of ones" prime.
This number 111 111 111 111 111 111 takes about 0.047 seconds to factorize into:
2071723 * 5363222357
--
Mats
That's interesting. I pulled out a couple tricks to get 1111111111111111111 (19 1's) down to 8.36 seconds; however 11111111111111111 (17 1's) takes .72 seconds to factor. That's quite a bit slower for me.
I can't see any particular reason why one number would be faster on my computer, while another is faster on yours... unless you're starting your loop at sqrt(n) and working your way down or something.
Okay, I decided to test G4B3's program and honestly it took just a few seconds to factor 11 111 111 111 on my computer, which runs a mere Pentium D 930 :)
Incidentally, I reason that all 64 bit unsigned integers have no more than 15 distinct prime factors, since the product of the first 15 prime numbers can fit into a 64 bit unsigned integer, but the product of the first 16 prime numbers cannot. Therefore, we can store the prime factors in an array of 15 unsigned integers, then have a nice formatting by sorting the array and then printing the factors in ascending order.
Anyway, I decided to implement the algorithm I outlined, with a few changes:
1. Instead of comparing i with sqrt_n, I compare i * i with n, in view that every time a factor is found, the problem is reduced to factoring n / factor instead of n.
2. 2 as a factor is handled separately, as per the optimisation mentioned by arpsmack.
3. Instead of two separate lists, I combined the factor value and count into a struct and have a single factor list.
Care to share your "tricks"? The 19 1s takes about 35 seconds to be shown as prime on my computer with my implementation, though the 18 1s are factored just about instantly... unfortunately I am not sure what is the most accurate way to measure sub-second timings here.Quote:
That's interesting. I pulled out a couple tricks to get 1111111111111111111 (19 1's) down to 8.36 seconds; however 11111111111111111 (17 1's) takes .72 seconds to factor. That's quite a bit slower for me.
@arpsmack; Why would the program be much faster if I completely removed the Prime Number thing? The way I see it (and I know I'm wrong, I just don't know how wrong), you divide by two, effectively reducing it to an odd number. Then you divide by three, you skip four (or, to be more precise, the program tries four and comes up with modulus !=0) etc. etc. However, if I just divide by primes, it skips four, six, eight, and all semi-primes right off...although it does get extremely slow when dealing with large numbers, since the inner "l" loop has to catch up with the outer "k" loop...
@laserlight: Once I added the square root, that is the inner "l" loop will only run while "l" is smaller than "sqrt(k)", it sped up the program a lot. Before, when I didn't have the square root in place, it took around a minute to get the primes even up to a hundred thousand. However, when I implemented the square root, it was just a matter of about two seconds for it to get up to five hundred thousand.
Could all of you post your source, or at least a basic gist of how it works? I'd like to have a look. I know laserlight already posted the basic workings of his, but I got lost when reading the above...maybe I'd understand better if I saw the source itself. Curiosity has no ends :-)
Actually, the point of treating 2 as a special case is that you test if the number is divisible by 2. If it is, then you record 2 as a factor, then keep on dividing the result by 2 until the result is odd, while recording the number of times you divided. Then you begin on the general case, starting from 3, and skipping all even numbers, including 4.Quote:
The way I see it (and I know I'm wrong, I just don't know how wrong), you divide by two, effectively reducing it to an odd number. Then you divide by three, you skip four (or, to be more precise, the program tries four and comes up with modulus !=0) etc. etc.
Sure, you skip all the composite numbers, but you waste time finding the primes in the first place.Quote:
Why would the program be much faster if I completely removed the Prime Number thing? [...] However, if I just divide by primes, it skips four, six, eight, and all semi-primes right off...although it does get extremely slow when dealing with large numbers, since the inner "l" loop has to catch up with the outer "k" loop...
I took a hint from the sieve of atkins where Mr. Atkins noticed that every 30 primes can actually be encoded into 8 bits, since only 8 of them have a chance of being prime anyway. This works because 30 = 2 * 3 * 5, so you can strike out all the multiples of 2, 3, and 5 from that point on. (You could also add 7 in there and up it to 210 for slightly better results).Quote:
Care to share your "tricks"?
First I created a hardcoded array of the first 30 primes, and an array of boolean values starting at 30, and struck out all the values that couldn't possibly be prime:
Then I converted this table to an array of gaps, distances between the values that could possibly be prime (the first value assumes you're starting at a multiple of 31):Code:unsigned char p30[] = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 };
unsigned char atkins[] = {
0, 1, 0, 0, 0,
0, 0, 1, 0, 0,
0, 1, 0, 1, 0,
0, 0, 1, 0, 1,
0, 0, 0, 1, 0,
0, 0, 0, 0, 1
};
And now instead of just pulling 2 out of the factoring loop, I pull the first 30 primes out and process them separately, and then start my factoring loop at 31 incrementing the counter by the values from the gaps array, effectively skipping all multiples of 2, 3, and 5.Code:unsigned char gaps[] = {
6, 4, 2, 4, 2, 4, 6, 2
};
Furthermore, unroll the factoring loop and remove the gaps array, and you'll have what I basically wrote at this point. On my processor (AMD64 X2 4000+ Brisbane) this determines primality of 19 1's in around 7.95 seconds (it varies each run). I'm not very good at language optimization though. I've tried various things - unrolling loops, turning functions into macros, etc... but each time (to my surprise) it actually made the program slower.
I also tried compiling with Visual Studio 2008 instead of MinGW, and to my surprise it was actually worse. In this case execution speeds are worse by about .4 - 1.4 seconds depending on the input.
I have a feeling that if it takes 35 seconds to run on your processor even after pulling 2 out of the loop, then I must just have a faster processor.
I used the clock() function for my timings by the way. I'll post my code as well, but I need to do some major cleanup on it. I have all sorts of macro's and commented code lying around from trying different tweaks, its quite nasty looking.
I had some problems with the MinGW port of gcc 3.4.5 that might be due to poor 64-bit long long support on a 32-bit machine, but MSVC8 (Visual Studio 2005) worked. I switched to my Ubuntu Linux 7.04 installation, compiled with gcc 4.1.3, and my implementation found that 19 1s was prime in 20 seconds (that's 15 seconds faster on the same computer with optimisations in both cases, unless I screwed up for MSVC8 optimisation settings).Quote:
I also tried compiling with Visual Studio 2008 instead of MinGW, and to my surprise it was actually worse. In this case execution speeds are worse by about .4 - 1.4 seconds depending on the input.
I have a feeling that if it takes 35 seconds to run on your processor even after pulling 2 out of the loop, then I must just have a faster processor.
Ok, here's the code I've got. I tried to rewrite it a bit and clean up the mess. Unfortunately, I still think it looks a little messy, but I guess I don't really care that much.
That was probably the reason why matsp's version factored 17 1's so much faster than mine. I changed it to use i * i instead of sqrt(n) and it factors the large numbers much quicker now.Quote:
Originally Posted by laserlight
Additionally, I realized that taking the sqrt() of a long long probably wasn't the best idea, since the parameter probably gets cast to a double and can't store enough precision for all 64 bit integers.