1. ## Factorial

Here's a contest I think most people can give a try. The goal is to implement a program in C or C++ that compute precisely the factorial of N (N being an non-negative integer) as fast as possible.

For people who wouldn't know what's a factorial: http://en.wikipedia.org/wiki/Factorial
But I guess most of us have all seen somewhere as an introduction to recursion this kind of code
Code:
```unsigned int factorial(unsigned int N)
{
if (N == 0 || N == 1)
return 1;
return N * factorial(N - 1);
}```
Of course, this code when compiled on a "normal" machine will yields incorrect result if N is larger than 12. So this doesn't interest us.

The factorial to compute will be given as a command line argument.
The result has to be print to stdout. It can be either in decimal or in hexadecimal format.

The machine specifications are:
• Intel Core2 Duo (64-bit instruction set, SSE4.1 support, dual core, 6 MB of L2 cache)
• 4 GB of RAM

The target compiler is GCC 4.2.3 on a 64-bits Linux system (I'm using Ubuntu 8.04 64-bits). Since i'm quite new to Linux-based operating systems, don't make your code overly complicated to compile or you'll have to explain me how to get it works . A shell script (sh or bash) or a makefile would be ideal...

And here's some rules.
• You can't use mathematics libraries, except if you are the author or if it's the standard C/C++ math library. In fact, if someone want to use an "advanced" mathematic library, he can but his program won't be taken into account when it will be time to declare the winner of the contest.
• You can't use pre-calculated factorial, except for 1! and 0!. This is simply not interesting.

Note that every submission you submit (you can submit more than one if you want, I don't care) will be publicly available after, so everyone will be able to take a look at what people's did.

I encourage people with less experience to participate. That's just going to be more fun. And it's not that hard to make a working implementation.

Also, please put all your source code in a folder and include a README text file which would contains your cboard alias and directives on how to compile your solution. If you use libraries not easily found or not easily installable on a typical Linux distribution, gives indications where to get them. Also, specify in which base is the output. And zip the folder.

The contest ends in 3 weeks, which means June 15th, 22:00 GMT.

Have fun!
(And sorry for all the English mistakes i may have made)

2. Cool!

2 questions,
1. Did you want us to support any +ve value of N? Such as, I don't know -- 955858! ?
2. How do we actually submit our solutions? A PM? Email?

3. 1. Did you want us to support any +ve value of N? Such as, I don't know -- 955858! ?
Yes. The input number will be between 0 and 2^32 - 1. But since I don't want to leave my computer running for days for every program to terminate, consider it will be somewhere between 100 000 and maybe 1 000 000. But consider 2^32 - 1 being the "hard limit".

How do we actually submit our solutions? A PM? Email?
I would say by PM, but i did not even check if it was actually possible to send attachment via PM. I'll give an email address if it's not possible. EDIT: Ok, it doesn't look possible to send attachments via PM, so you'll have to send them via email. If you go see my profile, there's an email address you can use.

I'm glad at least one people is interested for now. I'll have to begin on working on my solution... one day... it won't be mind blistering but it will touch to one things or two i don't usually. Won't be the fastest, but shouldn't be the slowest neither.

4. Hmm... there's not many options I can think of to optimize a N factorial. There is only one mathematical approach I know of. But I'm not sure how many people can define it. I surely can't. Lookup tables is the usual way. Heck, even boost uses them for i <= max_factorial<T>::value.

Also, an 2^32-1 input will overflow on the result unless provisions are also made for big numbers, even for 64bit computers, won't it?

I think there's not much one can do, other than cutting on the end instruction set the best they can.

5. I think that's part of the problem

Anyone can calculate 1 to 12! Even the solution is given in the original post.

6. Well there are actually optimizations for large N - possibilities are given in the wikipedia article (or a link or two deep, I don't remember).

You can't use mathematics libraries, except if you are the author or if it's the standard C/C++ math library. In fact, if someone want to use an "advanced" mathematic library, he can but his program won't be taken into account when it will be time to declare the winner of the contest.
But we won't see them if we're not at least allowed to use a bignum library.

7. What I mean is that it's against the rules to use something like, example, GMP, except if you are one of the author.

Of course, if you want to be able to compute the factorial of a number larger than 12, you'll need some sort of data structure capable of manipulating "big integers". That's part of the challenge.

Is it clearer ?

8. > Is it clearer ?
Yes thanks

9. Originally Posted by Mario F.
Also, an 2^32-1 input will overflow on the result unless provisions are also made for big numbers, even for 64bit computers, won't it?
Then you have to be more clever than that, won't you?

10. Here's my submission. It is password protected, btw, so no need to cheat.

11. Alright, thanks for participating Sang-drax. Don't forget to publish the password the day the contest ends.

I would like to add that when I say it has to be in C/C++, inline assembly is also accepted (of course), as long as it's incorporated through a C/C++ program and compile on gcc and run on the target architecture. And for those who would like this extra information:

sizeof(int) = 4
sizeof(long) = 8

12. ok, so an array of ~21 million bytes shoudl suffice to hold the largest value you specified (1 million), although 2^32-1 would require around 205 gigabytes of ram just for teh answer, so Im going to go ahead and just assume you wont give an input larger than 1 million. At that rate, assuming you can get the multiply for free adn just have to transfer the answer each iteration, adn only once, you are lookign at around 80+ minutes run time for 1000000!

13. I wasn't joking when I said that there were optimizations for large N, obviously. These are algorithm changes too, not just unrolling loops or something like that.

14. My submission contains no optimizations whatsoever.

15. > My submission contains no optimizations whatsoever.
In that case,

Here is my submission:

Code:
`g++ -O3 -o mysubmission `find / -name fac.cpp``
Not sure what else I could optimize, because I actaully haven't read it

Just make sure you extract Sang-drax's first

On a serious note, It's coming!