Yikes indeed.
I suggest taking a look at the bigint class on my website (the link is in my sig; see the useful classes page). It works in basically the same way that 64-bit math operations are emulated on a 32-bit processor, using two's-complement numbers of arbitrary size. The memory layout is identical to what you would expect from a native type of the specified size.
It even has a factorial function built-in!
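For anyone who hasn't seen the technique, here is a minimal sketch of how wide two's-complement math can be emulated on narrower words. This is not the code from the bigint class itself; the names WideInt, add and negate are made up for illustration, and it assumes 32-bit limbs stored least significant first.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration only, not the actual bigint class.
// N 32-bit limbs, least significant limb first, read together as one
// two's-complement number of 32*N bits.
template <std::size_t N>
struct WideInt {
    std::array<std::uint32_t, N> limb{};  // value-initialized to zero
};

// Add limb by limb, carrying into the next limb, the same way a 32-bit
// CPU chains add / add-with-carry to do 64-bit addition.
template <std::size_t N>
WideInt<N> add(const WideInt<N>& a, const WideInt<N>& b) {
    WideInt<N> r;
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < N; ++i) {
        std::uint64_t sum = std::uint64_t(a.limb[i]) + b.limb[i] + carry;
        r.limb[i] = static_cast<std::uint32_t>(sum);  // keep the low 32 bits
        carry = sum >> 32;                            // 0 or 1
    }
    return r;  // the final carry is dropped, so the result wraps modulo 2^(32*N)
}

// Two's-complement negation: invert every bit, then add one.
template <std::size_t N>
WideInt<N> negate(const WideInt<N>& a) {
    WideInt<N> inverted;
    for (std::size_t i = 0; i < N; ++i) {
        inverted.limb[i] = ~a.limb[i];
    }
    WideInt<N> one;
    one.limb[0] = 1;
    return add(inverted, one);
}
```

Because the representation is plain two's complement, the same add works for signed and unsigned values, and subtraction is just add(a, negate(b)). On a little-endian machine this limb order also matches the byte layout of a native integer of the same width.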
Darryl, you might want to check it out too. I see you used Duff's Device in yours, btw.