Hello again everyone,
I have been promising myself that I would do some work on a library for large numbers since college. I read about them in a textbook by Binstock and Rex, but I didn't like the static nature of their implementation and I had a few other issues so I wrote my own.
Here is an excerpt from the main driver, test is currently set to 9999:
Code:
a = bigintread("9");
b = bigintread("9");
for(i = 0; i < test; i++, bigintcpy(c, b))
{
bigintmult(a, b, c);
}
printf( "------------------------\n" );
printf( "for( i = 0; i < test; i++ ) c = c*a\n");
bigintprint(stdout, a, 0);
printf( "\n" );
bigintprint(stdout, b, 0);
printf( "\n" );
bigintprint(stdout, c, 0);
printf( "\n" );
The program kept up with the system's graphical calculator (xcalc) until about 9^35 when the calculator produced an error message. The loop above produced 9^10000, it is in the file that I have attached. It has about the right number of digits 9494, so I'm rather confident with it.
Does anyone have that figure handy to verify it?
The program was running in the debugger for about seven hours before producing that figure, one factor of nine at a time lol. I think maybe it would be prudent to build a database for previously performed calculations. Then use a separate thread to find them if any while waiting for a calculation thread to finish, and using the first that came back successful of course.
The code for multiplication is here:
Code:
/*** bigintmult.c **********************************************
*
* Multiply two big integers a and b. Place the result into c.
*
*/
#include "bigint.h"
bigint *bigintmult( const bigint *a, const bigint *b, bigint *c )
{
bigint *r; // pointer to the current register (product)
bigint *swp; //
bigint *tmp;
short i;
short j;
short k;
short h;
int carry;
// verify/arrange for the register
if( NULL == c )
{
r = bigintcrt();
}
else
{
r = bigintzero(c);
}
// test that the operands are sane
if(not_a_number == a->s || not_a_number == b->s)
{
// NaN * b = NaN ^ a * NaN = NaN ^ NaN * NaN = NaN
r->s = not_a_number;
return r;
}
if(infinite_positive == a->s || infinite_negative == a->s)
{
// infinite * b = infinite
r->s = a->s;
return r;
}
if(infinite_positive == b->s || infinite_negative == b->s)
{
// a * infinite = infinite
r->s = b->s;
return r;
}
if(zero == a->s || zero == b->s)
{
return r;
}
// a != 0 ^ b != 0, a,b non-ideal integers
// clear and resize the register in case of overflow
tmp = bigintcrt();
bigintrealloc(tmp, 2*MAX(a->u, b->u) + 1);
swp = bigintcrt();
// perform multiplication
for(i = 0; i < b->mlen; ++i)
{
if(ZERO == b->m[i])
{
continue;
}
for(j = 0; j < a->mlen; j++)
{
tmp->m[i+j] = a->m[j]*b->m[i];
if(zero == tmp->s && ZERO != tmp->m[j+i])
{
tmp->s = positive;
}
}
// -- normalize tmp for addition to the result register --
for(k = 0; k < tmp->u*ARBITRARYMATH_ALLOCATION_UNIT; ++k)
{
if(9 < tmp->m[k])
{
carry = tmp->m[k]/10;
tmp->m[k] = tmp->m[k];
for(h = 1;
0 < carry
&& (k+h) < tmp->u*ARBITRARYMATH_ALLOCATION_UNIT;
++h, carry /= 10) // high_order_truncation will occur here
{
tmp->m[k+h] += (carry);
if(h+k+1 == tmp->u*ARBITRARYMATH_ALLOCATION_UNIT
&& (tmp->u+1)*ARBITRARYMATH_ALLOCATION_UNIT <= SHRT_MAX)
{
tmp->mlen = tmp->u*ARBITRARYMATH_ALLOCATION_UNIT;
bigintrealloc(tmp, tmp->u+1);
}
}
}
}
for(tmp->mlen = tmp->u*ARBITRARYMATH_ALLOCATION_UNIT;
0 < tmp->mlen; --tmp->mlen)
{
if(ZERO != tmp->m[tmp->mlen-1])
{
break;
}
}
bigintadd(r, tmp, swp);
bigintcpy(swp, r);
memset(tmp->m, ZERO, tmp->u*ARBITRARYMATH_ALLOCATION_UNIT*sizeof(bigdigit));
tmp->mlen = 1;
tmp->s = zero;
}
// select the correct sign
if( a->s != b->s )
{
r->s = negative;
}
else
{
r->s = positive;
}
// clean up
bigintdes(tmp);
bigintdes(swp);
bigintlength(r);
if(ceil(((float)r->mlen)/ARBITRARYMATH_ALLOCATION_UNIT) < r->u)
{
bigintrealloc(r, ceil(((float)r->mlen)/ARBITRARYMATH_ALLOCATION_UNIT));
}
return r;
}
Other alternative implementations also came to mind. Like using a queue rather than a table format for the number datastructure, but it made more sense to me that it grow a 'page' at a time rather than double in size as a queue for a stream might according to the suggestion in the algorithm analysis textbook. If a carry had more than an allocation unit of figures, the instrinsic data type would have already overflowed is my justification. What sort of advantages might I have missed from using a queue for representation?
Also, in order to wring out the last bit of speed from this Pentium 4, I think that the point of parallelism is at the intermediate products before summing. That doesn't help with the situation when multiplication occurs one unit at a time. Perhaps partitioning the figure by place value is more appropriate? Tips here would be great too. I had a great parallel algorithm analysis textbook while I was in college, but a friend indeed needed one.
Thank you for your time and effort and it is great to post with you again.
Best Regards,
New Ink -- Henry
Enclosure: cprogramming_attachment.txt