# Multiplying and dividing algorithm

This is a discussion on Multiplying and dividing algorithm within the C Programming forums, part of the General Programming Boards category; Hello I need to do the following calculation on a 32 bit embedded system: C = 10000 * A / ...

1. ## Multiplying and dividing algorithm

Hello

I need to do the following calculation on a 32 bit embedded system:

C = 10000 * A / B

The values A and B are signed, and can be -10000000 .. 10000000.

I need the calculation to be as accurate as possible. The value B will always be higher than the value of A, and therefore it shouldn't be a problem storing the result in the value C.

I have tried making the devision, and then calculating the remainder seperatly, and then adding it afterward, but for some values of A and B the accuracy just got worse.

The calculation has to be done rather often, so it has to be optimized for speed.

Any suggestions?

2. What processor and what compiler?

--
Mats

3. H8 processor and GCC compiler.

4. As given:
Code:
C = 10000 * A / B
or
Code:
C = (10000 * A) / B
In other words, A is made a lot larger before you divide it by B. The other way around
Code:
C = 10000 * (A / B)
might have a lower chance of overflow. And for all I know it might be more efficient. But of course, for integral division you'd get much lower accuracy.

I don't think you're going to get much faster than the algorithm that you already have. You could write it in assembler or something, or have a lookup table if one of the values is often constant or predictable. But a lookup table would take up more memory. It's like the time-space continuum article that I read somewhere . . . you could make it a little faster, but memory usage or something else would suffer.

5. The problem is, that C = (10000 * A) / B will give overflow, and that C = (A / B) / * 10000 is not accurate enough.

I don't think a lookup table will be a solution, since I can't really predict any of the values.

Is it somehow possible to split the variables into 2 signed longs each, calculate them seperately, and then merge them back into a single signed long, after the calculation is done, and the value is small enough to be stored in a single signed long again?

The accuracy is more important than the execution speed, but still it must be as fast as possible.

6. Is it somehow possible to split the variables into 2 signed longs each, calculate them seperately, and then merge them back into a single signed long, after the calculation is done, and the value is small enough to be stored in a single signed long again?
Sure, though I can't see it being any faster . . .
Code:
type A, B;

/* ... */

signed long Ah = A >> 32, long Al = A & 0xffffff;
signed long Bh = B >> 32, long Bl = B & 0xffffff;
type C;

C = (type)(calculation);
You could use INT_MAX or whatever (from limits.h) rather than 0xffffff. I might have gotten the number of f's wrong . . .

The problem is, that C = (10000 * A) / B will give overflow, and that C = (A / B) / * 10000 is not accurate enough.
Here's what I might suggest.
Code:
C = (int)(10000 * ((double)A / B) + 0.5);
That way you get accurate values, including rounding, and no chance of overflow. However, it would be significantly slower since it involves floating-point calculations.

 BTW, what type are A, B, and C? [/edit]

7. > C = 10000 * (A / B)
B is larger than A, so this ends up at 0.

There are a few issues to work around.
+/-10000000 * 10000 (the worst case) needs 38 bits of precision to store the complete intermediate result. If we are to avoid this, then we need to do something else.

Values of A up to and including 214747 don't need any special treatment, the intermediate result does not overflow, and can safely be divided by B without any additional effort.

The next step is to extract factors of 10000 (eg, f=200 and g=50), and rewrite the expression as
C = f * A / (B/g)
The key being for an upper limit of A, choose a value of f which doesn't result in arithmetic overflow.

Fortunately, we only need to make up 6 bits worth of overflow, so we can probably get away with a few simple if/else tests on some large values of A, and choose appropriate f,g pairs which preserve as much of the intermediate result as possible.

Code:
if ( A <= 214747 ) {
C = 10000 * A / B;
} else
if ( A <= 858993 ) {  // A*10000/4 <= LONG_MAX
// f=2500, g=4
C = 2500 * A / ( B / 4 );
} // and so on

8. Try to see if the compiler supports MulDiv(), which does EXACTLY the operation you are asking for, and is generally available in some optimized form.

Salem's suggestion is also very good. And the beuty if you use multiples of 2, 4, 8, 16, etc is that the divide operation on both sides of the actual divide operaiton can be done with shift operations, which is usually A LOT faster than divide.

The other possibility, along with what Salem suggest, would be to use int when possible, and double (or float) when it's not possible. Since the need for precision is (usually) needed when the input numbers are small, rather than when they are large, this should work fairly well.

--
Mats

9. I seriously doubt GCC on an embedded system would support the Windows function MulDiv().

And the be[a]uty if you use multiples of 2, 4, 8, 16, etc is that the divide operation on both sides of the actual divide operaiton can be done with shift operations, which is usually A LOT faster than divide.
I imagine that GCC would be able to perform this optimisation, if optimisations are enabled.

10. Originally Posted by jokkemokke
The problem is, that C = (10000 * A) / B will give overflow, and that C = (A / B) / * 10000 is not accurate enough.

Code:
C = 10 * (1000 * A / B)
or...
C = 100 * (100 * A / B)
or...
C = 1000 * (10 * A / B)
or even...
C = 250 * (40 * A / B)
The higher the factor in the parentheses, the more accurate, and the more prone to overflow. You just have to make sure the two fixed factors multiply to 10000.

11. Originally Posted by dwks
I seriously doubt GCC on an embedded system would support the Windows function MulDiv().
When I saw MulDiv in some code the other day, I searched on the Net and got several non-MS hits - SGI for example.

I imagine that GCC would be able to perform this optimisation, if optimisations are enabled.
Yes, that's the idea. My point was more that the divide by "nice" numbers is much better than dividing by arbitrary numbers.

I suspect that the H8 isn't the "hottest" processor when it comes to math, so it may be a good idea to try to avoid divides when possible. [I know a 68000 takes something like 100 clock-cycles to divide a number, and there are many other processors where divide operations aren't even built into the processor - you have to do it "by hand"].

--
Mats

12. Even if the available library doesn't support a muldiv() of some sort, it the processor does support a 64-bit intermediate result for mul, which can be used by div, then crafting your own little assembler function would be worthwhile.

13. Originally Posted by jokkemokke
H8 processor and GCC compiler.
I don't suppose you could be more specific? (This caught my attention.)

H8SX Family

H8SX family offers a 32bit CISC microcomputer with improved performance and an internal Multiplier/Divider with a maximum operating frequency of 50 MHz.