Thread: Multiplying and dividing algorithm

  1. #1
    Registered User
    Join Date
    Sep 2007
    Posts
    3

    Multiplying and dividing algorithm

    Hello

    I need to do the following calculation on a 32 bit embedded system:

    C = 10000 * A / B

    The values A and B are signed, and can be -10000000 .. 10000000.

    I need the calculation to be as accurate as possible. The value B will always be higher than the value of A, and therefore it shouldn't be a problem storing the result in the value C.

    I have tried making the devision, and then calculating the remainder seperatly, and then adding it afterward, but for some values of A and B the accuracy just got worse.

    The calculation has to be done rather often, so it has to be optimized for speed.

    Any suggestions?

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    What processor and what compiler?

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    Registered User
    Join Date
    Sep 2007
    Posts
    3
    H8 processor and GCC compiler.

  4. #4
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    As given:
    Code:
    C = 10000 * A / B
    or
    Code:
    C = (10000 * A) / B
    In other words, A is made a lot larger before you divide it by B. The other way around
    Code:
    C = 10000 * (A / B)
    might have a lower chance of overflow. And for all I know it might be more efficient. But of course, for integral division you'd get much lower accuracy.

    I don't think you're going to get much faster than the algorithm that you already have. You could write it in assembler or something, or have a lookup table if one of the values is often constant or predictable. But a lookup table would take up more memory. It's like the time-space continuum article that I read somewhere . . . you could make it a little faster, but memory usage or something else would suffer.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  5. #5
    Registered User
    Join Date
    Sep 2007
    Posts
    3
    The problem is, that C = (10000 * A) / B will give overflow, and that C = (A / B) / * 10000 is not accurate enough.

    I don't think a lookup table will be a solution, since I can't really predict any of the values.

    Is it somehow possible to split the variables into 2 signed longs each, calculate them seperately, and then merge them back into a single signed long, after the calculation is done, and the value is small enough to be stored in a single signed long again?

    The accuracy is more important than the execution speed, but still it must be as fast as possible.

  6. #6
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Is it somehow possible to split the variables into 2 signed longs each, calculate them seperately, and then merge them back into a single signed long, after the calculation is done, and the value is small enough to be stored in a single signed long again?
    Sure, though I can't see it being any faster . . .
    Code:
    type A, B;
    
    /* ... */
    
    signed long Ah = A >> 32, long Al = A & 0xffffff;
    signed long Bh = B >> 32, long Bl = B & 0xffffff;
    type C;
    
    C = (type)(calculation);
    You could use INT_MAX or whatever (from limits.h) rather than 0xffffff. I might have gotten the number of f's wrong . . .

    The problem is, that C = (10000 * A) / B will give overflow, and that C = (A / B) / * 10000 is not accurate enough.
    Here's what I might suggest.
    Code:
    C = (int)(10000 * ((double)A / B) + 0.5);
    That way you get accurate values, including rounding, and no chance of overflow. However, it would be significantly slower since it involves floating-point calculations.

    [edit] BTW, what type are A, B, and C? [/edit]
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  7. #7
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > C = 10000 * (A / B)
    B is larger than A, so this ends up at 0.

    There are a few issues to work around.
    +/-10000000 * 10000 (the worst case) needs 38 bits of precision to store the complete intermediate result. If we are to avoid this, then we need to do something else.

    Values of A up to and including 214747 don't need any special treatment, the intermediate result does not overflow, and can safely be divided by B without any additional effort.

    The next step is to extract factors of 10000 (eg, f=200 and g=50), and rewrite the expression as
    C = f * A / (B/g)
    The key being for an upper limit of A, choose a value of f which doesn't result in arithmetic overflow.

    Fortunately, we only need to make up 6 bits worth of overflow, so we can probably get away with a few simple if/else tests on some large values of A, and choose appropriate f,g pairs which preserve as much of the intermediate result as possible.

    Code:
    if ( A <= 214747 ) {
      C = 10000 * A / B;
    } else
    if ( A <= 858993 ) {  // A*10000/4 <= LONG_MAX
      // f=2500, g=4
      C = 2500 * A / ( B / 4 );
    } // and so on
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Try to see if the compiler supports MulDiv(), which does EXACTLY the operation you are asking for, and is generally available in some optimized form.

    Salem's suggestion is also very good. And the beuty if you use multiples of 2, 4, 8, 16, etc is that the divide operation on both sides of the actual divide operaiton can be done with shift operations, which is usually A LOT faster than divide.

    The other possibility, along with what Salem suggest, would be to use int when possible, and double (or float) when it's not possible. Since the need for precision is (usually) needed when the input numbers are small, rather than when they are large, this should work fairly well.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  9. #9
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    I seriously doubt GCC on an embedded system would support the Windows function MulDiv().

    And the be[a]uty if you use multiples of 2, 4, 8, 16, etc is that the divide operation on both sides of the actual divide operaiton can be done with shift operations, which is usually A LOT faster than divide.
    I imagine that GCC would be able to perform this optimisation, if optimisations are enabled.
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  10. #10
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by jokkemokke View Post
    The problem is, that C = (10000 * A) / B will give overflow, and that C = (A / B) / * 10000 is not accurate enough.
    Trade accuracy for overflow resistance:

    Code:
    C = 10 * (1000 * A / B)
    or...
    C = 100 * (100 * A / B)
    or...
    C = 1000 * (10 * A / B)
    or even...
    C = 250 * (40 * A / B)
    The higher the factor in the parentheses, the more accurate, and the more prone to overflow. You just have to make sure the two fixed factors multiply to 10000.

  11. #11
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by dwks View Post
    I seriously doubt GCC on an embedded system would support the Windows function MulDiv().
    When I saw MulDiv in some code the other day, I searched on the Net and got several non-MS hits - SGI for example.

    I imagine that GCC would be able to perform this optimisation, if optimisations are enabled.
    Yes, that's the idea. My point was more that the divide by "nice" numbers is much better than dividing by arbitrary numbers.

    I suspect that the H8 isn't the "hottest" processor when it comes to math, so it may be a good idea to try to avoid divides when possible. [I know a 68000 takes something like 100 clock-cycles to divide a number, and there are many other processors where divide operations aren't even built into the processor - you have to do it "by hand"].

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  12. #12
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Even if the available library doesn't support a muldiv() of some sort, it the processor does support a 64-bit intermediate result for mul, which can be used by div, then crafting your own little assembler function would be worthwhile.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  13. #13
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Quote Originally Posted by jokkemokke View Post
    H8 processor and GCC compiler.
    I don't suppose you could be more specific? (This caught my attention.)

    H8SX Family

    H8SX family offers a 32bit CISC microcomputer with improved performance and an internal Multiplier/Divider with a maximum operating frequency of 50 MHz.
    7. It is easier to write an incorrect program than understand a correct one.
    40. There are two ways to write error-free programs; only the third one works.*

Popular pages Recent additions subscribe to a feed