Thread: Is there a standard function?

  1. #166
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Alright, I had to do it...
    Code:
    int iMalcsFormat(long snum, char s[])
    {
        unsigned long num = snum;
        unsigned int len = 0;
        if (snum < 0) {
            s[len++] = '-';
            num = -snum;
        }
        if (num < 10) len += 1;
        else if (num < 100) len += 2;
        else if (num < 1000) len += 3;
        else if (num < 10000) len += 5;
        else if (num < 100000) len += 6;
        else if (num < 1000000) len += 7;
        else if (num < 10000000) len += 9;
        else if (num < 100000000) len += 10;
        else if (num < 1000000000) len += 11;
        else len += 13;
        unsigned int x = len;
        s[x] = '\0';
        do {
            s[--x] = '0' + (char)(num % 10); num /= 10; if (!num) break;
            s[--x] = '0' + (char)(num % 10); num /= 10; if (!num) break;
            s[--x] = '0' + (char)(num % 10); num /= 10; if (!num) break;
            s[--x] = ',';
            s[--x] = '0' + (char)(num % 10); num /= 10; if (!num) break;
            s[--x] = '0' + (char)(num % 10); num /= 10; if (!num) break;
            s[--x] = '0' + (char)(num % 10); num /= 10; if (!num) break;
            s[--x] = ',';
            s[--x] = '0' + (char)(num % 10); num /= 10; if (!num) break;
            s[--x] = '0' + (char)(num % 10); num /= 10; if (!num) break;
            s[--x] = '0' + (char)(num % 10); num /= 10; if (!num) break;
            s[--x] = ',';
            s[--x] = '0' + (char)(num);
        } while (0);
        return len;
    }
    There's still the possibility of using multiplicative inverses for div/mod by ten for some of the above, but other than that probably the only faster option is hand-rolled assembly.
    Last edited by iMalc; 07-06-2010 at 11:05 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  2. #167
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    How about this?
    Code:
    void format_int(int num)
    {
        static int initialized = 0;
        
        char group[1000][4];
        
        if (!initialized) 
        {
            int i;
            for (i = 0; i < 1000; ++i)
            {
                group[i][0] = ',';
                group[i][1] = '0' + i / 100;
                group[i][2] = '0' + (i / 10) % 10;
                group[i][3] = '0' + i % 10;
            }       
            
            initialized = 1; 
        }
        
        char *pos = buffer + BUF_LEN - 4;
        
        while (num)
        {
            int this_group = num % 1000;
            num /= 1000;
            
            *((int *) pos) = *((int *) &group[this_group][0]);
            
            pos -= 4;
        }
        
        char *start = pos + 4;
        
        while ((*start) == '0' || (*start) == ',')
        {
            ++start;
        }
        
        size_t len = BUF_LEN - (start - buffer);
        
        memmove(buffer, start, len);
        
        buffer[len] = '\0';
    }
    This is the fastest I can think of. I'm hoping the compiler is smart enough to just do one div and use both the quotient and remainder.

    Uses 4KB RAM. Should fit in L1 cache.

    Oh and it assumes 32-bit ints. I believe it's endian safe, although it may not look like it.
    Last edited by cyberfish; 07-06-2010 at 09:32 PM.

  3. #168
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    I get 0.279s for 10000000 iterations of 1234567890.

    But of course, that's not a fair comparison, because my compiler is vastly superior .

  4. #169
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Ah! and there is a subtle bug in my code.

    It breaks the strict aliasing rule by accessing chars through an int pointer.

    I'm not sure how to fix that without incurring overhead.

    With gcc you'll need -fno-strict-aliasing.

    Not sure about others.

  5. #170
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Alright, I succummed to using gotos (unfortunately we can't really use duff's device for this). I blame this on the fact that I'm at home sick today with nothing better to do. Actually me using a goto already says that I'm sick.
    This version reduces the number of branches considerably (3 - 6 now instead of 3 - 19):
    Code:
    int iMalcsInsaneFormat(long snum, char s[])
    {
        char *ps = s;
        unsigned long num = snum, div;
        if (snum < 0) {
            *ps++ = '-';
            num = -snum;
        }
        if (num < 100000) {
            if (num < 10) goto L1;
            if (num < 100) goto L2;
            if (num < 1000) goto L3;
            if (num < 10000) goto L4;
            goto L5;
        } else {
            if (num < 1000000) goto L6;
            if (num < 10000000) goto L7;
            if (num < 100000000) goto L8;
            if (num < 1000000000) goto L9;
        }
        *ps++ = '0' + (char)(num / 1000000000); num %= 1000000000;
        *ps++ = ',';
    L9:
        *ps++ = '0' + (char)(num / 100000000); num %= 100000000;
    L8:
        *ps++ = '0' + (char)(num / 10000000); num %= 10000000;
    L7:
        *ps++ = '0' + (char)(num / 1000000); num %= 1000000;
        *ps++ = ',';
    L6:
        *ps++ = '0' + (char)(num / 100000); num %= 100000;
    L5:
        *ps++ = '0' + (char)(num / 10000); num %= 10000;
    L4:
        *ps++ = '0' + (char)(div = (num*8389UL)>>23); num -= div*1000;
        *ps++ = ',';
    L3:
        *ps++ = '0' + (char)(div = (num*5243UL)>>19); num -= div*100;
    L2:
        *ps++ = '0' + (char)(div = (num*6554UL)>>16); num -= div*10;
    L1:
        *ps++ = '0' + (char)(num);
        *ps = '\0';
        return ps - s;
    }
    Edit: Fixed! Okay, one last change... Now using some multiplicative inverses and I've actually tested the code this time!
    Last edited by iMalc; 07-06-2010 at 11:30 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  6. #171
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Now I see why the language doesn't have this as a standard function - there's too many ways to go about it. They couldn't decide on which one to use.

    We'll have to ask Frktons to catalog all these on his system, and post up the results. I expected something like what iMalc posted up, would be a very fast choice. Then I thought "bit fields would be even faster", but it seems surreal to write a bit field program for such a mundane task as putting comma's in a string of numbers.

    @Cyberfish, would you give us a description of how your program works, and what system you were timing this on?

    @iMalc - laughing as he succumbs to the Dark Side of the Force.
    Last edited by Adak; 07-07-2010 at 04:49 PM.

  7. #172
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by Adak View Post
    I'm not on a POSIX system. That is the MSVC (Windows) and Turbo C (DOS) version.
    respectively.

    You are the one who keeps posting these assertions, which you know are incorrect and/or irrelevant.
    You are blaming me for code that you used?
    I merely compiled your code, and I got that warning.
    I cannot be held responsible for your use of stoneage compilers and usage of non-standard compliant code.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  8. #173
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Okay final version. Pulled out all the stops now... Only two divs or mods in the whole thing. I even got rid of two gotos! Time to retire this one:
    Code:
    int iMalcsUltimateFormat(long snum, char s[])
    {
        char *ps = s;
        unsigned long num1 = snum, num2, num3, div;
        if (snum < 0) {
            *ps++ = '-';
            num1 = -snum;
        }
        if (num1 < 10000) {
            if (num1 < 10) goto L1;
            if (num1 < 100) goto L2;
            if (num1 < 1000) goto L3;
        } else {
            num2 = num1 / 10000;
            num1 -= num2 * 10000;
            if (num2 < 10000) {
                if (num2 < 10) goto L5;
                if (num2 < 100) goto L6;
                if (num2 < 1000) goto L7;
            } else {
                num3 = num2 / 10000;
                num2 -= num3 * 10000;
                if (num3 >= 10) {
                    *ps++ = '0' + (char)(div = (num3*6554UL)>>16); num3 -= div*10;
                    *ps++ = ',';
                }
                *ps++ = '0' + (char)(num3);
            }
            *ps++ = '0' + (char)(div = (num2*8389UL)>>23); num2 -= div*1000;
    L7:
            *ps++ = '0' + (char)(div = (num2*5243UL)>>19); num2 -= div*100;
            *ps++ = ',';
    L6:
            *ps++ = '0' + (char)(div = (num2*6554UL)>>16); num2 -= div*10;
    L5:
            *ps++ = '0' + (char)(num2);
        }
        *ps++ = '0' + (char)(div = (num1*8389UL)>>23); num1 -= div*1000;
        *ps++ = ',';
    L3:
        *ps++ = '0' + (char)(div = (num1*5243UL)>>19); num1 -= div*100;
    L2:
        *ps++ = '0' + (char)(div = (num1*6554UL)>>16); num1 -= div*10;
    L1:
        *ps++ = '0' + (char)(num1);
        *ps = '\0';
        return ps - s;
    }
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  9. #174
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    @Cyberfish, would you give us a description of how your program works, and what system you were timing this on?
    I precompute substrings for all numbers from 0 to 999, including a leading comma.

    eg, group[123] = ",123"

    Then, I just build the string part by part.

    Conveniently, it's 4 bytes per part, so 32-bit wide integer copy works.

    That's pretty much it.

    Then I scan for the first non-zero and non-comma character. That would be the start of the actual number. Then I move it to the front.

  10. #175
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    @iMalc:

    A switch will work, too, no? Should be the same thing if it's compiled into a jump table.

    Code:
    int length;
    f (num1 < 10000) {
            if (num1 < 10) length = 2;
            if (num1 < 100) length = 3;
            if (num1 < 1000) length = 4;
    ..
    
    switch (length)

  11. #176
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by cyberfish View Post
    @iMalc:

    A switch will work, too, no? Should be the same thing if it's compiled into a jump table.

    Code:
    int length;
    f (num1 < 10000) {
            if (num1 < 10) length = 2;
            if (num1 < 100) length = 3;
            if (num1 < 1000) length = 4;
    ..
    
    switch (length)
    Yup, that's the way I considered doing it, then having a bunch of cases with no break between them.
    If there were a fast integer log10 function then it'd be sweet, but otherwise since the code was already going for speed over readability I got down and dirty with gotos.

    In case anyone wants more information on how I've performed divisions using multiply and shift. The theory is that instead of doing answer = x / y you are doing answer = ((x << n) / y) >> n (the shifts logically cancel out). Rewritting the shift as a multiply by a power of two answer = (x * 2^n / y) >> n and all I'm doing is pre-calculating 2^n / y. From there it's simply a matter of picking a large enough n whilst avoiding overflow, and making sure you round up.
    Take this (num2*5243)>>19 it performs a divide by 100 for any number up to 10000...

    To pick those magic numbers I first picked a power of two that was large enough that when divided by the divisor it was at least half as big as the number I want to support and that when multiplied by answer, doesn't exceed 2^32
    2^19 = 524288
    divide by divisor -> 524288 / 100 = 5242.88 -> 5243 (rounding up) (at least half as big as 10000 - check)
    multiplied by quotient -> 5243 * 10000 = 52430000 (less than 2^32 - check)

    I could instead have picked 2^20 giving (num2*10486)>>20
    2^20 = 1048576
    divide by divisor -> 1048576/ 100 = 10485.76 -> 10486 (rounding up) (at least half as big as 10000 - check)
    multiplied by quotient -> 10486 * 10000 = 104860000 (less than 2^32 - check)
    Last edited by iMalc; 07-07-2010 at 02:50 AM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  12. #177
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    There's supposed to be a log10 version here that takes ... maybe 20-40ish cycles?
    SynthMaker Forum &bull; View topic - fast log10
    I don't know how to use it, though, but if anyone else knows, it might be of help.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  13. #178
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote Originally Posted by Elysia View Post
    You are blaming me for code that you used?
    I merely compiled your code, and I got that warning.
    I cannot be held responsible for your use of stoneage compilers and usage of non-standard compliant code.
    No, I'm blaming you for whining about the fact that different compilers, have different functions included in them. What was your first clue on that marvelous discovery?

    If you don't want to run my programs/code, don't run them, but stop your whining.

  14. #179
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    How about you try writing standard compliant code instead? Then it will work regardless of what compiler you use. Clever, huh?
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  15. #180
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by cyberfish View Post
    I get 0.279s for 10000000 iterations of 1234567890.

    But of course, that's not a fair comparison, because my compiler is vastly superior .
    Hi cyberfish,
    in order to have some realistic performance tests, try to test your
    code with random generated numbers, so the compiler has to
    really find a way to otimize, without some nasty trick about using
    always the same number. :-)

    Actually your test is about 5 times faster than the one I used last.
    It is however faster than assembly tests somebody did, so it is quite
    improbable that it is just the better compiler, something else should be
    the reason.

    With random and different numbers for each formatting we'll have more
    accurate and reliable results.

    Let me know if these miracolous results are still there after this kind of test.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Getting an error with OpenGL: collect2: ld returned 1 exit status
    By Lorgon Jortle in forum C++ Programming
    Replies: 6
    Last Post: 05-08-2009, 08:18 PM
  2. Change this program so it uses function??
    By stormfront in forum C Programming
    Replies: 8
    Last Post: 11-01-2005, 08:55 AM
  3. const at the end of a sub routine?
    By Kleid-0 in forum C++ Programming
    Replies: 14
    Last Post: 10-23-2005, 06:44 PM
  4. c++ linking problem for x11
    By kron in forum Linux Programming
    Replies: 1
    Last Post: 11-19-2004, 10:18 AM
  5. Replies: 5
    Last Post: 02-08-2003, 07:42 PM