Thread: Testpad for itoa()

  1. #61
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Back from a computerless weekend, I feel like having some:

    Digression Time

    Before starting my considerations, I'd like to thank everyone who
    is contributing to this thread with code, suggestions, links, whatever.
    I'm not going to reply to your latest posts for the time being, but I
    will, sooner or later.

    As we know, when we use standard C functions we deal with an
    abstract machine, and this very fact gives us the degree of portability
    we need when we code for cross-platform software systems.

    On some occasions we code just for fun, as I do, or code for
    a single platform [Mac, Linux, Windows, whatever], and the
    portability of our code is not our first concern.

    So far we have seen that a very simple task like putting a thousands separator in a
    number, for readability, gives us a few choices and quite a lot to think about.

    But when I ask my Win7 OS to display
    some info about my testpad.c, guess what I get?
    Code:
    C:\>dir *.c
     Il volume nell'unità C è WIN7-64
     Numero di serie del volume: 24E7-E258
    
     Directory di C:\
    
    09/07/2010  22:00             5.493 testpad.c
                   1 File          5.493 byte
                   0 Directory  26.987.839.488 byte disponibili
    
    C:\>
    As you can see, my OS speaks Italian and displays dates and numbers
    well formatted and localized. The same happens with the GUI properties
    dialog box.

    I am not sure, but I guess any modern OS does the same.

    From a learning point of view I need to know how to solve the problem
    using standard C functions but, at the same time, I need to know my
    real machine as well. So while I intend to carry on this experiment
    of building a standard function for the given task, in a few weeks I'm going
    to move towards the real machine I'm using and have a look at the APIs
    that already do what I need, and [why not] check whether they are more or less
    efficient than the functions we have built so far.

    But the very next step is going to be:
    The slowest but still reliable standard C99 formatting function you can dream about


    A last consideration: I think a good name for the function we are building
    would be itoas() = integer to ASCII with thousands separator.

    Attached is the slowest version I could manage to code, and its output:

    Code:
                        Testing version : frk_slow
                     ------------------------------
     Testing on Intel Core 2 Duo E6600 2.4 Ghz // IA32
     OS = Windows 7 Ultimate 64 bit  --  Compiler = Pelles C 6.00.4
    
     Elapsed time: 4.883 ms to format 10.000.000 random numbers
     ----------------------------------------------------------
    
     handling 0 ---> 0
     handling 12 ---> 12
     handling -12 ---> -12
     handling 256 ---> 256
     handling -256 ---> -256
     handling 1000 --> 1.000
     handling -1000 --> -1.000
     handling 1000000 ---> 1.000.000
     handling -1000000 ---> -1.000.000
     handling 1000000000 ---> 1.000.000.000
     handling -1000000000 ---> -1.000.000.000
     handling 2147483647 -> 2.147.483.647
     handling -2147483648 -> -2.147.483.648
    And here is the whole integer range processed with iMalc's itoas():

    Code:
                        Testing version : iMalc
                     ------------------------------
     Testing on Intel Core 2 Duo E6600 2.4 Ghz // IA32
     OS = Windows 7 Ultimate 64 bit  --  Compiler = Pelles C 6.00.4
    
     Testing the numbers from: -2147483648  to  2147483647
    
     The formatting process has taken:  105,818 ms  to perform the whole cycle
    Now we need a second version, fast enough and already pre-tested,
    so we can compare the strings generated over the whole integer range and
    confirm that both do the same thing.

    If I remember correctly, the only pre-tested and fast enough second
    itoas() is Elysia's version. The next test should take some 9-10 minutes.
    Let me know if there is a faster pre-tested version that I don't remember.
    Last edited by frktons; 07-12-2010 at 03:15 PM.

  2. #62
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    The second function, to be tested against iMalc's,
    gives these results on 10 million random numbers:
    Code:
                        Testing version : Elysia
                     ------------------------------
     Testing on Intel Core 2 Duo E6600 2.4 Ghz IA32
     OS = Windows 7 Ultimate 64 bit  --  Compiler = Pelles C 6.00.4
    
     Elapsed time: 1.436 ms to  format 10.000.000 random numbers
     ------------------------------------------------------------
    
     handling 0 ---> 0
     handling 12 ---> 12
     handling 256 ---> 256
     handling 1000 --> 1.000
     handling 1000000 ---> 1.000.000
     handling 1000000000 ---> 1.000.000.000
     handling 2147483647 -> 2.147.483.647
     handling -2147483648 -> -2.147.483.648
    So it'll take about 8-9 minutes to format the whole range of
    integers, plus roughly 2 minutes for iMalc's version
    and the time for comparing each string produced by the two.

    That totals about 12 minutes, and we still won't know if all the
    formatted strings are correct.

    It looks like creating a new standard function takes a lot of time.
    And we didn't consider the options of locale.h so far.
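
    On the worry that comparing two itoas() versions still doesn't prove the strings are correct: one possible independent check is to validate each string against plain sprintf() output. The sketch below strips the separators, compares what is left with "%d", and requires a separator exactly before every group of three digits counted from the right. check_grouped is a hypothetical helper name, not something posted in this thread, and it assumes a single-character separator and a 32-bit int.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): returns 1 if s is n written
   in decimal with sep before every group of three digits, counted from
   the right, and 0 otherwise. Assumes sep is not a digit or '-'. */
int check_grouped(const char *s, int n, char sep)
{
    char plain[32];                       /* n as printed by "%d" */
    char stripped[32];                    /* s with separators removed */
    size_t len = strlen(s);
    size_t start = (s[0] == '-') ? 1 : 0; /* index of the first digit */
    size_t i, j = 0;

    if (len >= sizeof stripped)
        return 0;

    for (i = 0; i < len; i++) {
        size_t from_end = len - 1 - i;    /* 0 for the last character */

        /* Every fourth position from the right must hold a separator,
           unless that position is the leading minus sign. */
        if (from_end % 4 == 3 && !(i == 0 && s[0] == '-')) {
            if (s[i] != sep || i <= start)
                return 0;                 /* wrong char or leading separator */
        } else {
            stripped[j++] = s[i];
        }
    }
    stripped[j] = '\0';

    sprintf(plain, "%d", n);
    return strcmp(plain, stripped) == 0;
}
```

    Calling check_grouped(buffer, n, '.') after each itoas() call costs one extra sprintf() per number, so it is slow, but it doesn't depend on any other hand-written formatter being right.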
    Last edited by frktons; 07-13-2010 at 12:00 AM.

  3. #63
    Registered User
    Join Date
    Jan 2009
    Posts
    1,485
    locale.h was mentioned before somewhere. I mentioned it earlier in another thread, but it seems that post got lost. It's possible to do this with locale.h, printf, and setting thousands_sep to "." if that is what you prefer.

    In my opinion it makes more sense to format only when you print: there is no need to deal with strings, and the numbers can be stored in their internal representation, which takes much less space. It seems to me that the time it takes to get a number on screen is the least of our worries.
    Last edited by Subsonics; 07-13-2010 at 02:01 AM.

  4. #64
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Subsonics View Post
    locale.h was mentioned before somewhere. I mentioned it earlier in another thread, but it seems that post got lost. It's possible to do this with locale.h, printf, and setting thousands_sep to "." if that is what you prefer.

    In my opinion it makes more sense to format only when you print: there is no need to deal with strings, and the numbers can be stored in their internal representation, which takes much less space. It seems to me that the time it takes to get a number on screen is the least of our worries.
    Unfortunately locale.h doesn't do anything more than change the
    decimal point to a comma for European countries that use a different
    thousands separator.

  5. #65
    Registered User
    Join Date
    Jan 2009
    Posts
    1,485
    Normally the thousands separator is a space, at least on my machine. But you can set it to whatever you want.

    Code:
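    #include <stdio.h>
    #include <locale.h>
    
    int main(void)
    {
            /* adopt the user's locale for numeric formatting */
            setlocale(LC_NUMERIC, "");
            struct lconv *locale = localeconv();
            /* note: the C standard says this struct must not be modified */
            locale->thousands_sep = ".";
    
            int value = 1234567899;
    
            /* the ' flag (group by thousands) is a POSIX extension, not ISO C */
            printf("%'d\n", value);
    
            return 0;
    }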

    This prints: 1.234.567.899
    Last edited by Subsonics; 07-13-2010 at 02:16 AM.

  6. #66
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Subsonics View Post
    Normally the thousands separator is a space, at least on my machine. But you can set it to whatever you want.

    Code:
    #include <stdio.h>
    #include <locale.h>
    
    int main()
    {
            setlocale(LC_NUMERIC, "");
            struct lconv *locale = localeconv();
            locale->thousands_sep = ".";
    
            int value = 1234567899;
    
            printf("%'d\n", value);
    
    
            return 0;
    }

    This prints: 1.234.567.899
    And the output is:
    Code:
    'd
    Press any key to continue...
    Is this what you meant?

  7. #67
    Registered User
    Join Date
    Jan 2009
    Posts
    1,485
    No, that is not what I meant. It prints 1.234.567.899 here. This was posted earlier to get the currency symbol as well; without the ' you don't get the separation. I'm using Unix, so this is the locale.h I'm using:

    http://www.opengroup.org/onlinepubs/.../locale.h.html

    Have you tried it, or are you assuming based on the code?
    Last edited by Subsonics; 07-13-2010 at 02:26 AM.

  8. #68
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Subsonics View Post
    No, that is not what I meant. It prints 1.234.567.899 here. This was posted earlier to get the currency symbol as well; without the ' you don't get the separation. I'm using Unix, so this is the locale.h I'm using:

    <locale.h>
    What compiler are you using? Is that standard C99? Let me see the link
    as well, but on my compiler it doesn't work.

  9. #69
    Registered User
    Join Date
    Jan 2009
    Posts
    1,485
    It's part of the C standard library, yes. I'm not sure about the printf single quote though; that seems to be what your system is missing.

    I use gcc version 4.2.1
    I just tried it in clang as well and it worked just the same.

  10. #70
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Subsonics View Post
    It's part of the C standard library, yes. I'm not sure about the printf single quote though; that seems to be what your system is missing.

    I use gcc version 4.2.1
    I just tried it in clang as well and it worked just the same.
    Without the single quote before the "d" I get no formatting at all:
    Code:
    1234567899
    Press any key to continue...
    Probably it is a macro extension that GCC implements, but so far
    I don't know if it works on Windows C99 compilers. Until then I have
    to stick with the building process.

  11. #71
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by frktons View Post
    Unfortunately locale.h doesn't do anything more than change the
    decimal point to a comma for European countries that use a different
    thousands separator.
    If I do:
    Code:
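    #include <stdio.h>
    #include <locale.h>
    
    int main(void)
    {
            setlocale(LC_NUMERIC, "");
            struct lconv *locale = localeconv();
            /* note: the C standard does not allow modifying this struct,
               so these two assignments cannot be relied on */
            locale->thousands_sep = ".";
            locale->decimal_point = ",";
    
            double value = 1234567.899;
    
            printf("%.3f\n", value);
    
            return 0;
    }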
    I get :
    Code:
    1234567,899
    Press any key to continue...
    That is what I already knew. Unfortunately nothing else gets
    automated with my compilers.
    Last edited by frktons; 07-13-2010 at 02:52 AM.

  12. #72
    Registered User
    Join Date
    Jan 2009
    Posts
    1,485
    Quote Originally Posted by frktons View Post
    Without the single quote before the "d" I get no formatting at all:
    Code:
    1234567899
    Press any key to continue...
    Me neither. Anyway, this is most likely a printf issue, but look into how to print based on locale settings in Windows; there must be a way.


    For some numeric conversions a radix character ('decimal point') or thousands' grouping character is used. The actual character used depends on the LC_NUMERIC part of the locale. The POSIX locale uses '.' as radix character, and does not have a grouping character. Thus,

    printf("%'.2f", 1234567.89);
    results in '1234567.89' in the POSIX locale, in '1234567,89' in the nl_NL locale, and in '1.234.567,89' in the da_DK locale.
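
    Where printf() has no ' flag, the grouping character can still be read in standard C through localeconv() and handed to a hand-written formatter. A minimal sketch, assuming only <locale.h>; get_thousands_sep is a hypothetical helper name. In the "C" locale the standard guarantees that thousands_sep is the empty string and decimal_point is "."; what setlocale(LC_NUMERIC, "") yields depends on the system.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copies the current locale's
   thousands separator into out (at most outsize - 1 characters) and
   returns the number of characters copied. An empty result means the
   locale defines no grouping character. */
size_t get_thousands_sep(char *out, size_t outsize)
{
    const struct lconv *lc = localeconv();   /* read only, never modified */
    size_t n = strlen(lc->thousands_sep);

    if (outsize == 0)
        return 0;
    if (n >= outsize)
        n = outsize - 1;                     /* truncate to fit the buffer */
    memcpy(out, lc->thousands_sep, n);
    out[n] = '\0';
    return n;
}
```

    A caller would run setlocale(LC_NUMERIC, "") once at startup, call get_thousands_sep(), and fall back to a separator of its own choice when the result is empty.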

  13. #73
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    So far I have been unable to get anything useful from
    locale.h on my C99 Windows compiler, so I'll carry on with the idea of
    testing all the possible combinations of negative and positive
    numbers inside the integer range.

    I've thought of several ways to test the correctness of the generated
    strings. One of them consists of generating the strings from
    -2,147,483,648 to +2,147,483,647 using 10 nested loops from 0 to 9, except
    the outermost one, and a lookup table of 10 char elements with the
    digits "0" to "9".

    Building the strings this way should be fast and reliable enough to
    compare them against the results I get from the itoas() function under test.

    I'll work on it and see if the task is a viable one and the performance is
    good enough to attach it to the testpad.
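
    The nested-loop idea can also be written as an odometer: an array of ten digits that is advanced with carry, plus a routine that renders it through the digit lookup table. This is only a sketch of one way to do it, not the code being tested in this thread; odometer_next, odometer_render and NDIGITS are hypothetical names, and the sign is left to the caller.

```c
#define NDIGITS 10   /* ten decimal digits, as with the ten nested loops */

/* Advance the counter by one; d[0] is the most significant digit.
   Returns 1 normally and 0 when the counter wraps around to all zeros. */
int odometer_next(unsigned char d[NDIGITS])
{
    int i;

    for (i = NDIGITS - 1; i >= 0; i--) {
        if (d[i] < 9) {
            d[i]++;
            return 1;
        }
        d[i] = 0;    /* carry into the next digit to the left */
    }
    return 0;
}

/* Render the counter as text, dropping leading zeros and writing sep
   between groups of three digits. out needs room for at least
   NDIGITS digits + 3 separators + 1 terminator = 14 characters. */
void odometer_render(const unsigned char d[NDIGITS], char sep, char *out)
{
    static const char lookup[] = "0123456789";
    int i = 0;

    while (i < NDIGITS - 1 && d[i] == 0)   /* always keep one digit */
        i++;

    for (; i < NDIGITS; i++) {
        *out++ = lookup[d[i]];
        /* a separator follows when a multiple of three digits remains */
        if (i < NDIGITS - 1 && (NDIGITS - 1 - i) % 3 == 0)
            *out++ = sep;
    }
    *out = '\0';
}
```

    Starting from all zeros and calling odometer_next() once per value keeps the reference string in step with a plain integer counter, so each rendered string can be compared with the itoas() output using strcmp(); for the negative half the caller would put a '-' in front.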

    If you have any better idea, or something to warn me about you are
    always welcome.

    Bye for now

  14. #74
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    A pre-test looks promising:

    Code:
                        Testing version : Nested_chars
                     ------------------------------------
     Testing on Intel Core 2 Duo E6600 2.4 Ghz // IA32
     OS = Windows 7 Ultimate 64 bit  --  Compiler = Pelles C 6.00.4
    
     Testing the numbers from: 0  to  4,000,000,000
    
     The generating process has taken:  12,933 ms  to perform the whole cycle
    with these nested loops:

    Code:
    for (digit[0] = 0; digit[0] < four; digit[0]++) {
        for (digit[1] = 0; digit[1] < ten; digit[1]++) {
            for (digit[2] = 0; digit[2] < ten; digit[2]++) {
                for (digit[3] = 0; digit[3] < ten; digit[3]++) {
                    for (digit[4] = 0; digit[4] < ten; digit[4]++) {
                        for (digit[5] = 0; digit[5] < ten; digit[5]++) {
                            for (digit[6] = 0; digit[6] < ten; digit[6]++) {
                                for (digit[7] = 0; digit[7] < ten; digit[7]++) {
                                    for (digit[8] = 0; digit[8] < ten; digit[8]++) {
                                        for (digit[9] = 0; digit[9] < ten; digit[9]++) {
                                            times++;
                                        } // end digit[9] for
                                    } // end digit[8] for
                                } // end digit[7] for
                            } // end digit[6] for
                        } // end digit[5] for
                    } // end digit[4] for
                } // end digit[3] for
            } // end digit[2] for
        } // end digit[1] for
    } // end digit[0] for
    Of course there are still a lot of details to deal with, but
    the whole thing should take less than 10 minutes, I guess.

  15. #75
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Code:
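    #include <stdio.h>
    #include <string.h>
    
    /* Write l into s with ',' between groups of three digits.
       s must have room for the sign, digits, separators and terminator. */
    void bar( char s[], long l )
    {
        char buf[ BUFSIZ ] = {0};
        char *p = buf, *q = s;
        size_t comma = 0, trail = 0;
        
        sprintf( buf, "%ld", l );
    
        if( *p == '-' )
            *q++ = *p++;
        
        /* copy the 0-2 leading digits that do not fill a group of three */
        for( trail = strlen( p ) % 3; trail > 0; trail-- )
        {
            *q++ = *p++;
        }
    
        for( comma = strlen( p ) / 3; comma > 0; comma-- )
        {
            /* no separator before the first group (e.g. 123 or -123456) */
            if( p != buf && p[-1] != '-' )
                *q++ = ',';
            *q++ = *p++;
            *q++ = *p++;
            *q++ = *p++;
        }
        *q = '\0';
    }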

    Quzah.
    Last edited by quzah; 07-13-2010 at 03:48 PM.
    Hope is the first step on the road to disappointment.
