Thread: Is there a standard function?

  1. #46
    Registered User
    Join Date
    Jun 2010
    Posts
    182

    Formatting integer numbers

    I added a choice between right-alignment and left-alignment
    of the formatted number. That brings us to version 0.4.
    The new version:
    Code:
    // ----------------------------------------------------------------------------------------------
    // Prog_name: comma_sep.c / ver 0.4
    // A routine for formatting integer numbers with thousand separators. 
    // Created with Pelles C for Windows 6.00.4
    //-----------------------------------------------------------------------------------------------
    // This version takes into account the sign and allows different
    // separators such as space, comma, point, and so on. 
    // Moreover, this version creates a function that can be tested for performance. 
    // Added the choice of right or left alignment.
    //-----------------------------------------------------------------------------------------------
    // Date: 03 july 2010
    // Author: frktons @ cprogramming forum.
    //-----------------------------------------------------------------------------------------------
    
    #include <stdio.h>
    #include <stdbool.h>
    #include <time.h>
    #undef getchar
    #pragma comment(lib, "\\masm32\\lib\\msvcrt.lib")
    
    //-----------------------------------------------------------------------------------------------
    // Global variables. In future versions they will go into a header file
    //-----------------------------------------------------------------------------------------------
        bool sign = true;              // if the number has to be displayed with sign
        bool neg = false;              // if the number is negative this flag will be set
                                                  // from the function to reflect the state
        bool zero = false;             // if the number is zero the function sets this
                                                  // flag for its decisions
        char sep = ',';                   // here I choose the separator 
        char buffer[15] = {' '};        // string array for the formatted number
        char alignment = 'R';       // choice to [R]ight-align or [L]eft-align the number  
    
    //-----------------------------------------------------------------------------------------------
    // Function prototype for int_format(). Takes the number to format
    // and fills the global buffer. Returns nothing.
    //-----------------------------------------------------------------------------------------------
    
    void int_format(int num);
    
    //-----------------------------------------------------------------------------------------------
    
    int main(int argc, char *argv[])
    {
     
        int num = -123456;          // test number
        int cycles = 0;
    
    
        int x;                                  // generic integer counter
        time_t init_time = 0;          // initial time
        time_t end_time = 0;        // end time 
        alignment = 'L';               // the formatted number will be Left-aligned
     
        int times = 50000000;      // for testing performance set this number for repetitions
    
    
        printf("\n The value of num is: %d\n",num);
    
        time(&init_time);
    
        printf("\n init_time = %ld \n", (long)init_time);
    
        for (cycles = 0; cycles < times; cycles++) {
            for (x=0; x < 14; x++)
                buffer[x] = ' '; 
            int_format(num);
        } // end for
        
        printf("\n The formatted value of num is: ");
        for (x = 0; x < 14; x++)
        {
            if (buffer[x] == '+' || buffer[x] == '-' || buffer[x] == sep || (buffer[x] >= '0' && buffer[x] <= '9'))
                printf("%c",buffer[x]);
        }
        printf("\n\n");
        time(&end_time);
        printf(" end_time  = %ld \n", (long)end_time);
    
        printf("\n\n The routine test has taken about %ld seconds\n", (long)(end_time - init_time));
        for (x=0; x < 14; x++)
            buffer[x] = ' '; 
    //    buffer[14] = '\0';
        neg = false;
        sign = false;
        int_format(times);
        printf("\n to perform ");
        for (x = 0; x < 14; x++)
        {
            if (buffer[x] == '+' || buffer[x] == '-' || buffer[x] == sep || (buffer[x] >= '0' && buffer[x] <= '9'))
                printf("%c",buffer[x]);
        }	
        printf(" cycles of the formatting function");
        getchar();
        return 0;
    }
    //------------------------------------------------------------------------------------------------
    // Function int_format()
    //------------------------------------------------------------------------------------------------
    
    void   int_format(int num){
    
        int x = 0;
        int y = 0;
        int remain = 0;                  // integer variable to store the remainder
        int count = 0;                    // integer counter for positioning the separator  
        char digit[10] = {'0','1','2','3','4','5','6','7','8','9'};   // the digits to display in char shape 
        int len_str = 14;                 // string length less the terminating NUL
    
        if (num != 0) 
        {
            if (num < 0)
            {
                neg = true;
                num = num * -1 ; // transform number to positive if negative
            }
            for (x = len_str; x >= 0; x--)
             {
                if (num == 0) 
                    break;
                if (count == 3)
    	   {
    	       count = 0;
    	       buffer[x] = sep;
    	       x--;
                }
    
                remain = num % 10;
                num = num / 10;
                buffer[x] = digit[remain];
                count++;
              }  
        }
        else
        {
            buffer[len_str] = '0';
            zero = true;
        }
    
        if (sign == true && neg == true)
            buffer[x] = '-';
        else if (sign == true && zero == false)
            buffer[x] = '+';
    
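         if  (alignment  == 'L') {
             if (x == 0)
                 return;
             // shift the text to the start, reading no further than buffer[len_str]
             for (y = 0; x <= len_str; y++, x++)
                 buffer[y] = buffer[x];
             // blank the vacated tail so no stale characters are left behind
             for (; y <= len_str; y++)
                 buffer[y] = ' ';
         }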
    
        return;
    
    }

    The current performance, before optimization, is:

    Code:
     The value of num is: -123456
    
     init_time = 1278177849
    
     The formatted value of num is: -123,456
    
     end_time  = 1278177853
    
    
     The routine test has taken about 4 seconds
    
     to perform 50,000,000 cycles of the formatting function
    Looking forward to version 0.5 :-)

  2. #47
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Looking forward to seeing the global variables eliminated.
    Also, there is a standard trick: to convert a digit to a char, just do '0' + digit. No need for a lookup array.
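    A minimal sketch of that trick, assuming a single decimal digit in the range 0..9. The helper name digit_to_char is made up for illustration. The C standard guarantees that the characters '0' through '9' have consecutive values, so the addition is portable:

```c
/* Hypothetical helper: convert one decimal digit (0..9) to its character. */
static char digit_to_char(int digit)
{
    /* '0'..'9' are guaranteed to be consecutive in every C character set. */
    return (char)('0' + digit);
}
```

    For example, digit_to_char(7) yields the character '7'.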
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  3. #48
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Elysia View Post
    Looking forward to seeing the global variables eliminated.
    Also, there is a standard trick: to convert a digit to a char, just do '0' + digit. No need for a lookup array.
    Good to know. Thanks again. There will be a version with local variables
    as well, in the not-too-distant future. :-)
    Adding an integer to a char surely works, but I expect it to be slower than
    copying a byte from the lookup array to the string array.
    If performance were my main concern I would not use it.
    I'm probably done adding features to the function, so from now
    on I can concentrate on optimization and standardization.
    Last edited by frktons; 07-04-2010 at 12:23 AM.

  4. #49
    Registered User
    Join Date
    Jun 2010
    Posts
    182

    Formatting integer numbers

    And now we are about half-way: version 0.5.
    I've implemented the suggestions some of you gave
    about global and local variables, and replaced the lookup array
    with the '0' + digit trick Elysia pointed out.

    The performance is different, but I'll only be sure about that
    once I count CPU cycles.

    Code:
    // ----------------------------------------------------------------------------------------------
    // Prog_name: comma_sep.c / ver 0.5
    // A routine for formatting integer numbers with thousand separators. 
    // Created with Pelles C for Windows 6.00.4
    //-----------------------------------------------------------------------------------------------
    // This version takes into account the sign and allows different
    // separators such as space, comma, point, and so on. 
    // Moreover, this version creates a function that can be tested for performance. 
    // Added the choice of right or left alignment.
    // In this version Global variables have been removed.
    //-----------------------------------------------------------------------------------------------
    // Date: 04 july 2010
    // Author: frktons @ cprogramming forum.
    //-----------------------------------------------------------------------------------------------
    
    #include <stdio.h>
    #include <stdbool.h>
    #include <time.h>
    #undef getchar
    #pragma comment(lib, "\\masm32\\lib\\msvcrt.lib")
    
    
    
    //-----------------------------------------------------------------------------------------------
    // Function prototype for int_format(). Takes the number to format,
    // the address of the string to fill, some switches and thousand separator. 
    // Returns nothing.
    //-----------------------------------------------------------------------------------------------
    
    void int_format(int num, char *, bool sign, bool neg, char sep, char alignment);
    
    //-----------------------------------------------------------------------------------------------
    
    int main(int argc, char *argv[])
    {
    
    //-----------------------------------------------------------------------------------------------
    // Local variables. 
    //-----------------------------------------------------------------------------------------------
        bool sign = true;              // if the number has to be displayed with sign
        bool neg = false;              // if the number is negative this flag will be set
                                                  // from the function to reflect the state
    
        char sep = ',';                   // here I choose the separator 
        char buffer[15] = {' '};        // string array for the formatted number
        char alignment = ' ';         // choice to [R]ight-align or [L]eft-align the number   
        int num = -123456;           // test number
        int cycles = 0;
    
    
        int x;                                  // generic integer counter
        time_t init_time = 0;          // initial time
        time_t end_time = 0;        // end time 
        alignment = 'L';               // the formatted number will be Left-aligned
     
        int times = 50000000;      // for testing performance set this number for repetitions
    
    
        printf("\n The value of num is: %d\n",num);
    
        time(&init_time);
    
        printf("\n init_time = %ld \n", (long)init_time);
    
        for (cycles = 0; cycles < times; cycles++) {
            for (x=0; x < 14; x++)
                buffer[x] = ' '; 
            int_format(num, buffer, sign, neg, sep, alignment);
        } // end for
        
        printf("\n The formatted value of num is: ");
        for (x = 0; x < 14; x++)
        {
            if (buffer[x] == '+' || buffer[x] == '-' || buffer[x] == sep || (buffer[x] >= '0' && buffer[x] <= '9'))
                printf("%c",buffer[x]);
            if (buffer[x] == '\0')
               break;
        }
        printf("\n\n");
        time(&end_time);
        printf(" end_time  = %ld \n", (long)end_time);
    
        printf("\n\n The routine test has taken about %ld seconds\n", (long)(end_time - init_time));
        for (x=0; x < 14; x++)
            buffer[x] = ' '; 
    //    buffer[14] = '\0';
        neg = false;
        sign = false;
        int_format(times, buffer, sign, neg, sep, alignment);
        printf("\n to perform ");
        for (x = 0; x < 14; x++)
        {
            if (buffer[x] == '+' || buffer[x] == '-' || buffer[x] == sep || (buffer[x] >= '0' && buffer[x] <= '9'))
                printf("%c",buffer[x]);
            if (buffer[x] == '\0')
               break;
        }	
        printf(" cycles of the formatting function");
        getchar();
        return 0;
    }
    //--------------------------------------------------------------------------------------------------------------------
    // Function int_format()
    //--------------------------------------------------------------------------------------------------------------------
    
    void   int_format(int num, char *buffer, bool sign, bool neg, char sep, char alignment){
    
        int x = 0;
        int y = 0;
        int remain = 0;                  // integer variable to store the remainder
        int count = 0;                    // integer counter for positioning the separator  
        bool zero = false;             // if the number is zero the function sets this
                                                  // flag for its decisions
    //    char digit[10] = {'0','1','2','3','4','5','6','7','8','9'};   // the digits to display in char shape 
        int len_str = 14;                 // string length less the terminating NUL
    
        if (num != 0) 
        {
            if (num < 0)
            {
                neg = true;
                num = num * -1 ; // transform number to positive if negative
            }
            for (x = len_str; x >= 0; x--)
             {
                if (num == 0) 
                    break;
                if (count == 3)
    	   {
    	       count = 0;
    	       buffer[x] = sep;
    	       x--;
                }
    
                remain = num % 10;
                num = num / 10;
    //            buffer[x] = digit[remain];
                buffer[x] = '0' + remain;
                count++;
              }  
        }
        else
        {
            buffer[len_str] = '0';
            zero = true;
        }
    
        if (sign == true && neg == true)
            buffer[x] = '-';
        else if (sign == true && zero == false)
            buffer[x] = '+';
    
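         if  (alignment  == 'L') {
             if (x == 0)
                 return;
             // shift the text to the start, reading no further than buffer[len_str]
             for (y = 0; x <= len_str; y++, x++)
                 buffer[y] = buffer[x];
             // blank the vacated tail so no stale characters are left behind
             for (; y <= len_str; y++)
                 buffer[y] = ' ';
         }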
    
        return;
    
    }
    The performance gap:

    Code:
                             Testing version : 0.40
    
                             ----------------------
    
     The value of num is: -1234567890
    
     init_time = 1278231197
    
     The formatted value of num is: -1,234,567,890
    
     end_time  = 1278231263
    
    
     The routine test has taken about 66 seconds
    
     to perform 500,000,000 cycles of the formatting function
    versus:

    Code:
                             Testing version : 0.50
    
                             ----------------------
    
     The value of num is: -1234567890
    
     init_time = 1278231354
    
     The formatted value of num is: -1,234,567,890
    
     end_time  = 1278231447
    
    
     The routine test has taken about 93 seconds
    
     to perform 500,000,000 cycles of the formatting function
    Version 0.4 took 66 seconds against 93 for version 0.5, so it needs about 29% less time for the same work (equivalently, version 0.5 takes about 41% longer).
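    The timings above come from time(), which has one-second resolution, and from a single run each. Here is a rough sketch of a finer-grained harness, assuming the standard clock() function is precise enough on the target platform. The names time_runs and sample_work are hypothetical, and sample_work is only a stand-in for the formatting routine:

```c
#include <time.h>

/* Stand-in workload (hypothetical): counts the decimal digits of num. */
static int sample_work(int num)
{
    int digits = 0;
    do {
        digits++;
        num /= 10;
    } while (num != 0);
    return digits;
}

/* Runs work(num) reps times and returns the CPU time used, in seconds. */
static double time_runs(int (*work)(int), int num, long reps)
{
    volatile int sink = 0;   /* keeps the optimizer from removing the calls */
    clock_t start = clock();
    for (long i = 0; i < reps; i++)
        sink = work(num);
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

    Calling time_runs() several times per version and comparing the smallest results should reduce the noise from other processes.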
    Last edited by frktons; 07-04-2010 at 02:22 AM.

  5. #50
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Your best version is still to come. It will feature the result of several keypresses on that big key with the left facing arrow on it. (Yes, on the upper right hand side of the keyboard).

    Not only will it improve the performance, and make your program smaller, but it will remove redundancies like alignment, that are already a part of the printf() format specifiers.
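    For reference, a small sketch of the printf() feature mentioned here: a field width right-aligns a string, and the - flag left-aligns it. The helper show_aligned and the width of 15 are assumptions, chosen to match the buffer size used in this thread:

```c
#include <stdio.h>

/* Hypothetical helper: writes text into out, padded to 15 columns and
   wrapped in brackets so the padding is visible.
   left != 0 selects left alignment, otherwise right alignment.
   Returns what snprintf() returns (the length that was needed). */
static int show_aligned(char *out, size_t out_size, const char *text, int left)
{
    if (left)
        return snprintf(out, out_size, "[%-15s]", text);
    return snprintf(out, out_size, "[%15s]", text);
}
```

    With this, the formatting function only has to produce the digits and separators; the padding is left to the format string.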

  6. #51
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    You're still missing the sizeof_buffer argument. That is a no-no. You MUST have it or you must not use C.
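    One way the signature could carry the buffer size, as a sketch only: format_int_sep and its parameter names are hypothetical, not the code posted above. It builds the text in a local scratch array and refuses to write if the result would not fit:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of int_format() that is told how big the buffer is.
   Writes num with thousand separators into out (left-aligned, NUL-terminated).
   Returns the text length, or -1 if out_size is too small. */
static int format_int_sep(int num, char sep, char *out, size_t out_size)
{
    char tmp[32];                 /* ample for any 32- or 64-bit int */
    size_t pos = sizeof tmp;      /* tmp is filled from the right */
    /* Work with the magnitude as unsigned so INT_MIN does not overflow. */
    unsigned int mag = (num < 0) ? 0u - (unsigned int)num : (unsigned int)num;
    int count = 0;

    do {
        if (count == 3) {         /* separator after every three digits */
            tmp[--pos] = sep;
            count = 0;
        }
        tmp[--pos] = (char)('0' + mag % 10);
        mag /= 10;
        count++;
    } while (mag != 0);

    if (num < 0)
        tmp[--pos] = '-';

    size_t len = sizeof tmp - pos;
    if (len + 1 > out_size)       /* +1 for the terminating NUL */
        return -1;
    memcpy(out, tmp + pos, len);
    out[len] = '\0';
    return (int)len;
}
```

    A caller would pass sizeof buffer as the last argument, so the function never has to assume a length of 14.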

  7. #52
    Registered User
    Join Date
    Jun 2010
    Posts
    182

    Formatting integer numbers

    Quote Originally Posted by Adak View Post
    Your best version is still to come. It will feature the result of several keypresses on that big key with the left facing arrow on it. (Yes, on the upper right hand side of the keyboard).

    Not only will it improve the performance, and make your program smaller, but it will remove redundancies like alignment, that are already a part of the printf() format specifiers.
    Ha ha! I really like your sense of humour. :-)
    I think I've already heard that one. Aren't you, by any chance, reinventing the joke? ;-)

    You are a very experienced C coder, and I'd appreciate some more constructive
    suggestions for my learning path, if you don't mind.

    Quote Originally Posted by Elysia
    You're still missing the sizeof_buffer argument. That is a no-no. You MUST have it or you must not use C.
    Sorry about that, but if I MUST then I WILL. :-)
    Let me study the matter; I'm still very much a beginner in C and don't know how to do that yet.

    By the way, apart from deleting the program as Adak suggested, what kind
    of performance improvement could you suggest to make the function faster?
    My current objective is to reduce the elapsed time from 66 seconds to 32 seconds or less.
    Last edited by frktons; 07-04-2010 at 11:30 AM.

  8. #53
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    First off: do you absolutely have to optimize this further?
    We'll soon be down to micro-optimizations. These may be specific to particular machines, can hurt code readability, and may even involve assembly.

  9. #54
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Elysia View Post
    First off: do you absolutely have to optimize this further?
    We'll soon be down to micro-optimizations. These may be specific to particular machines, can hurt code readability, and may even involve assembly.
    I'm afraid I'm going that way, but not before trying to write a better algorithm
    or using better C options. Optimization is, in fact, the main aim of this experiment.
    As you surely know, being a far more experienced C programmer than I am, there are
    already many itoa() implementations and the like. None is as complete as the function I'm
    trying to assemble, but some come close.

    I'm using Windows, and I'm going to study and use the Win32 APIs.
    Assembly, when needed, should be used as well. If portability is not a
    concern, those are viable options too.
    Last edited by frktons; 07-04-2010 at 11:38 AM.

  10. #55
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    After looking at the source in a profiler, a few lines come up as bottlenecks.
    Three specific lines are:

    remain = num % 10;
    num = num / 10;
    buffer[x] = '0' + (char)remain;

    This requires some intense assembly or bit manipulation to optimize. I'm not sure how.

  11. #56
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Elysia View Post
    After looking at the source in a profiler, a few lines come up as bottlenecks.
    Three specific lines are:

    remain = num % 10;
    num = num / 10;
    buffer[x] = '0' + (char)remain;

    This requires some intense assembly or bit manipulation to optimize. I'm not sure how.
    Well, for the third one we already know:
    better to use a lookup array.
    The others are a little trickier to manage.
    I'll think about that. Maybe the C function that divides an integer and returns
    the remainder as well could be faster?
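    The standard function in question is presumably div() from <stdlib.h>, which returns quotient and remainder together in a div_t. Below is a sketch of how a digit loop could use it; the helper digits_reversed is hypothetical. Whether this is any faster is untested here: many compilers already combine an adjacent / and % on the same operands, so it would need measuring.

```c
#include <stdlib.h>

/* Hypothetical helper: stores the decimal digits of a non-negative num in
   out, least significant first (no terminator is written).
   Returns the number of digits, or -1 if out_size is too small. */
static int digits_reversed(int num, char *out, size_t out_size)
{
    size_t n = 0;
    do {
        if (n >= out_size)
            return -1;             /* caller's buffer is too small */
        div_t qr = div(num, 10);   /* qr.quot = num / 10, qr.rem = num % 10 */
        out[n++] = (char)('0' + qr.rem);
        num = qr.quot;
    } while (num != 0);
    return (int)n;
}
```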
    Last edited by frktons; 07-04-2010 at 11:40 AM.

  12. #57
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by frktons View Post
    better to use a lookup array.
    Not necessarily. With the integer math, that line takes up 4% of the time, and the addition translates to a single instruction.

  13. #58
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Elysia View Post
    Not necessarily. With the integer math, that line takes up 4% of the time, and the addition translates to a single instruction.
    Let's find another solution then. :-)

    With the lookup array, version 0.5 (with local variables) takes 79 seconds,
    versus 93 seconds with buffer[x] = '0' + (char)remain;. Better to stick with
    the array then.

    Code:
                             Testing version : 0.50
    
                             ----------------------
    
     The value of num is: -1234567890
    
     init_time = 1278265691
    
     The formatted value of num is: -1,234,567,890
    
     end_time  = 1278265770
    
    
     The routine test has taken about 79 seconds
    
     to perform 500,000,000 cycles of the formatting function
    About the parameter you referred to in a previous post, I assumed:
    Code:
        int len_str = 14;                 // string length less the terminating NUL
    in the body of the function, because I already know the maximum length the
    string can have.
    Last edited by frktons; 07-04-2010 at 11:56 AM.

  14. #59
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Did you run that test enough times? The results may fluctuate.
    It also depends heavily on whether the array is in the cache or not.
    Another idea is to use log10 to get the length of the number and skip the whole aligning step. I don't know if it's faster.
    Another idea is to try dynamic programming: take your number, divide it by 10, storing the result in a double or float, and save the remainder and the result in arrays, all the while also determining the length of the number so you can skip the aligning step.
    Then use these arrays instead of performing the calculations later. These three combined might save time.
    The only problem is getting the remainder easily.

    Base 10 is a huge problem in computers. Were it base 16, it would be very easy.

    You can also check for 0 right away, and if it is, copy "0" into the buffer and return.
    That means you don't have to keep a zero variable and you can safely code without having to keep in mind that the number may be 0.
    Last edited by Elysia; 07-04-2010 at 11:58 AM.
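    A sketch of two of these ideas combined: handle zero up front, and work out the final length first so the digits can be written straight into their left-aligned positions with no shifting step. It counts digits with an integer loop instead of log10() to avoid floating point. That substitution, the restriction to non-negative numbers, and the name format_left are assumptions, not something tested in this thread:

```c
#include <stddef.h>

/* Hypothetical: writes a non-negative num with separators, left-aligned and
   NUL-terminated, directly into out. Returns the length, or -1 if num is
   negative or the text does not fit in out_size bytes. */
static int format_left(int num, char sep, char *out, size_t out_size)
{
    if (num < 0 || out_size < 2)
        return -1;
    if (num == 0) {               /* zero handled right away */
        out[0] = '0';
        out[1] = '\0';
        return 1;
    }

    int digits = 0;
    for (int t = num; t != 0; t /= 10)
        digits++;                 /* integer stand-in for log10 */

    int len = digits + (digits - 1) / 3;   /* digits plus separators */
    if ((size_t)len + 1 > out_size)
        return -1;

    out[len] = '\0';
    int pos = len;                /* fill backwards from the known end */
    int count = 0;
    while (num != 0) {
        if (count == 3) {
            out[--pos] = sep;
            count = 0;
        }
        out[--pos] = (char)('0' + num % 10);
        num /= 10;
        count++;
    }
    return len;
}
```

    Because the length is known before any digit is written, the text always starts at out[0] and no zero flag is needed.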

  15. #60
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Elysia View Post
    Did you run that test enough times? The results may fluctuate.
    It also depends heavily on whether the array is in the cache or not.
    Another idea is to use log10 to get the length of the number and skip the whole aligning step. I don't know if it's faster.
    Another idea is to try dynamic programming: take your number, divide it by 10, storing the result in a double or float, and save the remainder and the result in arrays, all the while also determining the length of the number so you can skip the aligning step.
    Then use these arrays instead of performing the calculations later. These three combined might save time.
    The only problem is getting the remainder easily.

    Base 10 is a huge problem in computers. Were it base 16, it would be very easy.

    You can also check for 0 right away, and if it is, copy "0" into the buffer and return.
    That means you don't have to keep a zero variable and you can safely code without having to keep in mind that the number may be 0.
    Well, yes Elysia. As far as I know, typecasting is quite a resource-consuming
    operation compared to simply moving a byte from one location to another.

    I'd keep away from float for similar reasons.

    Let me try something with register and I'll tell you if it produces any result at all.
