Thread: Is there a standard function?

  1. #106
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Actually, I took a look with the profiler. I got 2 L2 cache misses.
    The rest was L2 cache hits and retired instructions (whatever that means).
    Nevertheless, it was faster.

    For the sake of it, my current code:
    Code:
    // ----------------------------------------------------------------------------------------------
    // Prog_name: comma_sep.c / ver 0.5
    // A routine for formatting integer numbers with thousand separators. 
    // Created with Pelles C for Windows 6.00.4
    //-----------------------------------------------------------------------------------------------
    // This version takes into account the sign and the possibility to have different
    // separator like space, comma, point, and so on. 
    // Moveover this version creates a function that can be tested for performance. 
    // Added the choice for right and left alignment.
    // In this version Global variables have been removed.
    //-----------------------------------------------------------------------------------------------
    // Date: 04 july 2010
    // Author: frktons @ cprogramming forum.
    //-----------------------------------------------------------------------------------------------
    
    #include <stdio.h>
    #include <time.h>
    #include <math.h>
    #include <string.h>
    
    //-----------------------------------------------------------------------------------------------
    // Function prototype for int_format(). Takes the number to format,
    // the address of the string to fill, some switches and thousand separator. 
    // Returns nothing.
    //-----------------------------------------------------------------------------------------------
    
    inline int GetNumLength(int num)
    {
    	int len_str = 0;
    	int temp_num = num;
    	while (temp_num /= 10) len_str++;
    	return len_str + 1;
    }
    
    void int_format(int num, char * buffer, bool sign, char sep);
    
    //-----------------------------------------------------------------------------------------------
    
    char digits_table[999999][8];
    
    int main()
    {
    	for (int i = 0; i < 999999; i++)
    	{
    		for (int j = 0; j < 8; j++)
    		{
    			digits_table[i][0] = ',';
    			digits_table[i][1] = '0' + char(i / 100000);
    			digits_table[i][2] = '0' + char(i / 10000 % 10);
    			digits_table[i][3] = '0' + char(i / 1000 % 10);
    			digits_table[i][4] = ',';
    			digits_table[i][5] = '0' + char(i / 100 % 10);
    			digits_table[i][6] = '0' + char(i / 10 % 10);
    			digits_table[i][7] = '0' + char(i % 10);
    		}
    	}
    	//-----------------------------------------------------------------------------------------------
    	// Local variables. 
    	//-----------------------------------------------------------------------------------------------
    	bool sign = true;              // if the number has to be displayed with sign
    	// from the function to reflect the state
    
    	char sep = ',';                   // here I choose the separator 
    	char buffer[30] = {0};        // string array for the formatted number
    	//char alignment = ' ';         // choice to [R]ight-align or [L]eft-align the number   
    	int num = -1234567890;           // test number
    	int cycles = 0;
    
    	//int len_str = GetNumLength(num);
    
    	int x;                                  // generic integer counter
    	time_t init_time = 0;          // initial time
    	time_t end_time = 0;        // end time 
    	//alignment = 'L';               // the formatted number will be Left-aligned
    
    	int times = 500000000;      // for testing performance set this number for repetitions
    
    
    	printf("\n The value of num is: %d\n",num);
    
    	time(&init_time);
    
    	printf("\n init_time = %d \n",init_time);    
    
    	for (cycles = 0; cycles < times; cycles++) {
    		memset(buffer, ' ', sizeof(buffer));
    		int_format(num, /*len_str, */buffer, sign, sep/*, alignment*/);
    	} // end for
    
    	printf("\n The formatted value of num is: %s", buffer);
    
    	printf("\n\n");
    	time(&end_time);
    	printf(" end_time  = %d \n",end_time);    
    
    	printf("\n\n The routine test has taken about %d seconds\n", end_time - init_time);
    	memset(buffer, ' ', sizeof(buffer));
    	int_format(times, buffer, sign, sep);
    	printf("\n to perform %s cycles of the formatting function\n", buffer);
    
    	return 0;
    }
    //--------------------------------------------------------------------------------------------------------------------
    // Function int_format()
    //--------------------------------------------------------------------------------------------------------------------
    
    __forceinline void   int_format(int num, /*int length_of_num, */char *buffer, bool sign, /*bool neg, */char sep/* char alignment*/){
    
    	int remain = 0;                  // integer variable to store the remainder
    	bool neg = false;
    	const int sizeof_buffer = 30;
    
    	if (num == 0)
    	{
    		strcpy(buffer, "0");
    		return;
    	}
    	if (num < 0)
    	{
    		neg = true;
    		buffer[0] = '-';
    		num = -num; // transform number to positive if negative
    
    	}
    	else
    	{
    		if (sign) buffer[0] = '+';
    	}
    
    	int x;
    	for (x = sizeof_buffer - 1; num; x -= 8)
    	{
    		remain = num % 1000000;
    		num /= 1000000;
    		memcpy(&buffer[x - 8], digits_table[remain], 8);
    	} 
    
    	int copy_to = 0;
    	if (neg || sign) copy_to++;
    	memmove(&buffer[copy_to], &buffer[x + 1], sizeof_buffer - x - 1);
    	buffer[sizeof_buffer - x - 1] = '\0';
    
    	return;
    }
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  2. #107
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Ah! of course.

    We are running it on the same number over and over . It's not random at all.

    If you do it on a bunch of random numbers, it should be much slower.

    The 2 cache misses are probably in the first function call. The digits table would be mostly flushed out of cache by then, because of the order you are building it.

    -1234567890 takes 2 lookups.
    Last edited by cyberfish; 07-04-2010 at 05:02 PM.

  3. #108
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Let me catch up here, I've been messing with hardware and RL for a bit.

    Is this still the objective?
    I recently started studying C, and so far I didn't find a function to format
    integer numbers with thousand comma separators:
    And no, I wasn't joking about deleting some of this code, if this is the objective. I did not say to delete the entire program!

    The number you want to test on is: -1234567890, and you want to format it 500 Million times, for timing/testing purposes, right?

    I'll put together some code for it, and show you what I mean.

  4. #109
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Hmmm. Well, not feeling up to try right now.
    The OP is trying to optimize his/her 1000 separator formatting function.
    Basically we've been able to hit 8 seconds. Still trying to figure if it's possible to optimize it more. Preferably getting rid of the modulus.

    EDIT: Ran cache test again.
    I got 5 misses this time (not inlining code this time).
    The lines that were causing it were:

    Code:
     	Address     	Line 	Source                                       	Code Bytes 	Ret inst 	L2 fill/writeback 	L2 requests 	L2 misses 	
     	0x13f1f1290 	107  	                                             	           	16966    	0                 	2           	1         	
     	            	108  	__forceinline void   int_format(...){ 	           	0        	0                 	0           	0         	
     	0x13f1f12ab 	131  	    int x;                                   	           	22588    	1                 	0           	2         	
     	            	132  	    for (x = sizeof_buffer - 1; num; x -= 8) 	           	0        	0                 	0           	0         	
    
    1 file, 1 function, 4 lines, 0 instructions, Summary: 39560 samples, 26.00% of module samples, 13.84% of total samples
    So it probably has something to do with the assembly, and that makes it tricky to find the actual cause.
    Last edited by Elysia; 07-04-2010 at 05:11 PM.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  5. #110
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    You'd think sprintf or something would make this a whole lot easier to implement.

  6. #111
    Registered User
    Join Date
    Jan 2009
    Posts
    1,485
    Didn't someone post a printf solution earlier? It seems to be gone now.

  7. #112
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    @Elysia
    My compiler says the following are illegal:
    Code:
    			digits_table[i][1] = '0' + char(i / 100000);
    			digits_table[i][2] = '0' + char(i / 10000 % 10);
    			digits_table[i][3] = '0' + char(i / 1000 % 10);
    			digits_table[i][4] = ',';
    			digits_table[i][5] = '0' + char(i / 100 % 10);
    			digits_table[i][6] = '0' + char(i / 10 % 10);
    			digits_table[i][7] = '0' + char(i % 10);
    Are these standard C99 instructions? :-(

    @Adak
    This is a good idea.
    Let me catch up here, I've been messing with hardware and RL for a bit.

    Is this still the objective?
    Quote:
    I recently started studying C, and so far I didn't find a function to format
    integer numbers with thousand comma separators:
    And no, I wasn't joking about deleting some of this code, if this is the objective. I did not say to delete the entire program!

    The number you want to test on is: -1234567890, and you want to format it 500 Million times, for timing/testing purposes, right?

    I'll put together some code for it, and show you what I mean.
    @cyberfish
    Yes, that is one of the options I was thinking about, we need to handle the last remainder
    if less than 100, with leading blanks/zeroes. :-)
    One optimization I would do is to precompute a mapping of integers 0-999 to their 3 characters strings.

    That would take 1000*3 = ~3KB of memory. No null terminators needed because you know they are exactly 3 characters long.

    Then you can do % 1000 instead of % 10. I'm guessing that will make it at least twice as fast.

    It will also make the logic simpler (since you want a comma every 3 digits), and eliminate a bunch of branching, which are also slow with modern processors.

    And switching to a modern compiler will make this all not matter .

    @others
    If you have viable alternatives, please feel free to partecipate with
    your code and suggestions, not to speak about testing it and posting
    the results you get. :-)
    Last edited by frktons; 07-05-2010 at 10:16 AM.

  8. #113
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Well, sometimes a little math and some recursion can be a beautiful thing.

    Code:
    #include <stdio.h>
    #include <math.h>
    
    void bar (unsigned long n , char sep)
    {
        if (n > 0UL) {
            bar (n / 1000UL , ',');
            printf ("%lu%c" , n % 1000UL , sep);
            /* If you can print it you can copy it */
        }
    }
    
    void foo (long n)
    {
        unsigned long m = n;
        if (n < 0) {
            m = n * -1L;
            printf ("%c" , '-');
        }
    
        bar (m , '\n');
    }
    
    int main (void)
    {
        foo (-1500300L);
        foo (1500300L);
        return 0;
    }
    The sprintf version is not much different, but I have to leave you something to figure out. Can't be doing your homework.
    Last edited by whiteflags; 07-05-2010 at 12:39 AM.

  9. #114
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Eylsia, you've made the classic mistake of not handling -2147483648 (0x80000000) correctly.
    When you find the number is negative and negate it, put the result into an unsigned number and deal with only unsigned variables from then on.

    You also don't need temp_num as well as the parameter num, in GetNumLength. Since it's passed by value there is no difference between them. Just modify num (which will of course be unsigned right )

    You do realise that a 7MB lookup table is absurd right?
    Last edited by iMalc; 07-05-2010 at 12:15 AM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  10. #115
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Wow! Who'd a thunk - a race to print 500 million huge numbers, with comma's!!

    This one takes a smarter route, and only prints the big number once, unless it has changed. That brings the time down to 1-2 seconds.

    Code:
    /*
    
    I recently started studying C, and so far I didn't find a function to format
    integer numbers with thousand comma separators:
    
    The number you want to test on is: -1234567890, and you want to format it 
    500 Million times, for timing/testing purposes, right?
    
    -2147483648 for num, fails :(
     
    */
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    int main() {
      clock_t start, stop;
      char neg; 
      char s[15] = { '\0' };
      
      long num = -123, orig_num=0; 
      unsigned long i, num2;
    /* for testing 
      num = -1234;
      num = -12345;
      num =  -123456;
      num =  1234567;
      num =  -12345678;
      num =  123456789;
    */
      num =  -1234567890; //1,234,567,890
    //  num =  -2147483648; //this fails 2147483648 * -1 > 9999999999 ??
    
      printf("\n\n");
      start=clock();
      for(i=0;i<500000000;i++) {
        if(num==orig_num)
          continue;
        if(num<0) {
          neg='-';
          num2=(num* -1);
        }
        else {
          neg=' ';
          num2=num;
        }
        ultoa(num2, s, 10);
        printf("\n%c%s", neg,s);
        putchar('\n');
        putchar(neg);
        if(num2>9999999999) {   //10 Billion to 99 Billion
          cprintf("%c%c,%c%c%c,%c%c%c,%c%c%c", 
          s[0],s[1],s[2],s[3],s[4],s[5],s[6],s[7],s[8],s[9],s[10]);
        }
        else if(num2>999999999) {   //1 Billion to 9 Billion
          cprintf("%c,%c%c%c,%c%c%c,%c%c%c", 
          s[0],s[1],s[2],s[3],s[4],s[5],s[6],s[7],s[8],s[9]);
        }
        else if(num2>99999999) { //100 Million to 999 Million (US)
          cprintf("%c%c%c,%c%c%c,%c%c%c", 
          s[0],s[1],s[2],s[3],s[4],s[5],s[6],s[7],s[8]);
        }
        else if(num2>9999999) { //10 Million to 99 Million (US)
          cprintf("%c%c,%c%c%c,%c%c%c", s[0],s[1],s[2],s[3],s[4],s[5],s[6],s[7]);
        }
        else if(num2>999999) { //1 Million to 9 Million (US)
          cprintf("%c,%c%c%c,%c%c%c", s[0],s[1],s[2],s[3],s[4],s[5],s[6]);
        }
        else if(num2>99999) { //100 Thousand to 999 Thousand (US)
          cprintf("%c%c%c,%c%c%c", s[0],s[1],s[2],s[3],s[4],s[5]);
        }
        else if(num2>9999) { //10 Thousand to 99 Thousand (US)
          cprintf("%c%c,%c%c%c", s[0],s[1],s[2],s[3],s[4]);
        }
        else if(num2>999) { //1 Thousand to 9 Thousand (US)
          cprintf("%c,%c%c%c", s[0],s[1],s[2],s[3]);
        }
        else  //less than 1 Thousand
          cprintf("%s", s);
        orig_num=num;
        //if num is to be changed, that
        //code would go right here
      }
      stop = clock();
      printf("\nElapsed Time: %f", (stop-start)/CLK_TCK);
      printf("\n\n\t\t\t     press enter when ready");
    
      i = getchar(); ++i;
      return 0;
    }
    @iMalc, what's the solution to this 2147483648 * -1 > 9999999999 problem?
    Last edited by Adak; 07-05-2010 at 04:23 AM.

  11. #116
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by frktons View Post
    @Elysia
    My compiler says the following are illegal:
    Code:
    			digits_table[i][1] = '0' + char(i / 100000);
    			digits_table[i][2] = '0' + char(i / 10000 % 10);
    			digits_table[i][3] = '0' + char(i / 1000 % 10);
    			digits_table[i][4] = ',';
    			digits_table[i][5] = '0' + char(i / 100 % 10);
    			digits_table[i][6] = '0' + char(i / 10 % 10);
    			digits_table[i][7] = '0' + char(i % 10);
    Are these standard C99 instructions? :-(
    No.
    Does
    Code:
    digits_table[i][1] = '0' + (char)(i / 100000)
    Work?

    Also, please use quotes for quoting text and not code tags.

    Quote Originally Posted by iMalc View Post
    Eylsia, you've made the classic mistake of not handling -2147483648 (0x80000000) correctly.
    When you find the number is negative and negate it, put the result into an unsigned number and deal with only unsigned variables from then on.

    Aye? Does it matter? If so, how?

    You also don't need temp_num as well as the parameter num, in GetNumLength. Since it's passed by value there is no difference between them. Just modify num (which will of course be unsigned right )
    I know. I just broke out some code and copied it into the function.
    But the compiler will probably optimize it anyway. Plus it's test code and time isn't spent there.

    You do realise that a 7MB lookup table is absurd right?
    True, well, maybe. It was a test and it turned out to be faster for this particular number.
    But I think I've made a classic mistake. I've only tried for one particular input.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  12. #117
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    Does it matter? If so, how?
    O_o

    Would the answer to these questions be more obvious if your code didn't handle a different integer value correctly?

    Soma

  13. #118
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Could you please add some comments about your code
    in order to help C n00bs like me to get a better idea of what
    and why you use a kind of code instead of a different one?

    Thanks

  14. #119
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    What part are you confused about?
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  15. #120
    Registered User
    Join Date
    Jun 2010
    Posts
    182
    Quote Originally Posted by Elysia View Post
    What part are you confused about?
    Whiteflags and Adak, for example, posted alternative ways of
    doing the same task, and I don't know some of the code they used,
    the syntax is only partially clear for me.

    The L or UL terminated numbers are quite new for me, and they
    just printed the result, not explaining what performance and why they
    obtained that. The recursive function is another new stuff I'm not accustomed
    with. It is partially clear though.

    Those are really interesting ways of coding that require some studying from
    me, and I'll be glad to do it as soon as I get the time to do it. At the moment I hope
    during the week-end I'll get some free time :-)

    At the same time if the code is followed with some explanation, it
    could be a lot easier and give me some good new point of view as well.

    Experienced programmers tend to be quick to code and slow to explain.
    I cannot blame them, but I dare to ask. ;-)

    Adak code, gives me these messages:
    Code:
    C:\comma_sep_adak\comma_sep.c(48): warning #2027: Missing prototype for 'ultoa'.
    C:\comma_sep.c(53): warning #2027: Missing prototype for 'cprintf'.
    C:\comma_sep.c(57): warning #2027: Missing prototype for 'cprintf'.
    C:\\comma_sep.c(61): warning #2027: Missing prototype for 'cprintf'.
    C:\comma_sep.c(65): warning #2027: Missing prototype for 'cprintf'.
    C:\comma_sep.c(68): warning #2027: Missing prototype for 'cprintf'.
    C:\comma_sep.c(71): warning #2027: Missing prototype for 'cprintf'.
    C:\comma_sep.c(74): warning #2027: Missing prototype for 'cprintf'.
    C:\comma_sep.c(77): warning #2027: Missing prototype for 'cprintf'.
    C:\comma_sep.c(80): warning #2027: Missing prototype for 'cprintf'.
    C:\comma_sep.c(86): error #2048: Undeclared identifier 'CLK_TCK'.
    C:\comma_sep.c(86): warning #2234: Argument 2 to 'printf' does not match the format
    string; expected 'double' but found 'unsigned int'.
    when I try to compile it, so I'm just spaced out.

    If the code posted is not standard ANSI C, how could I understand what are you
    doing? How could it help me? :-(

    Whiteflags's code compiles and runs just fine. But what about performance?
    Why is it better or smarter? what tricks is it using to get a better performance?
    What system are they using? At least this last example is ANSI C standard. :-)
    Last edited by frktons; 07-05-2010 at 02:57 PM.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Getting an error with OpenGL: collect2: ld returned 1 exit status
    By Lorgon Jortle in forum C++ Programming
    Replies: 6
    Last Post: 05-08-2009, 08:18 PM
  2. Change this program so it uses function??
    By stormfront in forum C Programming
    Replies: 8
    Last Post: 11-01-2005, 08:55 AM
  3. const at the end of a sub routine?
    By Kleid-0 in forum C++ Programming
    Replies: 14
    Last Post: 10-23-2005, 06:44 PM
  4. c++ linking problem for x11
    By kron in forum Linux Programming
    Replies: 1
    Last Post: 11-19-2004, 10:18 AM
  5. Replies: 5
    Last Post: 02-08-2003, 07:42 PM