Thread: isdigit implementation

  1. #1
    Registered Abuser
    Join Date
    Jun 2006
    Location
    Toronto
    Posts
    591

    isdigit implementation

    I'm looking for a quick implementation of ctype's isdigit. There's the classic range test between '0' and '9', but is there anything faster? (disregarding locale considerations)

    I know most library implementations take a different approach, using look-up tables; however I've read that depending on the processor and whether these are loaded into cache it can be either faster or slower. Can anyone add to this?

  2. #2
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Why not try them out and see?

  3. #3
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by @nthony View Post
    I'm looking for a quick implementation of ctype's isdigit. There's the classic range test between '0' and '9', but is there anything faster?
    I was interested to read that someone would thing there would be. It is hard to understand why you would want to optimize something so simple BUT, if you really want to know, try that and isdigit several million times in a loop and use either a stopwatch or difftime().
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  4. #4
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Okay, I did it. It took *a lot* more than just a few million iterations to get a meaningful duration tho:
    Code:
    #include <stdio.h>
    #include <time.h>
    #include <ctype.h>
    
    int check_digit (char c) {
    	if ((c>='0') && (c<='9')) return 1;
    	return 0;
    }
    
    int main() {
    	int i,j,r;
    	double lapse;
    	time_t start=time(NULL), end;
    	for (i=0; i<50000000; i++) {
    		for (j=33; j<127; j++) 
    			r=check_digit(j);
    			//r=isdigit(j);	
    	}
    	end=time(NULL);
    	if (end != start) lapse=difftime(end,start);
    	printf("%d seconds\n",(int)lapse);
    	return 0;
    }
    The results:

    "check_digit" (the classic method):
    4.7 billion times took 32 seconds.

    isdigit took 24 seconds.


    (Fedora Core 10 on Intel "dual" 2.2Ghz, but this is not threaded...)
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  5. #5
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    I tried a lookup table this way:
    Code:
    int lookup[128];
    
    int check_II (int c) {
    	if (c-48 == lookup[c]) return 1;
    	return 0;
    }
    
    int main() {
    	int i,j,r;
    	double lapse;
    	time_t start, end;
    
    	for (i=0; i<128; i++) lookup[i]=9999;
    	for (i=0; i<10; i++) lookup[i+48]=i;
    	start=time(NULL);
    	for (i=0; i<25000000; i++) {
    		for (j=33; j<127; j++) 
    			//r=check_digit(j);
    			r=check_II(j);
    			//r=isdigit(j);	
    	}
    	end=time(NULL);
    And, interestingly, it clocked in with the same time as the "classic" method (32 seconds).
    Last edited by MK27; 06-10-2009 at 09:58 AM. Reason: off by one
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  6. #6
    Banned ಠ_ಠ's Avatar
    Join Date
    Mar 2009
    Posts
    687
    Quote Originally Posted by MK27 View Post
    Okay, I did it. It took *a lot* more than just a few million iterations to get a meaningful duration tho:
    Code:
    #include <stdio.h>
    #include <time.h>
    #include <ctype.h>
    
    int check_digit (char c) {
    	if ((c>='0') && (c<='9')) return 1;
    	return 0;
    }
    
    int main() {
    	int i,j,r;
    	double lapse;
    	time_t start=time(NULL), end;
    	for (i=0; i<50000000; i++) {
    		for (j=33; j<127; j++) 
    			r=check_digit(j);
    			//r=isdigit(j);	
    	}
    	end=time(NULL);
    	if (end != start) lapse=difftime(end,start);
    	printf("%d seconds\n",(int)lapse);
    	return 0;
    }
    The results:

    "check_digit" (the classic method):
    4.7 billion times took 32 seconds.

    isdigit took 24 seconds.


    (Fedora Core 10 on Intel "dual" 2.2Ghz, but this is not threaded...)
    you could have done

    Code:
    int check_digit (char c) {
    	return (c>='0') && (c<='9');
    }
    ╔╗╔══╦╗
    ║║║╔╗║║
    ║╚╣╚╝║╚╗
    ╚═╩══╩═╝

  7. #7
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by ಠ_ಠ View Post
    you could have done
    Oh sure. That saves 2 seconds (1/16).
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  8. #8
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by MK27
    Oh sure. That saves 2 seconds (1/16).
    Interesting. Are you compiling with speed optimisations turned on?
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  9. #9
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by laserlight View Post
    Interesting. Are you compiling with speed optimisations turned on?
    If he did, would the loop even exist?

  10. #10
    Registered Abuser
    Join Date
    Jun 2006
    Location
    Toronto
    Posts
    591
    Quote Originally Posted by MK27 View Post
    I tried a lookup table this way:
    And, interestingly, it clocked in with the same time as the "classic" method (32 seconds).
    Hmm, that is interesting. From a look in ctype.h, I believe mingw/GCC uses a look-up table of bitmasks, along the lines of:
    Code:
    #define BIT_MASK_ISDIGIT 0x01
    int lookup[128];
    
    int check_II (int c) {
    	return lookup[c] & BIT_MASK_ISDIGIT;
    }
    
    int main() {
    	int i,j,r;
    	double lapse;
    	time_t start, end;
    
    	for ( i='0'; i<='9'; ++i) lookup[i] |= BIT_MASK_ISDIGIT;
    	start=time(NULL);
    	for (i=0; i<25000000; i++) {
    		for (j=33; j<127; j++) 
    			//r=check_digit(j);
    			r=check_II(j);
    			//r=isdigit(j);	
    	}
    	end=time(NULL);
    I'm not at home right now, if anyone can check if this is an improvement.

  11. #11
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by laserlight View Post
    Interesting. Are you compiling with speed optimisations turned on?
    I can't find that button -- is it near the dvd burner?

    I just tried -O3 -O2 and -O and they all result in a runtime of 0 seconds. Hmmm. I never use these flags. Perhaps something is wrong with them.

    Quote Originally Posted by tabstop View Post
    If he did, would the loop even exist?
    ?????????
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  12. #12
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by MK27 View Post
    I can't find that button -- is it near the dvd burner?

    I just tried -O3 -O2 and -O and they all result in a runtime of 0 seconds. Hmmm. I never use these flags. Perhaps something is wrong with them.



    ?????????
    That's the point of optimization -- don't do things you don't have to do. gcc is smart enough to realize that there's no point in computing r -- you don't keep the value or use the value in any way, so it's gone. Once that happens, it looks at the loop, realizes that there are no statements inside the loop and eliminates the loop. If you maybe put in an accumulator variable and add r to it each time, it will actually do the loop.

  13. #13
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by tabstop View Post
    That's the point of optimization -- don't do things you don't have to do. gcc is smart enough to realize that there's no point in computing r -- you don't keep the value or use the value in any way, so it's gone. Once that happens, it looks at the loop, realizes that there are no statements inside the loop and eliminates the loop. If you maybe put in an accumulator variable and add r to it each time, it will actually do the loop.
    Ah. I had to put in r or gcc gripes about the isdigit() call being "a statement with no effect"*. So to be fair, I figured all the tests should make a real assignment.

    Hopefully that is that all you meant by "If he did, would the loop even exist?" ?-- I was suddenly thinking, "Do this without a loop...I thought I already understood C somewhat..."

    Anyway, at -O the classic method is down to 15 seconds, at -O3 7 seconds.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  14. #14
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by MK27 View Post
    Ah. I had to put in r or gcc gripes about the isdigit() call being "a statement with no effect"*. So to be fair, I figured all the tests should make a real assignment.

    Hopefully that is that all you meant by "If he did, would the loop even exist?" ?-- I was suddenly thinking, "Do this without a loop...I thought I already understood C somewhat..."

    Anyway, at -O the classic method is down to 15 seconds, at -O3 7 seconds.
    That is what I meant, yes. I'm surprised you've never come across -O before, although I have a feeling you're going to be recompiling all your code shortly....

  15. #15
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by @nthony View Post
    I'm not at home right now, if anyone can check if this is an improvement.
    At -O3 it is midway between my two lame-ass methods (6-7 seconds) and isdigit (2 seconds), which is to say 4 seconds. With no optimization: 25 seconds.

    Here's something really bizarre tho. To make use of r, I used r+=function and then output "printf("%d digits found in %d seconds\n",r,(int)lapse);", just to make sure that the correct 500000000 digits are found.

    [root~/C] ./a.out
    500000000 digits found in 7 seconds
    [root~/C] gcc -O3 test.c
    [root~/C] ./a.out
    500000000 digits found in 4 seconds
    [root~/C] gcc -O3 test.c
    [root~/C] ./a.out
    1797783552 digits found in 2 seconds
    [root~/C] gcc -O3 test.c
    [root~/C] ./a.out
    500000000 digits found in 6 seconds


    The oddball here is produced by isdigit() !!!! ????

    [edit] ah -- isdigit returns 2048, not 1 (the same for all 10 digits)...which implies.....
    Last edited by MK27; 06-10-2009 at 10:10 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. rand() implementation
    By habert79 in forum C Programming
    Replies: 4
    Last Post: 02-07-2009, 01:18 PM
  2. implementation file
    By bejiz in forum C++ Programming
    Replies: 5
    Last Post: 11-28-2005, 01:59 AM
  3. Binary Search Trees Part III
    By Prelude in forum A Brief History of Cprogramming.com
    Replies: 16
    Last Post: 10-02-2004, 03:00 PM
  4. #include cctype and the isdigit
    By nextus in forum C++ Programming
    Replies: 2
    Last Post: 12-26-2002, 07:55 PM
  5. Correct use of isdigit
    By DocDroopy in forum C Programming
    Replies: 3
    Last Post: 08-05-2002, 07:22 AM