Thread: A faster ASCII table

  1. #1
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545

    A faster ASCII table

    I just came up with a faster version of ASCII on the subway this morning...
    Since string parsing usually does a lot of case changes or case-insensitive comparisons, wouldn't it have been better if lowercase and capital letters had exactly the same bits except for something like the highest bit? That way if you want to convert between upper & lower case, all you need to do is flip a bit instead of having an if statement check if the char value is > x && < y...
    I know it probably wouldn't make a huge performance impact, but every little bit helps.
    Now I just need a DeLorean so I can go back in time and change the ASCII table...
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010

  2. #2
    and the Hat of Guessing tabstop
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by cpjust
    I just came up with a faster version of ASCII on the subway this morning...
    Since string parsing usually does a lot of case changes or case-insensitive comparisons, wouldn't it have been better if lowercase and capital letters had exactly the same bits except for something like the highest bit? That way if you want to convert between upper & lower case, all you need to do is flip a bit instead of having an if statement check if the char value is > x && < y...
    I know it probably wouldn't make a huge performance impact, but every little bit helps.
    Now I just need a DeLorean so I can go back in time and change the ASCII table...
    Is this some sort of obscure joke? (Upper and lowercase letters differ by exactly one bit in ASCII.)
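    For anyone who wants to check the one-bit claim, here is a minimal sketch. It assumes the compiler's execution character set is ASCII, and the names ASCII_CASE_BIT, case_difference and check_case_bit are made up for illustration:

```c
#include <assert.h>

/* In ASCII, 'A'..'Z' are 0x41..0x5A and 'a'..'z' are 0x61..0x7A,
   so each pair differs only in bit 5 (value 0x20). */
enum { ASCII_CASE_BIT = 0x20 };

/* Hypothetical helper: XOR of two codes, i.e. the bits in which they differ. */
unsigned char case_difference(unsigned char upper, unsigned char lower)
{
    return (unsigned char)(upper ^ lower);
}

/* Hypothetical self-check over all 26 letter pairs.
   Assumes the execution character set is ASCII. */
void check_case_bit(void)
{
    for (int i = 0; i < 26; ++i) {
        unsigned char upper = (unsigned char)('A' + i);
        unsigned char lower = (unsigned char)('a' + i);
        assert(case_difference(upper, lower) == ASCII_CASE_BIT);
    }
}
```

    Calling check_case_bit() from main() should pass silently on an ASCII system; on another character set the assertion may well fire.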

  3. #3
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Code:
    unsigned char FlipCase( unsigned char ch )
    {
        return ch ^ 0x20;
    }
    Already works.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  4. #4
    Registered User jdragyn
    Join Date
    Sep 2009
    Posts
    96
    Apparently you already went back in time.
    C+/- programmer extraordinaire

  5. #5
    Woof, woof! zacs7
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    Perhaps he means... "even faster."

  6. #6
    (?<!re)tired Mario F.
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    I actually never noticed they differed by one bit. Cool!
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  7. #7
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Mario F.
    I actually never noticed they differed by one bit. Cool!
    Me neither. I just thought it was neat that they differ by exactly 32, which is the value of the SPACE!!! Is that really 0x20 in hex????!?

    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  8. #8
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    Shrug...

    Actually, ASCII has been tremendously well thought out IMHO. Not just that, but it has all kinds of patterns when you look closely. Did you know, for instance, that for the digits 0-9, the least significant 4 bits represent the value? And, yes, that matching upper and lower case letters differ by only one bit.

    I wish Unicode were like that, though, where you could easily determine if a character has an upper/lower case variant and convert it by flipping one bit. The latter is probably usually possible, but the former, AFAIK, requires lookup tables.
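    The digit pattern can be sketched like this, assuming an ASCII execution character set. The function names digit_value, digit_char and check_digits are hypothetical:

```c
#include <assert.h>

/* ASCII digits '0'..'9' are 0x30..0x39: the high nibble is 3,
   the low nibble is the numeric value. */
int digit_value(unsigned char digit)
{
    return digit & 0x0F;
}

/* Inverse mapping; only meaningful for values 0..9. */
unsigned char digit_char(int value)
{
    return (unsigned char)(0x30 | value);
}

/* Round-trips every digit through both helpers. */
void check_digits(void)
{
    for (int v = 0; v <= 9; ++v) {
        assert(digit_value(digit_char(v)) == v);
        assert(digit_char(v) == '0' + v);
    }
}
```

    Note that digit_value() does not validate its input: the mask gives a number for any character, so a caller would still need to check that it really was handed a digit.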

  9. #9
    Jack of many languages Dino
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    EBCDIC does this too. The X'40' bit (aka, the blank).
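    A rough sketch of the same trick with EBCDIC code points. The codes are written as numbers so this compiles on an ASCII machine, and the names EBCDIC_CASE_BIT, ebcdic_flip_case and check_ebcdic are made up for illustration:

```c
#include <assert.h>

/* EBCDIC letter codes used here: 'a' = 0x81, 'A' = 0xC1,
   'z' = 0xA9, 'Z' = 0xE9. Each pair differs only in the 0x40 bit. */
enum { EBCDIC_CASE_BIT = 0x40 };

/* Flips the case bit of an EBCDIC code; only valid for letter codes. */
unsigned char ebcdic_flip_case(unsigned char code)
{
    return (unsigned char)(code ^ EBCDIC_CASE_BIT);
}

/* Spot-checks the first and last letters of the alphabet. */
void check_ebcdic(void)
{
    assert(ebcdic_flip_case(0x81) == 0xC1); /* a -> A */
    assert(ebcdic_flip_case(0xE9) == 0xA9); /* Z -> z */
}
```

    One difference from ASCII: in EBCDIC the bit is set for uppercase and clear for lowercase, the other way around.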
    Mainframe assembler programmer by trade. C coder when I can.

  10. #10
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by brewbuck
    Code:
    unsigned char FlipCase( unsigned char ch )
    {
        return ch ^ 0x20;
    }
    Already works.
    Sure, but what if the character isn't a letter? If it's a number or punctuation, that won't work.
    I was thinking of something like 0 - 127 being the same as 128 - 255 except for upper case letters. That way you won't have to look at whether the char is a letter or something else, you just flip the highest bit.
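    For comparison, a guarded version for plain ASCII might look like the sketch below. It still needs one range test, so it is not the check-free flip described above; FlipCaseIfLetter and check_flip are hypothetical names, and an ASCII execution character set is assumed:

```c
#include <assert.h>

/* Flips case only for ASCII letters; everything else is returned unchanged.
   FlipCaseIfLetter is a hypothetical name, not from the thread. */
unsigned char FlipCaseIfLetter(unsigned char ch)
{
    /* Setting bit 0x20 maps 'A'..'Z' onto 'a'..'z', so one range test
       covers both cases. */
    unsigned char folded = (unsigned char)(ch | 0x20);
    if (folded >= 'a' && folded <= 'z')
        return (unsigned char)(ch ^ 0x20);
    return ch;
}

/* Shows letters being flipped and non-letters being left alone. */
void check_flip(void)
{
    assert(FlipCaseIfLetter('a') == 'A');
    assert(FlipCaseIfLetter('Z') == 'z');
    assert(FlipCaseIfLetter('1') == '1'); /* unguarded XOR would give 0x11 */
    assert(FlipCaseIfLetter('[') == '['); /* unguarded XOR would give '{' */
}
```

    The OR folds both letter ranges together first, so the check is a single comparison pair rather than two.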
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010

  11. #11
    Registered User
    Join Date
    Jan 2010
    Posts
    412
    Quote Originally Posted by cpjust
    Sure, but what if the character isn't a letter? If it's a number or punctuation, that won't work.
    I was thinking of something like 0 - 127 being the same as 128 - 255 except for upper case letters. That way you won't have to look at whether the char is a letter or something else, you just flip the highest bit.
    And with this change you've just reduced the ASCII table from 256 characters to 154.

  12. #12
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by _Mike
    And with this change you've just reduced the ASCII table from 256 characters to 154.
    The ASCII set was only 128 characters to begin with.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  13. #13
    Registered User
    Join Date
    Jan 2010
    Posts
    412
    Quote Originally Posted by brewbuck
    The ASCII set was only 128 characters to begin with.
    Yes, but I assumed he was referring to the 8-bit extended table. His suggestion would not even fit in a 7-bit table.

  14. #14
    Registered User
    Join Date
    Jan 2009
    Posts
    1,485
    That is a neat function brewbuck, I'll use that.

  15. #15
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by _Mike
    And with this change you've just reduced the ASCII table from 256 characters to 154.
    Where did the 154 come from? 256 / 2 = 128.

    Quote Originally Posted by _Mike
    Yes, but I assumed he was referring to the 8-bit extended table. His suggestion would not even fit in a 7-bit table.
    I was talking about an 8-bit byte, of which ASCII only defines 7 bits. You've already got a free bit to play with, so why not make use of it?
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010
