Thread: Here I go again...arrays....char...

  1. #31
    Sometimes so stupid... shardin's Avatar
    Join Date
    Jul 2007
    Location
    Dalmatia/CRO
    Posts
    78
    If you're not allowed to use functions from <ctype.h>, then . . . oh well. That would be ridiculous.
    Well, I'm not allowed. No, not on this exam. And no computer, we have to write it all on paper.
    ...and apprentice shall become master...or not...

    "Never let your sense of moral prevent you from doing what is right!" Salvor Hardin, mayor of Terminus

  2. #32
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    As I said, that's ridiculous. This is non-standard:
    Code:
    if(c >= 'a' && c <= 'z')
    It's unportable because not all character sets have 'b' after 'a', etc. The only time you can use a range like that is between '0' and '9', which are guaranteed to be contiguous.
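
    To illustrate the one range that is guaranteed, here is a minimal sketch of a digit conversion. The name digit_value is made up for this example; the only thing it relies on is that '0' through '9' are consecutive.

```c
/* Hypothetical helper (not from the thread): returns 0..9 for a decimal
   digit character, or -1 for anything else. It relies only on the C
   guarantee that '0'..'9' have consecutive values. */
int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}
```

    For example, digit_value('7') gives 7 on any conforming implementation, ASCII or not.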

    The portable way to do things is to use the functions in ctype.h.
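
    A rough sketch of what the ctype.h route looks like in practice. The function name count_letters is invented for illustration; isalpha() is the standard library call.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): counts the alphabetic
   characters in a string using isalpha() from <ctype.h>. */
size_t count_letters(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        /* Cast to unsigned char first: passing a negative char value
           to isalpha() is undefined behaviour. */
        if (isalpha((unsigned char)*s))
            ++n;
    }
    return n;
}
```

    Nothing here depends on where the letters sit in the character set, which is the whole point.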

    For proof, all you have to do is look around the internet. Here's one source, Wikipedia: http://en.wikipedia.org/wiki/Ctype.h
    Early toolsmiths writing in C under Unix began developing idioms at a rapid rate to classify characters into different types. For example, in the ASCII character set, the following test identifies a letter:

    Code:
    if ('A' <= c && c <= 'Z' || 'a' <= c && c <= 'z')
    However, this idiom does not work for other character sets such as EBCDIC.
    I would seriously consider lodging a complaint to try to get them to let you use ctype.h stuff. There aren't any functions in there that can't be emulated with an if statement in an unportable, ASCII-dependent manner.
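
    If the exam really does forbid ctype.h, the emulation would look something like the sketch below. The my_ names are made up for this example, and the code ASSUMES an ASCII-compatible character set, so it is exactly as unportable as described above.

```c
/* Hypothetical exam-style stand-ins for a few <ctype.h> functions.
   ASSUMPTION: the letters 'A'..'Z' and 'a'..'z' are contiguous, which
   holds for ASCII but is not guaranteed by the C standard. */
int my_isupper(int c) { return c >= 'A' && c <= 'Z'; }
int my_islower(int c) { return c >= 'a' && c <= 'z'; }
int my_isalpha(int c) { return my_isupper(c) || my_islower(c); }

/* Converts a lower-case letter to upper case; other values pass through. */
int my_toupper(int c) { return my_islower(c) ? c - 'a' + 'A' : c; }
```

    Note that the two letter ranges are tested separately, so the punctuation that sits between 'Z' and 'a' in ASCII is not mistaken for a letter.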
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  3. #33
    Sometimes so stupid... shardin's Avatar
    Join Date
    Jul 2007
    Location
    Dalmatia/CRO
    Posts
    78
    But it works fine with "c>=65 && c<=122"; those are the letters from A (65) to z (122). That is acceptable.

  4. #34
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    No. Absolutely not. Did you read the article I linked to? It's completely unacceptable if you want your program to be portable. Sure, it works for ASCII. But it won't work for EBCDIC. It might not work for other character sets. It's simply not portable.

    You can't rely on 'a'...'z' and 'A'...'Z' being contiguous.

  5. #35
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by dwks
    No. Absolutely not. Did you read the article I linked to? It's completely unacceptable if you want your program to be portable. Sure, it works for ASCII. But it won't work for EBCDIC. It might not work for other character sets. It's simply not portable.

    You can't rely on 'a'...'z' and 'A'...'Z' being contiguous.
    And of course, we all have a few EBCDIC machines sitting around in our homes :-)

    Are AS/400 machines using EBCDIC, or are they using ASCII/UNICODE?

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  6. #36
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Well I think the real point is not that we all might have some EBCDIC keyboards at home. It's more about not having to punch in some slightly mysterious-at-first-glance number code to talk about letters. Being able to refer to them reliably despite their order according to charmap.exe is simply a pleasant bonus that is always rewarded. Character constants make things easiest for us, the people.

  7. #37
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by citizen
    Well I think the real point is not that we all might have some EBCDIC keyboards at home. It's more about not having to punch in some slightly mysterious-at-first-glance number code to talk about letters. Being able to refer to them reliably despite their order according to charmap.exe is simply a pleasant bonus that is always rewarded. Character constants make things easiest for us, the people.
    Oh, sure, I agree that the above code should be (at least):
    Code:
    if (x >= 'A' && x <= 'z')
    Better yet:
    Code:
    if (isalpha(x))
    as the latter doesn't include the non-letters that sit between 'Z' and 'a' in ASCII: [ \ ] ^ _ and `.
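
    A quick way to see the difference between the two tests, as a sketch: count the values in a range that isalpha() rejects. The function name count_non_letters is invented here, and the result quoted below assumes an ASCII machine in the default "C" locale.

```c
#include <ctype.h>

/* Hypothetical helper (not from the thread): counts the character codes
   in [lo, hi] that isalpha() does not classify as letters.
   ASSUMPTION: lo and hi are non-negative and fit in an unsigned char. */
int count_non_letters(int lo, int hi)
{
    int n = 0;
    for (int c = lo; c <= hi; ++c) {
        if (!isalpha(c))
            ++n;
    }
    return n;
}
```

    On such a machine, count_non_letters('A', 'z') comes out as 6, one for each of the characters with codes 91 through 96.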

    But the fact is that 99.9% of all machines around today use ASCII for the English part of the character set - letters outside that range aren't quite so easy.

    The same applies to the constant refrain of "that won't work if the machine uses 1's complement for negative numbers". I've worked with computers for over 20 years and have yet to see a machine that uses 1's complement. Yes, fine, machines have been produced that use different numerical formats, and if you wish to write code that is really portable, do so.

    There's portable, and there is portable. Assuming that a machine runs with an ASCII character set (or one closely enough related that the differences fall outside the English characters) is reasonable in all but very rare circumstances.

    If you want portable code, the original code isn't internationalized either, so it wouldn't work in Swedish, for example, since ÅÄÖ (and their lower-case versions åäö, of course) are considered vowels in Swedish. Not sure if ü in German counts too. Certainly, French accented letters such as é and è, or other decorated letters such as ë, would probably also count as vowels in their languages. So complaining about ONE part of a piece of code not being portable, when the rest of it is also utterly unportable, is pretty pointless.

