Here I go again...arrays....char...

**shardin** · 09-26-2007

If you're not allowed to use functions from <ctype.h>, then . . . oh well. That would be ridiculous.

Well, I'm not allowed. No, not on this exam. And no computer, we have to write it all on paper.

**dwks** · 09-26-2007

As I said, that's ridiculous. This is non-standard:

Code:

if(c >= 'a' && c <= 'z')

It's unportable because not all character sets have 'b' after 'a', etc. The only time you can use a range like that is between '0' and '9', which are guaranteed to be contiguous.
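As a small illustration of the one range that is guaranteed, here is a minimal sketch. The helper name `digit_value` is made up for this example and is not from the thread; it relies only on '0' through '9' being consecutive.

```c
/* Hypothetical helper: returns 0-9 for a decimal digit character,
   or -1 for anything else. It depends only on the guarantee that
   '0'..'9' are consecutive and increasing in every C character set. */
int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}
```

The same subtraction trick applied to letters (`c - 'a'`) would carry the ASCII assumption discussed here.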

The portable way to do things is to use the functions in ctype.h.
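A rough sketch of what that might look like, assuming you want to count the letters in a string. `count_letters` is a hypothetical name chosen for illustration; only `isalpha` comes from ctype.h.

```c
#include <ctype.h>

/* Hypothetical helper: counts the alphabetic characters in a string. */
int count_letters(const char *s)
{
    int n = 0;
    for (; *s != '\0'; s++) {
        /* Cast to unsigned char first: passing a negative char value
           to isalpha() is undefined behaviour. */
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

What counts as a letter here follows the current locale, which is the default "C" locale unless the program changes it.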

For proof, all you have to do is look around the internet. Here's one source, Wikipedia: http://en.wikipedia.org/wiki/Ctype.h

Early toolsmiths writing in C under Unix began developing idioms at a rapid rate to classify characters into different types. For example, in the ASCII character set, the following test identifies a letter:

Code:

if ('A' <= c && c <= 'Z' || 'a' <= c && c <= 'z')

However, this idiom does not work for other character sets such as EBCDIC.

I would seriously consider lodging a complaint to try to get them to let you use ctype.h stuff. There aren't any functions in there that can't be emulated with an if statement in an unportable, ASCII-dependent manner.
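If ctype.h really is off the table, one possible alternative (not something proposed in this thread, just a sketch) is to look the character up in a literal alphabet instead of testing a range. That avoids the contiguity assumption, though it still only knows about the 26 unaccented English letters. `my_isalpha` is a hypothetical name, and strchr comes from string.h, which is assumed to be allowed.

```c
#include <string.h>

/* Hypothetical helper: a letter test that needs neither <ctype.h>
   nor a contiguous alphabet. Covers only the unaccented English letters. */
int my_isalpha(char c)
{
    static const char letters[] =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /* strchr() also matches the terminating '\0', so reject it first. */
    return c != '\0' && strchr(letters, c) != NULL;
}
```

On paper this is longer to write than a range test, but it makes no assumption about how the letters are ordered in the character set.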

**shardin** · 09-26-2007

But it works fine with "c>=65 && c<=122"; that covers the letters from 'A' (65) to 'z' (122). That is acceptable.

**dwks** · 09-26-2007

No. Absolutely not. Did you read the article I linked to? It's completely unacceptable if you want your program to be portable. Sure, it works for ASCII. But it won't work for EBCDIC. It might not work for other character sets. It's simply not portable.

You can't rely on 'a'...'z' and 'A'...'Z' being contiguous.

**matsp** · 09-26-2007

And of course, we all have a few EBCDIC machines sitting around in our homes :-)

Are AS/400 machines using EBCDIC, or are they using ASCII/UNICODE?

--
Mats

**whiteflags** · 09-26-2007

Well I think the real point is not that we all might have some EBCDIC keyboards at home. It's more about not having to punch in some slightly mysterious-at-first-glance number code to talk about letters. Being able to refer to them reliably despite their order according to charmap.exe is simply a pleasant bonus that is always rewarded. Character constants make things easiest for us, the people.

**matsp** · 09-26-2007

Oh, sure, I agree that the above code should be (at least):

Code:

if (x >= 'A' && x <= 'z')

Better yet:

Code:

if (isalpha(x))

as the latter doesn't include [, \, ], ^, _ and `.
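To see the difference between the two tests, here is a small sketch that counts how many characters in the range 'A' to 'z' are rejected by isalpha. The function name `count_non_letters_in_range` is made up for this example, and the result depends on the character set.

```c
#include <ctype.h>

/* Hypothetical helper: counts the characters c with 'A' <= c <= 'z'
   that isalpha() does not consider letters. On an ASCII system in the
   default "C" locale these are the six characters between 'Z' and 'a'. */
int count_non_letters_in_range(void)
{
    int c, n = 0;
    for (c = 'A'; c <= 'z'; c++) {
        if (!isalpha(c))
            n++;
    }
    return n;
}
```

On a non-ASCII character set the count may differ, which is the portability point being argued above.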

But the fact is that 99.9% of all machines around today use ASCII for the English part of the character set; letters outside that range aren't quite so easy.

The same applies to the constant refrain "that won't work if the machine uses 1's complement for negative numbers". I've worked with computers for over 20 years and have yet to see a machine that uses 1's complement. Yes, fine, machines have been produced that use different numerical formats, and if you wish to write code that is really portable, do so.

There's portable, and there is portable. Assuming that a machine runs with an ASCII character set (or one closely enough related that the differences appear only outside the English characters) is reasonable in all but very rare circumstances.

If you want portable code, the original code isn't internationalized either, so it wouldn't work in Swedish, for example, since ÅÄÖ (and the lower-case åäö, of course) are considered vowels in Swedish. Not sure if ü in German counts too. Certainly, French accented letters such as é and è, or other decorated letters such as ë, would probably also count as vowels in their languages. So complaining about ONE part of a piece of code not being portable, when the rest of it is also utterly un-portable, is pretty pointless.

--
Mats
