Thread: Extended character set and the console

  1. #1
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446

    Extended character set and the console

    I'm in the early stages of a Rogue/MUD mix type of game for the console. I'm trying to save the '\xA9' symbol into a char and have it displayed. It's the copyright symbol, which I'm going to use to represent any item of type container.

    However, that code ends up producing another symbol. I can't even find it in the ASCII table or the ISO-8859-1 (Windows-1252) character set. The symbol shown, which I can't reproduce here, is the mirror image of the decimal 170 symbol found in the extended ASCII table.

    I can display the copyright symbol in my text editors and the C++ IDE. No worries there. I just can't seem to find a way to have it displayed after building my app.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  2. #2
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    AFAIK, the copyright symbol is #169. Seems to work.
    [edit] Should probably write the C++ version though.
    Code:
    Owner@PAVILION ~
    $ cat copy.c
    #include <iostream>
    
    int main(void) {
       std::cout.put(169);
       return 0;
    }
    Owner@PAVILION ~
    $ ./copy
    ©
    Last edited by whiteflags; 07-22-2006 at 11:19 AM.

  3. #3
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    hmm... must be something with my console then.

    I think I found what is wrong here:
    http://www.evergreen.edu/biophysics/...cii_ext-pc.htm

    The symbol I'm getting displayed is the one to the left of the copyright symbol (decimal 169) under the DOS column.

    So it seems my console is displaying the DOS character set instead of the Windows one. The question now is why, and how I change it.

  4. #4
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,613
    I see. Well, good news and bad news. Running your program in any other shell would probably make it work (I ran the code on msys, which doesn't use the DOS character set). The Windows console seems to still use the DOS character set.

    The information from the link is correct: in a DOS console, if you type <Alt> 0169, c appears, which is probably just the terminal's best approximation of the copyright symbol. The fix I recommend is just typing (c) in place of ©, unless you want to port your console app to Unicode, which would definitely work.
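
    For reference, here is a minimal sketch of a third route on Windows: asking the console to switch its output code page, and falling back to (c) when the active code page has no copyright sign. SetConsoleOutputCP and GetConsoleOutputCP are real Win32 calls from <windows.h>. The helper names copyright_byte and print_copyright are made up for illustration, and the table only covers three code pages. It also assumes the console font actually contains the glyph, which may not hold for raster fonts.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: the single byte that encodes the copyright sign in a
// given Windows code page, or 0 if this sketch knows of no such byte.
unsigned char copyright_byte(unsigned code_page) {
    switch (code_page) {
        case 1252: return 0xA9;  // Windows-1252
        case 850:  return 0xB8;  // DOS Latin-1
        case 437:  return 0;     // original IBM PC set: no copyright sign
        default:   return 0;     // not covered here; caller falls back to "(c)"
    }
}

// Hypothetical helper: write the copyright sign if the active console code
// page has one, otherwise write the ASCII fallback "(c)".
// Returns true only if the real sign was written.
bool print_copyright() {
    unsigned cp = 0;  // stays 0 (unknown) on non-Windows builds
#ifdef _WIN32
    SetConsoleOutputCP(1252);    // request Windows-1252; the call may fail,
    cp = GetConsoleOutputCP();   // so re-read what is actually active
#endif
    unsigned char b = copyright_byte(cp);
    if (b != 0) {
        std::putchar(b);
        return true;
    }
    std::fputs("(c)", stdout);
    return false;
}
```

    Calling print_copyright() from main should write either the sign or (c). On a non-Windows build this sketch does not try to detect the terminal encoding at all and always takes the (c) branch.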

  5. #5
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Ah! Unicode...

    Guess any time is a good time to start coding with Unicode.

    Thanks citizen.

  6. #6
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    You've just learned a valuable lesson: standard C++ does not specify the program character set/encoding: not that of the source code, not that of string literals, not that of incoming input. It could be ISO-8859-1, ISO-8859-15, Windows-1252, or even MacRoman, EBCDIC, or UTF-8. It might change without your program being recompiled. (The standard does not specify a character encoding conversion for string input, and consoles on Linux may have different character encodings as a configuration option.)
    Basically, you need to find out the character set at runtime (no standard way to do that) or just convert everything to wchar_t and hope it's what you expect it to be. Or assume UTF-8 and convert everything incoming. Locales are your friend here.
    But never ever make assumptions about which numeric value corresponds to which character. Don't even assume that one char means one character.
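
    To make the last point concrete, here is a small sketch of UTF-8 encoding, assuming a C++11 compiler for char32_t. The function name to_utf8 is made up for illustration. It shows that the copyright sign, code point U+00A9, is a single byte 0xA9 in ISO-8859-1 but two chars (0xC2 0xA9) in UTF-8. It is not a validating encoder: surrogates and values above U+10FFFF are not rejected.

```cpp
#include <string>

// Encode one Unicode code point as a UTF-8 byte sequence.
// Sketch only: no validation of surrogates or out-of-range values.
std::string to_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 7-bit ASCII: one byte, value unchanged
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // two bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

    So to_utf8(0xA9).size() is 2, while to_utf8('c').size() is 1. Any code that counts chars to count characters breaks as soon as the text leaves ASCII.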
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law
