Thread: probably a really simple question, how to insert £?

  1. #16
    Yes, my avatar is stolen anonytmouse's Avatar
    Join Date
    Dec 2002
    Posts
    2,544
    The problem is with the sign-extension and is easy to fix.
    1. Consider the char value -10.
    2. This is "11110110" in two's-complement binary.
    3. Now we implicitly convert this to a signed int by passing it to isprint().
    4. The value is sign-extended and (assuming a 32-bit int) we now have "11111111 11111111 11111111 11110110" in binary.
    5. The implementation casts this value to an unsigned int for comparison purposes.
    6. We now have the decimal value 4294967286. Obviously, this will cause problems if it is used as an index into a 256-entry or 64K-entry lookup table.

    This is demonstrated with this very simple program.
    Code:
    #include <stdio.h>
    
    void demo(int c)
    {
    	printf("int_value: %d\n", c);
    	printf("uint_value: %u\n", (unsigned int) c); /* cast so the argument matches %u */
    }
    
    int main(void)
    {
    	char c = -10;
    
    	demo(c);
    	demo((unsigned char) c);
    
    	getchar();
    	return 0;
    }
    The solution is to avoid the sign-extension by passing the value as an unsigned char.
    Code:
    isprint((unsigned char) 'ö')
    However, it still doesn't return true. This is because the ctype functions use the "C" locale by default, and in this locale only the basic character set (essentially ASCII) is classified as printable. We can switch locale with the setlocale function, or we can use Windows functions, which provide much more reliable international character support.
    Code:
    #include <stdio.h>
    #include <ctype.h>
    #include <windows.h>
    #include <locale.h>
    
    int win_isprint(UCHAR c)
    {
    	WORD char_type;
    
    	if (GetStringTypeA(LOCALE_USER_DEFAULT, CT_CTYPE1, (LPCSTR) &c, 1, &char_type))
    		return !(char_type & C1_CNTRL);
    	else
    		return 0;
    }
    
    
    int main(void)
    {
    	char c = -10;
    	
    	setlocale( LC_ALL, ".ACP" ); /* Uses the windows specified code page. */
    
    	printf("CLib: %d\n", isprint((unsigned char) '\b'));
    	printf("CLib: %d\n", isprint((unsigned char) 'a'));
    	printf("CLib: %d\n", isprint((unsigned char) 'ö'));
    	printf("Win:  %d\n", win_isprint('ö'));
    	printf("Win:  %d\n", win_isprint('\b'));
    
    	getchar();
    	return 0;
    }
    Results:
    Code:
    CLib: 0
    CLib: 258
    CLib: 258
    Win:  1
    Win:  0
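    The cast can be wrapped up so it is never forgotten. This is only a minimal sketch: is_printable is a hypothetical helper name, not something from the standard library or from the program above, and its result still depends on the current locale.

```cpp
#include <cctype>

// Hypothetical helper (an assumption for illustration, not a library call).
// Converting to unsigned char first keeps the argument in the range
// 0..UCHAR_MAX, which is what the <cctype> functions require; a negative
// char would otherwise be sign-extended when promoted to int.
bool is_printable(char c)
{
    return std::isprint(static_cast<unsigned char>(c)) != 0;
}
```

    Calling is_printable('a') gives true and is_printable('\b') gives false. For characters outside ASCII, such as 'ö', the answer depends on the locale selected with setlocale.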
    Last edited by anonytmouse; 01-05-2005 at 04:45 PM.

  2. #17
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    I know all of this. It's still counterintuitive. Especially as the same code works in C.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  3. #18
    Registered User
    Join Date
    Jan 2005
    Posts
    1
    I do not see what the big deal is:
    cout<<char(156);
    works fine for me in Dev-C++ and VC++.

  4. #19
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    "I do not see what the big deal is"
    The big deal is the "for me" part of:
    "works fine for me"

  5. #20
    Self-Taught Noob
    Join Date
    Jan 2005
    Location
    Ohio
    Posts
    38
    Just curious after reading through this thread: why couldn't you just go to the Character Map, copy the symbol, and paste it into your code like this?

    Code:
    cout << "£\n";
    No one died when Clinton lied.

    Compiler: Borland C++ Builder
    OS: Windows XP

  6. #21
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Because of this (simplified explanation): the compiler has a target character encoding. The literal strings inside the executable (including your "£\n") will be in this encoding. But the computer knows no such thing as a character in memory, it's all numbers - displaying a character just means using the number as an index into a table of drawing instructions. This is what an encoding does: it specifies which number to use for which character. This number will then get stored in the executable.

    There might be another encoding used on the target system. The same number stored in the executable could suddenly be a different character.

    This is what is so unreliable about the whole thing.


    Windows complicates matters: there is more than one encoding on a single platform. While the compiler will most likely use Windows' native Windows-1252 encoding (or a variant, depending on where you live) as its target, the characters written to the console will be interpreted using the IBM OEM-xxx encoding, where xxx is your code page number. (437 is American English, 850 is Western European, which includes German, ...)
    Try it yourself. Take that code snippet, compile it as a console app and run. It won't be a pound sign printed to the console.
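    To make the mismatch concrete, here is a small sketch of how the pound sign and the byte 0xA3 map in the code pages mentioned above. The function names are made up for illustration, and the table covers only these three code pages.

```cpp
// Illustrative lookup (hypothetical helper, not a Windows API):
// the byte that encodes the pound sign in a few single-byte code pages.
// Returns -1 for code pages this sketch does not know about.
int pound_byte(int code_page)
{
    switch (code_page) {
    case 1252: return 0xA3;  // Windows-1252 (Western European "ANSI")
    case 437:  return 0x9C;  // OEM United States
    case 850:  return 0x9C;  // OEM Multilingual Latin 1
    default:   return -1;    // not covered here
    }
}

// The Unicode code point that byte 0xA3 stands for in each code page
// (0 if the code page is not covered here).
unsigned glyph_for_0xA3(int code_page)
{
    switch (code_page) {
    case 1252: return 0x00A3;  // POUND SIGN
    case 437:
    case 850:  return 0x00FA;  // LATIN SMALL LETTER U WITH ACUTE
    default:   return 0;
    }
}
```

    So a "£" compiled as Windows-1252 is stored as 0xA3, which a console using code page 437 or 850 draws as "ú". It also explains why cout<<char(156) appears to work on such a console: 156 is 0x9C, the pound sign in those OEM code pages, but not in Windows-1252.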

    The problem is that you can't detect what character set is being used using standard C++. Your best bet, therefore, is to use wide-character strings. Using this code:
    wcout << L"£\n";
    should work. (Provided that you have a proper implementation of your standard streams. I'm not sure if there is such a thing.)
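    A slightly more defensive variant, as a sketch only: writing the character as the universal character name \u00A3 removes the dependence on the source file's own encoding. pound_line is a hypothetical helper name. Whether wcout then displays the character correctly still depends on the implementation, the locale, and the console.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration).
// \u00A3 is the universal character name for the pound sign, so the
// literal has the same value however the source file itself is saved.
std::wstring pound_line()
{
    return L"\u00A3\n";
}
```

    It would be used as std::wcout << pound_line(); after including <iostream>. On many implementations a locale has to be selected first, for example with std::locale::global(std::locale("")), before wide output of non-ASCII characters works.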
