Thread: wprintf wchar, always get "?"

  1. #1
    Registered User
    Join Date
    Mar 2012
    Posts
    9

    wprintf wchar, always get "?"

    Code:
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <locale.h>
    #include <wchar.h>
    
    int main(int argc, char** argv)
    {
        /* print parameters */
        wchar_t wc=0xbbbb;
        //setlocale(LC_CTYPE, "en_US.utf8");
        //setlocale(LC_CTYPE, "");
        setlocale(LC_ALL,"");
        wprintf(L"%c\n",wc);
        return 0;
    }
    I run this code in a command prompt and always get "?" for any value bigger than 0x7f. Even though the command prompt's code page may not match the default one used by the code (I don't know for sure what it is), I expect to see at least some character other than "?". Did I use the wrong print function?

    thx,

  2. #2
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Why do you want to use wchar types? If it is for output, I think you are better off using C99 universal character names (the \u notation) in an ordinary string:

    Code:
    	char s[] = "\ubbbb";
    	puts(s);
    The use of wchar_t by the standard is very problematic, qv:

    Wide character - Wikipedia, the free encyclopedia
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  3. #3
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    It might help to pick a valid unicode character.

    Then read up on how the wprintf format specifiers differ for narrow and wide characters and strings (%c and %s versus %lc and %ls).
    Eg.
    Code:
    $ cat bar.c
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <locale.h>
    #include <wchar.h>
    
    int main()
    {
        /* print parameters */
        wchar_t wc=0x2190;
        //setlocale(LC_CTYPE, "en_US.utf8");
        //setlocale(LC_CTYPE, "");
        setlocale(LC_ALL,"");
        wprintf(L"%lc\n",wc);
        return 0;
    }
    $ gcc -Wall -Wextra -std=c99 bar.c
    $ ./a.out 
    ←
    You also need a reasonably up to date OS/Compiler as well.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Salem View Post
    It might help to pick a valid unicode character.
    0xbbbb is a valid unicode character. 뮻

    Unicode/UTF-8-character table - starting from code position BB00

  5. #5

  6. #6
    Registered User
    Join Date
    Mar 2012
    Posts
    9
    Quote Originally Posted by Salem View Post
    It might help to pick a valid unicode character.

    Thanks a lot for the input. I realized that the value 0x2190 is taken as a Unicode code point. wprintf translates it into its UTF-8 bytes, E2 86 90, and sends them to stdout. However, it looks like my Linux console's code page is not UTF-8, so instead of an arrow it prints
    â

    How do I correct this, i.e. force the console to use UTF-8 encoding?

  7. #7
    Registered User
    Join Date
    Mar 2012
    Posts
    9
    Quote Originally Posted by Salem View Post
    It might help to pick a valid unicode character.

    Thanks a lot for your input; it let me see a dim light at the end of a long, dark tunnel. However, in my test on a Linux box, I got the output
    â instead of an arrow.

    I checked the UTF-8 code table and figured out why. First, it looks like the program interprets 0x2190 as a code point rather than as UTF-8 bytes. Then the program writes the UTF-8 encoding of that code point, E2 86 90, to stdout. However, the Linux console interprets each byte as a separate code point, so I got â, whose code point is E2. The code points 86 and 90 are not displayable.

    Ah, I'm a bit confused. Why did I get a different result? Is something wrong with the console configuration?

  8. #8
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    locale(1): locale-specific info - Linux man page
    What do you get when you type "locale" on the command line?
    What OS/distribution are you using?

    gg
