Thread: Printing Unicode to console

  1. #1
    Registered User
    Join Date
    Jan 2008
    Posts
    65

    Code:
    #include <iostream>
    #include <string>
    
    int main() {
        std::cout << "こんにちは\n";
        std::cin.get();
    }
    I'm using Visual Studio. I tried running this program. It's supposed to print some Japanese characters, but all I see is ????. How do I get the characters to display properly?

  2. #2
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    http://msdn.microsoft.com/en-us/libr...73(VS.85).aspx
    Set an appropriate code page perhaps?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User Codeplug
    Join Date
    Mar 2003
    Posts
    4,981
    In Windows, the only way to get Unicode characters (above 0x007F) to display on the console is to use WriteConsoleW(). Your console's font also needs to contain glyphs for the Unicode characters you're attempting to print.
    Code:
    #define _CRT_SECURE_NO_WARNINGS
    #include <windows.h>
    #include <iostream>
    #include <clocale>  // setlocale
    #include <cstdlib>  // mbstowcs
    #include <cwchar>   // wcscpy, wcslen
    using namespace std;
    
    #define SOURCE_IN_CP932
    
    // If using VC++, VS 2005 was the first to work with Unicode source code
    #if !defined(SOURCE_IN_CP932) && defined(_MSC_VER) && (_MSC_VER < 1400)
    #   error "Forget it! Time to upgrade"
    #endif
    
    #if defined(SOURCE_IN_CP932) && defined(__GNUC__)
    #   error "Stick with UTF8 source with MinGW"
    #endif
    
    int main()
    {
        wchar_t wmsg[32];
        size_t len;
    
    #ifdef SOURCE_IN_CP932
        const char msg[] = "こんにちは\n"; // save source file in CP 932
        cout << "sizeof(msg) = " << sizeof(msg) << endl;
    
        // convert CP 932 -> Unicode (UTF16LE on Windows)
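        if (!setlocale(LC_CTYPE, ".932"))
        {
            cerr << "setlocale(\".932\") failed" << endl;
            return 1;
        }
        // third argument is the capacity of the destination buffer, in wide characters
        len = mbstowcs(wmsg, msg, sizeof(wmsg)/sizeof(*wmsg));
        if (len == (size_t)-1)
        {
            cerr << "mbstowcs failed: msg is not valid CP 932" << endl;
            return 1;
        }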
    #else
        // source already saved as Unicode, just copy the characters
        wcscpy(wmsg, L"こんにちは\n"); // save source file as Unicode
        len = wcslen(wmsg);
    #endif
    
        if (wmsg[0] != 0x3053)
        {
            cerr << "Bad conversion detected, first character not U+3053" << endl;
    #ifdef SOURCE_IN_CP932
            cerr << "Source code must be saved under CP 932." << endl;
    #else
            cerr << "Source code must be saved using a Unicode encoding." << endl;
    #endif
            return 1;
        }//if
    
        DWORD written;
        if (!WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), 
                           wmsg, (DWORD)len,
                           &written, 0))
            cerr << "WriteConsole failed, le = " << GetLastError() << endl;
    
        return 0;
    }//main
    gg
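
A variant of the program above, sketched on the assumption that the message is kept as UTF-8 bytes rather than CP 932, so no locale setup is needed. The helper names `utf8_to_utf16` and `write_console_utf16` are made up for illustration and are not part of any library. The decoder is deliberately minimal: it does not reject overlong forms or encoded surrogates.

```cpp
#include <cstddef>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Malformed or truncated sequences become U+FFFD.
// Overlong forms and encoded surrogates are NOT rejected in this sketch.
std::u16string utf8_to_utf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // invalid lead byte

        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;  // truncated or bad continuation byte
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp > 0x10FFFF) { out.push_back(0xFFFD); continue; }

        if (cp >= 0x10000) {
            // outside the BMP: emit a surrogate pair
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}

#ifdef _WIN32
// Hypothetical helper: hand UTF-16 text to the console via WriteConsoleW.
bool write_console_utf16(const std::u16string& text)
{
    std::wstring wide(text.begin(), text.end());  // wchar_t is 16 bits on Windows
    DWORD written = 0;
    return WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), wide.c_str(),
                         static_cast<DWORD>(wide.size()), &written, nullptr) != 0;
}
#endif
```

On Windows, `write_console_utf16(utf8_to_utf16("\xE3\x81\x93\xE3\x82\x93\n"))` would pass こん and a newline to the console. Whether glyphs appear still depends on the console font.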

  4. #4
    Registered User
    Join Date
    Apr 2007
    Posts
    137
    Quote: Originally posted by Codeplug
    In Windows, the only way to get Unicode characters (above 0x007F) to display on the console is to use WriteConsoleW().
    No, there are several ways (cf G. G.)...

  5. #5
    Registered User Codeplug
    Join Date
    Mar 2003
    Posts
    4,981
    >> No, there are several ways ...
    And yet you fail to provide any other ways...are we to just take your word for it?

    gg

  6. #6
    Registered User Codeplug
    Join Date
    Mar 2003
    Posts
    4,981
    >> (above 0x007F)
    That part of my statement is erroneous. Any visible characters in the console that didn't go through WriteConsoleW() are "code page characters" (for lack of a better term), meaning the glyph that was displayed came from a lookup into a code page table.
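
To illustrate what a code page lookup means, here is a toy sketch. The `CodePage` enum and `lookup` function are invented for this example, and only a handful of entries from CP 437 and CP 1252 are listed. The point is that the same byte maps to a different character depending on which table is active.

```cpp
// Hypothetical, deliberately tiny tables: only a few entries per code page.
enum class CodePage { CP437, CP1252 };

// Map one byte to a Unicode code point under the given code page.
// Returns U+FFFD for bytes this sketch does not cover.
char32_t lookup(CodePage cp, unsigned char byte)
{
    if (byte < 0x80)
        return byte;  // bytes below 0x80 are treated as ASCII here

    switch (cp) {
    case CodePage::CP437:
        switch (byte) {
        case 0xE1: return 0x00DF;  // ß
        case 0xE3: return 0x03C0;  // π
        case 0xE9: return 0x0398;  // Θ
        }
        break;
    case CodePage::CP1252:
        switch (byte) {
        case 0xE1: return 0x00E1;  // á
        case 0xE3: return 0x00E3;  // ã
        case 0xE9: return 0x00E9;  // é
        }
        break;
    }
    return 0xFFFD;
}
```

So the byte 0xE9 comes out as Θ under CP 437 but as é under CP 1252, which is why the console's active code page has to match the encoding of the bytes you write.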

    So if your source code is saved as Shift-JIS (code page 932), you could call SetConsoleOutputCP(932) and pass msg to WriteConsoleA(). But Unicode isn't involved at this point: 'msg' is just a char string in the multibyte encoding of CP 932. (Internally, MS uses the current CP to convert it into Unicode, then calls the W version. Most, if not all, of the MS "ANSI" APIs work this way.)
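
A rough sketch of that code page route. The helper `write_code_page_bytes` and the constant `kHelloCp932` are names made up for this example. The message is spelled out as hex escapes so the sketch does not depend on how the source file is saved. The non-Windows branch is only there so the code compiles elsewhere; it just passes the bytes through.

```cpp
#include <cstddef>
#include <cstdio>

#ifdef _WIN32
#include <windows.h>
#endif

// Shift-JIS (CP 932) bytes for こんにちは followed by '\n'.
static const char kHelloCp932[] =
    "\x82\xB1\x82\xF1\x82\xC9\x82\xBF\x82\xCD\n";

// Hypothetical helper: write raw code page bytes to the console.
bool write_code_page_bytes(const char* bytes, std::size_t count,
                           unsigned code_page)
{
#ifdef _WIN32
    // Tell the console which code page the bytes are encoded in.
    if (!SetConsoleOutputCP(code_page))
        return false;
    DWORD written = 0;
    return WriteConsoleA(GetStdHandle(STD_OUTPUT_HANDLE), bytes,
                         static_cast<DWORD>(count), &written, nullptr) != 0;
#else
    // No console code page outside Windows: pass the bytes through unchanged.
    (void)code_page;
    return std::fwrite(bytes, 1, count, stdout) == count;
#endif
}
```

A call would look like `write_code_page_bytes(kHelloCp932, sizeof(kHelloCp932) - 1, 932);`. SetConsoleOutputCP() can fail if the code page is not installed, so the return value is worth checking.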

    gg

  7. #7
    Registered User
    Join Date
    Jan 2008
    Posts
    65
    Thanks. I still see question marks because my font doesn't support these characters. Do you know of a font that does? I've tried adding several of the Japanese fonts that came with Windows to the command window, as per http://support.microsoft.com/default...b;en-us;247815

    However, they don't show up in the font list so I'm assuming they don't meet the criteria or I'm adding them to the registry incorrectly.

  8. #8
    Registered User Codeplug
    Join Date
    Mar 2003
    Posts
    4,981
    Consolas is the only other font I've gotten to work with cmd.exe.

    gg
