Thread: how is it possible to write/read in language other than english in a console app?

  1. #1
    بابلی ریکا Masterx's Avatar
    Join Date
    Nov 2007
    Location
    Somewhere nearby,Who Cares?
    Posts
    497

    how is it possible to write/read in language other than english in a console app?

    Hello, sorry to bother you again. I've heard something about the Unicode standard being implemented in the C++ standard!

    I've seen console applications that use languages other than English for representing data.
    How is that possible? Is there a class or header file that makes this easier in console projects, or do I have to do something about it myself?
    If I have to do it myself, what do I need to do to be able to use a custom language in an app?
    Thanks

  2. #2
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    You just need to use Unicode, with wchar_t instead of char.
    Also note that every wide string literal has a capital L in front of it. Example: L"Hello World".
    The wide types also have their own equivalents of the C library functions. For example, wcscpy instead of strcpy.
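
A minimal sketch of the points above: a wchar_t buffer, an L-prefixed literal, and the wide counterparts of strcpy and strlen. The function name copy_wide_greeting is made up for illustration; only wcscpy and wcslen come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>   // wcscpy, wcslen, wcscmp

// Copies a wide string literal into a caller-supplied buffer and returns
// its length in wchar_t units (not bytes).
// copy_wide_greeting is a hypothetical helper, not a library function.
std::size_t copy_wide_greeting(wchar_t* dest, std::size_t capacity)
{
    const wchar_t* greeting = L"Hello World";   // note the L prefix
    assert(std::wcslen(greeting) < capacity);   // leave room for the terminator
    std::wcscpy(dest, greeting);                // wide version of strcpy
    return std::wcslen(dest);                   // wide version of strlen
}
```

Calling it with a buffer such as wchar_t buf[32] fills buf with the wide text; the size of each element still depends on the compiler.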

    Visual Studio also provides its own "TCHAR" routines. TCHAR is a typedef for either char or wchar_t, depending on the project settings.
    Again, TCHAR has its own functions, such as _tcscpy instead of strcpy or wcscpy.
    All TCHAR string literals are enclosed in _T(). Example: _T("Hello World").
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  3. #3
    بابلی ریکا Masterx's Avatar
    Join Date
    Nov 2007
    Location
    Somewhere nearby,Who Cares?
    Posts
    497
    Thanks Elysia,
    so simply replacing the type char with wchar_t is all it takes?
    Is it implemented in the iostream header?

  4. #4
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    wchar_t is a built-in type, so you don't need to include anything.
    TCHAR typically resides in tchar.h.

  5. #5
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    If you're looking at doing real-time multilingualism on the command line, I'd look into a package called gettext. It was originally a Unix program but has been ported to Windows, and while I'm not familiar with its use, I have seen it in action, and it appears to work very well.

  6. #6
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Quote Originally Posted by Masterx View Post
    Thanks Elysia,
    so simply replacing the type char with wchar_t is all it takes?
    Is it implemented in the iostream header?
    Yes, but if you're using wchar_t strings instead of char strings, you need to use the 'w' version of everything.
    Ex. std::wstring, std::wcout, std::wcin, std::wfstream...
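
A small sketch of the 'w' classes in use. It writes to a std::wostringstream rather than std::wcout so the result can be inspected, but the same << calls work on std::wcout. The function name make_wide_message is made up for illustration. No extra defines or typedefs are needed: std::wstring comes from <string>, std::wcout and std::wcin from <iostream>, and std::wfstream from <fstream>.

```cpp
#include <cassert>
#include <sstream>   // std::wostringstream
#include <string>    // std::wstring

// Builds a wide string using the wide stream classes.
// make_wide_message is a hypothetical helper for this example only.
std::wstring make_wide_message(const std::wstring& name, int count)
{
    assert(!name.empty());                // the example expects a real name
    std::wostringstream out;              // wide counterpart of std::ostringstream
    out << L"Hello, " << name << L" (" << count << L")";
    return out.str();
}
```

Whether such a string then shows up correctly on a console is a separate question that depends on the platform and its runtime library.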
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010

  7. #7
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    The current C++03 standard library does not support Unicode output. wcout and wfstream both perform a wchar_t-to-char conversion before output. This behavior can be changed using a custom codecvt facet. However, Unicode output still doesn't work on Windows with the MS CRT. On Windows, the only way to get UTF-16 Unicode output on the console is to use WriteConsoleW(). So you could create a custom wstreambuf which uses WriteConsoleW() as well.
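
A rough sketch of what such a custom wstreambuf could look like, assuming a C++11 compiler. The class name SinkWBuf, the sink callback, and write_to_console are all made up for illustration; only std::wstreambuf, GetStdHandle() and WriteConsoleW() are real. The buffer is unbuffered and simply forwards every write to the sink, so on Windows the sink can be the WriteConsoleW() wrapper, and elsewhere it can be anything else.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>
#ifdef _WIN32
#include <windows.h>
#endif

// Wide stream buffer that forwards every write to a caller-supplied sink.
// SinkWBuf and its Sink type are hypothetical names, not part of any library.
class SinkWBuf : public std::wstreambuf {
public:
    using Sink = std::function<void(const wchar_t*, std::size_t)>;

    explicit SinkWBuf(Sink sink) : sink_(std::move(sink)) {
        assert(sink_);   // an empty std::function cannot be called
    }

protected:
    // Called for single characters, because no put area is ever set.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            const wchar_t c = traits_type::to_char_type(ch);
            sink_(&c, 1);
        }
        return traits_type::not_eof(ch);
    }

    // Called for whole runs of characters, e.g. when streaming a string.
    std::streamsize xsputn(const wchar_t* s, std::streamsize n) override {
        if (n > 0) sink_(s, static_cast<std::size_t>(n));
        return n;
    }

private:
    Sink sink_;
};

#ifdef _WIN32
// Windows-only sink: hands the UTF-16 text straight to the console.
inline void write_to_console(const wchar_t* s, std::size_t n) {
    DWORD written = 0;
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), s,
                  static_cast<DWORD>(n), &written, nullptr);
}
#endif
```

On Windows the idea would be to construct SinkWBuf buf(write_to_console); and std::wostream out(&buf); and then stream wide strings into out instead of std::wcout.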

    gg

  8. #8
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    In addition, wchar_t is a horribly muddled type that only brings portability problems. At the same time, char strings are a mess of encodings. Writing international software in C++ invariably involves stuff beyond the standard library.

    C++0x will bring a minor improvement in this regard.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  9. #9
    بابلی ریکا Masterx's Avatar
    Join Date
    Nov 2007
    Location
    Somewhere nearby,Who Cares?
    Posts
    497
    Thanks all. I was about to ask why this piece of code does not work!
    Code:
    #include <iostream>
    #include <tchar.h>
    #include <windows.h>
    //#include <string>

    using namespace std;

    int main()
    {
        char buf[200];
        wchar_t s[] = L"چشم"; // a string of 3 non-English characters meaning "eye"
        char c[] = "چشم";     // equivalent to wchar_t s[] = L"eye"; and char c[] = "eye";

        // print the narrow string one char at a time
        for (int i = 0; c[i] != '\0'; i++)
        {
            cout << c[i];
        }

        // convert the wide string to the OEM code page and print the result
        CharToOemW(s, buf);
        cout << endl << buf << endl;
        return 0;
    }
    So I'm a little confused, you know!
    First, let me tell you that I'm planning to write cross-platform programs, which is why I need to stick to standard C/C++. I would be very thankful if you could contribute solutions that serve that purpose and are not limited to specific platforms. Thanks in advance. (By the way, I already tried the Windows-specific CharToOemW() just to see how it works, but unfortunately even that didn't work!)

    Second:
    >> the only way to get UTF16 console Unicode output is to use WriteConsoleW(). So you could create a custom wstreambuf which uses WriteConsoleW() as well.
    If I am going to implement such a thing on Windows, how am I supposed to do that? I mean, how should I use WriteConsoleW()? (Is there any sample code showing how it is used?)

    And:
    >> Yes, but if you're using wchar_t strings instead of char strings, you need to use the 'w' version of everything.
    >> Ex. std::wstring, std::wcout, std::wcin, std::wfstream...
    Doesn't this need anything added to the program, such as headers or some kind of define or typedef?
    (I'm asking because at the moment I can't test the code to see whether it works. Sorry for that!)

    >> If you're looking at doing real-time multilingualism on the command line, I'd look into a package called gettext. it is originally a unix program, but has been ported to windows, and while I'm not familiar with its use, I have seen it in action, and it appears to work very well.
    Isn't the program already supposed to be multilingual in real time after going through these steps? What kind of multilingual program would my code be after using these instructions?

    And what is that? A third-party app I have to use to get this feature under Windows? Is it the only way? If so, I wonder what Unicode support has actually been implemented in the C/C++ standard so far.

    Finally, I've heard that if I change the default font of the console I may be able to get multilingual output. Is that true? If so, how do I do that?

    Many thanks to you all. I really appreciate your valuable comments and solutions.
    Last edited by Masterx; 10-09-2008 at 10:55 AM.

  10. #10
    بابلی ریکا Masterx's Avatar
    Join Date
    Nov 2007
    Location
    Somewhere nearby,Who Cares?
    Posts
    497
    Edited!
    >> Yes, but if you're using wchar_t strings instead of char strings, you need to use the 'w' version of everything.
    >> Ex. std::wstring, std::wcout, std::wcin, std::wfstream...
    This didn't work. Did I miss anything?
    Code:
    #include <iostream>
    #include <tchar.h>
    #include <windows.h>
    //#include <string>

    using namespace std;
    using std::wcout;

    int main()
    {
        char buf[200];
        wchar_t s[] = L"چشم"; // a string of 3 non-English characters meaning "eye"
        char c[] = "چشم";     // equivalent to wchar_t s[] = L"eye"; and char c[] = "eye";

        // print the narrow string one char at a time, now through wcout
        for (int i = 0; c[i] != '\0'; i++)
        {
            wcout << c[i];
        }

        // convert the wide string to the OEM code page and print the result
        CharToOemW(s, buf);
        cout << endl << buf << endl;
        return 0;
    }
    Last edited by Masterx; 10-09-2008 at 10:49 AM.

  11. #11
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> how am i supposed to do that ? i mean how should i use this WriteConsoleW()
    The online reference for Windows APIs and SDKs is MSDN: http://msdn.microsoft.com/en-us/library/ms687401.aspx You use GetStdHandle() to get the first parameter for WriteConsoleW().
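
A minimal sketch of that GetStdHandle() plus WriteConsoleW() call, assuming a compiler that defines _WIN32 when targeting Windows. The function name write_console_utf16 and the constant kOnWindows are made up for illustration. The #ifdef also shows one way to pick the platform at compile time; on anything other than Windows this sketch just reports failure, since WriteConsoleW() exists only in the Windows API.

```cpp
#include <cassert>
#include <string>
#ifdef _WIN32
#include <windows.h>
const bool kOnWindows = true;    // hypothetical flag, set at compile time
#else
const bool kOnWindows = false;
#endif

// Writes UTF-16 text to the console window.
// Returns false if nothing could be written to a console.
// write_console_utf16 is a hypothetical helper for this example only.
bool write_console_utf16(const std::wstring& text)
{
    assert(text.size() <= 0xFFFFFFFFu);   // the length must fit in a DWORD
#ifdef _WIN32
    // GetStdHandle() supplies the first parameter of WriteConsoleW().
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (out == INVALID_HANDLE_VALUE || out == NULL)
        return false;

    DWORD written = 0;
    // This call fails when stdout is redirected to a file or pipe,
    // because the handle is then not a console handle.
    return WriteConsoleW(out, text.c_str(),
                         static_cast<DWORD>(text.size()),
                         &written, NULL) != 0;
#else
    (void)text;
    return false;   // no WriteConsoleW() outside Windows
#endif
}
```

A real program would put some other output path in the #else branch, for example converting the text to the encoding the terminal expects.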

    >> ive heard that if i can change the default font of console i may be able to gain multilingualism! ist it true ? and if so how to do that !
    The font used by the console (cmd.exe) needs to support Unicode. "Raster Fonts" does not support Unicode, "Lucida Console" does. http://commandwindows.com/configure.htm

    >> didnt work!? did i miss any thing ? !!
    As I mentioned in my previous post, wcout and wfstream both perform a wchar_t to char conversion before output.

    Here are some things you need to be aware of if you want to write Unicode-aware code and make it cross-platform.
    Unicode
    Part of the Unicode standard is to provide a mapping of characters to unique integer values (code points). At present the largest code point is U+10FFFF, which needs 21 bits, so in practice you need at least a 32-bit integer to hold any Unicode character value.
    When writing source code to deal with Unicode, you need to be aware of how the Unicode characters are encoded. There are several encoding schemes, including UTF-8, UTF-16LE, UTF-16BE, UTF-32LE and UTF-32BE. LE and BE stand for little-endian and big-endian.
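
To make the difference between a code point and its encoding concrete, here is a small sketch that turns one code point into its UTF-8 bytes, assuming a C++11 compiler for char32_t. The function name encode_utf8 is made up for illustration.

```cpp
#include <cassert>
#include <string>

// Encodes one Unicode code point (U+0000..U+10FFFF, surrogates excluded)
// as a UTF-8 byte sequence. encode_utf8 is a hypothetical helper.
std::string encode_utf8(char32_t cp)
{
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx then three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, the first letter of "چشم" is U+0686, which this encodes as the two bytes 0xDA 0x86, while the same code point is a single 16-bit unit in UTF-16.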

    wchar_t
    This is the character type used by the standard for wide characters. The standard doesn't require that all implementations make wchar_t a particular size, which also means that the encoding of wide string literals (which map to arrays of wchar_t) is implementation-defined.

    Windows compilers use 16bit wchar_t's and encode wide string literals using UTF16LE.

    GCC on *nix uses 32bit wchar_t's and encodes wide string literals using UTF32. The architecture determines if it's LE or BE.
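
A quick way to see which of the two cases a given compiler falls into, assuming C++11. The function names are made up for illustration. A character outside the Basic Multilingual Plane takes two wchar_t units where wide literals are UTF-16 and one unit where they are UTF-32.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cwchar>

// Number of bits in wchar_t for the platform being compiled for.
// (Hypothetical helper names throughout.)
constexpr std::size_t wchar_bits() { return sizeof(wchar_t) * CHAR_BIT; }

// True when a single wchar_t can hold any code point up to U+10FFFF,
// which needs 21 bits.
constexpr bool wchar_holds_any_code_point() { return wchar_bits() >= 21; }

// Number of wchar_t units the compiler used for U+1F600, a character
// outside the Basic Multilingual Plane.
inline std::size_t units_for_non_bmp()
{
    const std::size_t n = std::wcslen(L"\U0001F600");
    assert(n == 1 || n == 2);   // one UTF-32 unit or a UTF-16 surrogate pair
    return n;
}
```

Code that indexes into a wide string and assumes one element per character is only safe in the first case.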

    >> wchar_t s[] = L"چشم";
    There are a few problems with doing something like this. First, you need to make sure your compiler even supports the compilation of source files saved as Unicode. Most modern ones do. VC 6.0 does not. Next you have to make sure your editor saves the file in some Unicode format (that the compiler supports).
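
One way to sidestep the source-file encoding question entirely is to spell the characters with universal character names, so the file itself stays plain ASCII. A small sketch, with kEye and check_eye_literal as made-up names:

```cpp
#include <cassert>
#include <cwchar>

// The word "چشم" written as universal character names:
// U+0686 (tcheh), U+0634 (sheen), U+0645 (meem).
// kEye is a hypothetical name for this example.
const wchar_t kEye[] = L"\u0686\u0634\u0645";

// Sanity check that the compiler produced the three expected code points.
inline void check_eye_literal()
{
    assert(std::wcslen(kEye) == 3);
    assert(kEye[0] == 0x0686);
    assert(kEye[1] == 0x0634);
    assert(kEye[2] == 0x0645);
}
```

This only fixes what ends up in the array. Getting the console to display it is still the separate output problem discussed above.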

    >> char c[] = "چشم";
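    This may not compile on every compiler. If it does, then the editor saved the file in some 8-bit or multi-byte encoding (a legacy code page or UTF-8), and the byte values within the c array are only meaningful to something that expects that same encoding.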

    So as you can see, there are many things to consider if you truly hope to write cross-platform code that handles Unicode encodings.

    gg

  12. #12
    بابلی ریکا Masterx's Avatar
    Join Date
    Nov 2007
    Location
    Somewhere nearby,Who Cares?
    Posts
    497
    Thanks a million, Codeplug. I really don't know how to thank you. Great stuff, I really appreciate it.
    About the compiler: I use the Code::Blocks IDE with the MinGW GNU GCC compiler.
    I don't know whether it supports this feature or not. I'll have to ask.

    So in the end, I should be able to run the app once I get my compiler sorted out. OK.

    I'll let you know if I run into any problems.
    Many, many thanks again.

  13. #13
    بابلی ریکا Masterx's Avatar
    Join Date
    Nov 2007
    Location
    Somewhere nearby,Who Cares?
    Posts
    497
    Hello, I'm back again.
    I gave it a try, and on GCC I kept getting an error stating "wcout is not a member of std" for a simple use of wchar_t. So first of all, is it a GCC problem that it doesn't support this?
    Second, it seems I have to do something like this: implement a function in my source code that lets me switch between Unix and Windows operating systems.
    At compile time this function should find out whether it is being built for Windows or not. If it is Windows, voilà: I can stick to WriteConsoleW() and get the job done there. But I still have no solution for Linux (because GCC seems to have a problem there).
    How do I do this?

  14. #14
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    This is a library issue, not a gcc issue. wcout doesn't exist on my MinGW port, but does on my Mac.

  15. #15
    chococoder
    Join Date
    Nov 2004
    Posts
    515
    Never had a problem in producing Dutch console output, wide characters required.
    The same is true for Italian, French, Spanish, German, etc., if you're willing to put up with losing a few special characters (which has long been accepted, way back to the days of typewriters).
