Thread: Is it possible to get Traditional Chinese chars from the user?

  1. #1
    Registered User
    Join Date
    Apr 2009
    Posts
    10

    Hi,

    1. Is it possible to get Traditional Chinese characters from the user (input from the keyboard)?

    2. Are numbers in Unicode stored in an unsigned data type?

    3. Is there any difference in performing math calculations (add, subtract, etc.) on
    Unicode values? Assume Unicode uses an unsigned data type.

    If there is a difference, what should I do?

    4. How can I check whether I have a Chinese Unicode font on my computer?


    I am using Windows XP Professional and Borland C++ 6.0 Professional edition.

    Thanks
    Henry

  2. #2
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    I think the answer to your question is yes, but don't ask me for details...

    I do not think the signedness of the data type is very relevant. Keep in mind that a number is a number and the concept of "math using unicode" is a non-issue.

  3. #3
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    Quote Originally Posted by hen_ry
    1. Is it possible to get Traditional Chinese characters from the user (input from the keyboard)?
    Yes, assuming the user has the means of entering them.

    Quote Originally Posted by hen_ry
    2. Are numbers in Unicode stored in an unsigned data type?
    Yes. There are no negative Unicode code points, and the highest code point is 0x10FFFF, so a signed 32-bit integer has no trouble holding one. If you're dealing with UTF-16 code units, however, those are 16-bit unsigned. Likewise, UTF-8 code units are 8-bit unsigned.

    Quote Originally Posted by hen_ry
    3. Is there any difference in performing math calculations (add, subtract, etc.) on Unicode values? Assume Unicode uses an unsigned data type.
    Add? Subtract? These are characters. Add/subtract operations will change the numerical value of your code point/unit, but that may or may not make sense. Some characters (the digits, for example) are laid out in sequence, but you'll need to know that whatever character you have is in a particular range, and that whatever add/subtract you do makes sense.
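
    To make the "know your range first" point concrete, here is a minimal sketch. The function name digit_value is made up for illustration, and it only handles the ASCII and fullwidth digit blocks. It checks the range before subtracting, and it refuses the Chinese numerals, whose code points are not laid out in sequence.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 0-9 if cp is an ASCII digit (U+0030..U+0039)
 * or a fullwidth digit (U+FF10..U+FF19), and -1 for every other code point.
 * Subtraction is only meaningful after the range check has passed.
 */
int digit_value(uint32_t cp)
{
    if (cp >= 0x0030 && cp <= 0x0039)   /* '0'..'9' */
        return (int)(cp - 0x0030);
    if (cp >= 0xFF10 && cp <= 0xFF19)   /* fullwidth digits */
        return (int)(cp - 0xFF10);
    /*
     * The Chinese numerals for one, two, three are U+4E00, U+4E8C, U+4E09:
     * not consecutive, so arithmetic on them would give nonsense.
     */
    return -1;
}
```

    So digit_value(0xFF13) gives 3, while digit_value(0x4E8C) gives -1 even though that character means "two". For those you would need a lookup table, not arithmetic.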

    Quote Originally Posted by hen_ry
    4. How can I check whether I have a Chinese Unicode font on my computer?
    This is most certainly OS dependent. There is a font called "Arial Unicode MS" that covers a very large part of Unicode, though as far as I know it ships with Microsoft Office rather than with Windows itself, and no single font has all the characters.
    Again, this depends heavily on OS, programming API, etc. The Win32 API call GetFontUnicodeRanges looks like it could tell you. FreeType likely has a similar function. Other APIs probably too. (Tell us what API you're working with, and someone might be able to better help you.)
    long time; /* know C? */
    Unprecedented performance: Nothing ever ran this slow before.
    Any sufficiently advanced bug is indistinguishable from a feature.
    Real Programmers confuse Halloween and Christmas, because dec 25 == oct 31.
    The best way to accelerate an IBM is at 9.8 m/s/s.
    recursion (re - cur' - zhun) n. 1. (see recursion)

  4. #4
    Registered User
    Join Date
    Apr 2009
    Posts
    10
    Hi,

    Thank you for your help.

    Henry

  5. #5
    Registered User
    Join Date
    Apr 2009
    Posts
    10
    Hi Cactus_Hugger,

    Thanks for your help.

    Is it possible to write Unicode in C?

    If so, which header file do I need to use in Borland C++ 6.0?

    Thanks

    Henry

  6. #6
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    Quote Originally Posted by hen_ry
    Is it possible to write Unicode in C?
    Absolutely.
    Quote Originally Posted by hen_ry
    If so, which header file do I need to use in Borland C++ 6.0?
    This is a harder question, with no real answer.

    Unicode, by itself, is not a character encoding (although the term Unicode is often used to mean 'UTF-16'). Unicode assigns one number, called a code point, to each character. Characters are often represented as U+[hex representation of number]. For example, U+0041 (0x41) is the Latin letter 'A'; U+2823 is a Braille pattern: ⠣ (you'll need a good browser and font here); U+221E is the infinity sign: ∞. Now that we have a single number representing each character, we have to encode them in memory/files/network data, etc. There are three common encodings for Unicode:

    UTF-32 - Encode each code point as-is, in a 32-bit integer. The largest unicode code point is 0x10FFFF, which will not fit in a 16-bit integer, but will in a 32-bit. UTF-32 is fixed width, meaning that each character takes the same number of bytes (4). Wikipedia page on UTF-32....

    UTF-16 - Encode each code point in one or two 16-bit integers. UTF-16 is not a fixed width encoding - that is, a single character may take up 16 bits or 32 bits in UTF-16. (A single 16-bit chunk is called a 'code unit'.) When a character needs two 16-bit code units, that pair of code units is called a surrogate pair. Wikipedia page on UTF-16....

    UTF-8 - Encode each code point in one to four 8-bit integers. UTF-8 has the advantage that any ASCII string is also a valid UTF-8 string. However, UTF-8 strings, due to their variable width characters, are harder to work with in memory if you need to change characters or splice strings. Many XML files, HTML files, etc. are in UTF-8. Wikipedia page on UTF-8....
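
    The two variable-width encodings above can be sketched in a few lines of C. This is only an illustration under my own naming (utf8_encode and utf16_encode are not from any library), and it rejects surrogate values and anything above 0x10FFFF. UTF-32 needs no function, since the code point is stored as-is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if cp is a code point that can be encoded at all. */
static int is_encodable(uint32_t cp)
{
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

/*
 * Hypothetical helper: writes cp as UTF-8 into out (room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp cannot be encoded.
 */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (!is_encodable(cp))
        return 0;
    if (cp < 0x80) {                       /* ASCII: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/*
 * Hypothetical helper: writes cp as UTF-16 into out (room for 2 units).
 * Returns the number of 16-bit code units written, or 0 on error.
 */
size_t utf16_encode(uint32_t cp, uint16_t *out)
{
    assert(out != NULL);
    if (!is_encodable(cp))
        return 0;
    if (cp < 0x10000) {                    /* fits in a single code unit */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                         /* 20 bits left, split 10/10 */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

    For the infinity sign U+221E, utf8_encode produces the three bytes E2 88 9E and utf16_encode produces the single unit 0x221E. A code point above U+FFFF, such as U+20000, comes out as four UTF-8 bytes and as the surrogate pair D840 DC00.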

    So, now, back to your original question - which header file? None, really, though it still depends. If your data is UTF-8, you'll probably be storing it as a char *. If it's UTF-16, wchar_t (if it is 16-bit), or something close. Most of it is knowing your data: is this particular char * ASCII? UTF-8? Some other esoteric character encoding? There's a lot of information involving Unicode - go do some research. (I've only scratched the surface here.)
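
    As a small illustration of why you have to know what a char * holds, here is a sketch that counts characters in a UTF-8 string. The name utf8_count_code_points is made up, and the function assumes the input is already valid UTF-8 (it does no validation).

```c
#include <stddef.h>

/*
 * Hypothetical helper: counts code points in a NUL-terminated string that
 * is ASSUMED to be valid UTF-8. Continuation bytes look like 10xxxxxx, so
 * every byte that does not match that pattern starts a new code point.
 */
size_t utf8_count_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

    For the two-character string "\xE4\xB8\xAD\xE6\x96\x87" (U+4E2D U+6587) this returns 2, whereas strlen on the same data reports 6, because strlen counts bytes. For a pure ASCII string the two numbers agree.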

    If you're using Windows, there is more information that you'll need to know. Google around. There's a good Wikibook on the subject too.

  7. #7
    Registered User
    Join Date
    Apr 2009
    Posts
    10
    Hi Cactus_Hugger ,

    Thanks for your info.

    Henry

  8. #8
    Registered User
    Join Date
    Apr 2009
    Location
    Russia
    Posts
    116
    Quote Originally Posted by hen_ry
    1. Is it possible to get Traditional Chinese characters from the user (input from the keyboard)?
    2. Are numbers in Unicode stored in an unsigned data type?
    You can read any Unicode character.
    The standard has the setlocale function,
    and wchar.h with the wide-character functions from the 1995 amendment to ISO C:
    getchar => getwchar
    putchar => putwchar
    and the wchar_t type for characters, in stddef.h.

    Under Windows you can have some problems with character output (but that is a Windows-only problem);
    you should set the console up for Chinese output. Under Linux I got the wrong Chinese characters even though I have three different encodings installed; I think that is because I don't have all of the encodings.
    Also, I declared the prototype directly even though wchar.h already contains it (gcc reported an "undeclared" error).
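
    A minimal sketch of the setlocale plus wide-function approach described above. The names use_user_locale and copy_wide_chars are made up for this example. It uses fgetwc and fputwc, the stream-taking forms of getwchar and putwchar, so the same loop works on stdin or on a file. Whether Chinese input actually decodes correctly still depends on the locale and console setup.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/*
 * Hypothetical helper: selects the user's locale from the environment so
 * the wide-character functions know how to decode the input bytes.
 * Returns nonzero on success.
 */
int use_user_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/*
 * Hypothetical helper: copies wide characters from in to out until end of
 * input or an error, and returns how many characters were copied.
 * fgetwc(stdin) is what getwchar() does; fputwc(c, stdout) is putwchar(c).
 */
size_t copy_wide_chars(FILE *in, FILE *out)
{
    size_t count = 0;
    wint_t wc;

    while ((wc = fgetwc(in)) != WEOF) {
        if (fputwc((wchar_t)wc, out) == WEOF)
            break;
        count++;
    }
    return count;
}
```

    In a real program you would call use_user_locale() once at startup and then copy_wide_chars(stdin, stdout). Each wc read this way is one wchar_t, not one byte, so a Chinese character typed by the user arrives as a single value (as long as it fits in wchar_t on your platform).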
