Thread: Data types in Unicode

  1. #1
    the Corvetter
    Join Date
    Sep 2001
    Posts
    1,584

    Data types in Unicode

    I'm working through Charles Petzold's Programming Windows and I'm on chapter 2: Unicode. I guess I'm not really understanding the data types of Unicode. I understand CHAR is a normal char and WCHAR is a wide char (16 bits), and then there is a whole slew of string types: some with WCHARs, others with CHARs, and then long pointers, near pointers, const pointers. That is what is confusing. Which string pointer am I expected to use?

    And my other question: are chars (CHAR and WCHAR) and strings the only data types that are different? For instance, are ints and longs still ints and longs? Only chars and strings are different, right? Thanks.

    --Garfield
    1978 Silver Anniversary Corvette

  2. #2
    of Zen Hall zen's Avatar
    Join Date
    Aug 2001
    Posts
    1,007
    If you want your app to compile for both Unicode and non-Unicode environments, you should use LPCTSTR in place of const char* and TCHAR in place of plain chars and char[] arrays.

    All these are only typedefs, underneath they are normal C data types.

    Chars and strings are the only things that Unicode affects, but there are plenty of other typedefs that the Win32 API uses to mean different things.
    Last edited by zen; 10-27-2001 at 10:17 AM.
    zen

  3. #3
    the Corvetter
    Join Date
    Sep 2001
    Posts
    1,584
    Thanks, zen, but what would I use if I don't want a const character? What if I want to manipulate the string and change it? Then it can't be const, right?

    So, TCHAR is compatible? Wasn't there a PSTR to be a pointer to chars? Thanks.

    --Garfield
    1978 Silver Anniversary Corvette

  4. #4
    of Zen Hall zen's Avatar
    Join Date
    Aug 2001
    Posts
    1,007
    If you want to modify a string, you should use an array of TCHARs rather than an LPTSTR to store the string. But you could use an LPTSTR to point to or manipulate an existing string.

    PSTR and LPSTR (both are char*) are pointers to non-Unicode (8 bits per char) strings.
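
A minimal sketch of the array-versus-pointer point above. The typedefs are stand-ins for the ANSI-build definitions in <windows.h> so that it compiles anywhere, and the function names are made up for illustration. In real Win32 code you would include <windows.h> and wrap the literal in TEXT().

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the <windows.h> typedefs as they look in an ANSI build. */
typedef char         TCHAR;
typedef TCHAR       *LPTSTR;   /* pointer to a writable string  */
typedef const TCHAR *LPCTSTR;  /* pointer to a read-only string */

/* Modifies an existing writable string through an LPTSTR. */
static void upper_in_place(LPTSTR s)
{
    for (; *s != '\0'; ++s)
        *s = (TCHAR)toupper((unsigned char)*s);
}

/* Only reads the string, so an LPCTSTR is enough. */
static size_t count_char(LPCTSTR s, TCHAR c)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}

/* Copies a literal into the caller's TCHAR array, then edits the array.
   The literal itself is never written to. */
static void make_greeting(TCHAR *buf, size_t cap)
{
    LPCTSTR src = "hello";
    if (cap == 0)
        return;
    strncpy(buf, src, cap - 1);
    buf[cap - 1] = '\0';
    upper_in_place(buf);
}
```

The storage is the TCHAR array owned by the caller. The LPTSTR and LPCTSTR parameters only point into storage that already exists.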
    zen

  5. #5
    the Corvetter
    Join Date
    Sep 2001
    Posts
    1,584
    Do you suggest using Unicode? I guess I don't have much of a choice because Programming Windows teaches in unicode. Honestly, I'm not really sure which code is in unicode. Is it just data types? Thanks.

    --Garfield
    1978 Silver Anniversary Corvette

  6. #6
    of Zen Hall zen's Avatar
    Join Date
    Aug 2001
    Posts
    1,007
    Yes, if you want to program for Windows: from Win2k onwards the OSes use Unicode natively, which means that if you use ANSI, all the characters/strings have to be converted to Unicode internally, which will slow things down.
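
A rough sketch of where that conversion cost comes from. Nothing here is the real Win32 implementation: set_title_a, set_title_w and g_title are hypothetical names, and standard mbstowcs stands in for whatever conversion the OS actually performs. The idea is only that the narrow entry point has to widen its argument before handing it to the wide one.

```c
#include <stdlib.h>
#include <wchar.h>

#define TITLE_CAP 64

/* Pretend "native" storage: always wide, like the OS internals. */
static wchar_t g_title[TITLE_CAP];

/* Hypothetical wide entry point: stores the string directly. */
static int set_title_w(const wchar_t *title)
{
    size_t len = wcslen(title);
    if (len >= TITLE_CAP)
        return 0;                      /* does not fit */
    wmemcpy(g_title, title, len + 1);  /* +1 copies the terminator */
    return 1;
}

/* Hypothetical narrow entry point: converts first, then forwards.
   The extra conversion and temporary buffer are the overhead. */
static int set_title_a(const char *title)
{
    wchar_t wide[TITLE_CAP];
    size_t n = mbstowcs(wide, title, TITLE_CAP);
    if (n == (size_t)-1 || n >= TITLE_CAP)
        return 0;                      /* invalid input or too long */
    return set_title_w(wide);
}
```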

    You don't have to worry about which data is in Unicode or not (unless you're reading external files), just use the suggested character/string types and the TEXT/_T macros on literals and if you want a unicode program define _UNICODE (plus UNICODE without the underscore, which is the one the Windows headers check; _UNICODE is for the C runtime's tchar.h). Everything will be the correct type (functions and strings) depending on whether they are defined.
    zen

  7. #7
    the Corvetter
    Join Date
    Sep 2001
    Posts
    1,584
    So, you're saying that if I don't do this:

    #define _UNICODE

    then the program will run slower because it then has to convert to unicode? So, I should define UNICODE in every source? And implement it? Thanks.

    --Garfield
    1978 Silver Anniversary Corvette

  8. #8
    the Corvetter
    Join Date
    Sep 2001
    Posts
    1,584
    > and if you want a unicode program define _UNICODE

    I guess I don't understand what you mean by this. Do I have to define _UNICODE to use it? I thought that I was using the TEXT macro and CHAR and TCHAR and PSTR without it, wasn't I?

    Thanks.

    --Garfield
    1978 Silver Anniversary Corvette

  9. #9
    of Zen Hall zen's Avatar
    Join Date
    Aug 2001
    Posts
    1,007
    > So, you're saying that if I don't do this:
    >
    > #define _UNICODE
    >
    > then the program will run slower because it then has to convert to unicode?

    Yes, on Win2k and above. However, a Unicode build probably won't work on 9x, as it has little support for Unicode.

    > Do I have to define _UNICODE to use it?

    Yes. The macros are there to make it easy to compile your code as ANSI or Unicode (to switch between them without having to go through and change all the types manually). If _UNICODE is defined, all the functions that accept chars/strings and all the chars/strings themselves magically become the wide variety. If it's not defined then they are the ANSI variety.

    In the header files there are more complex versions of this -

    #ifdef _UNICODE
    typedef wchar_t TCHAR;
    #else
    typedef char TCHAR;
    #endif
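
To show the same switch covering literals and functions as well as the character type, here is a portable sketch. Every MY_ and my_ name is made up for illustration and only mimics what TCHAR, TEXT and the tchar.h string functions do. The real headers are more involved.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* One macro flips the character type, the literal prefix and the
   string function together. MY_UNICODE is a hypothetical switch. */
#ifdef MY_UNICODE
typedef wchar_t MY_TCHAR;
#define MY_TEXT(s) L##s      /* "Hello" becomes L"Hello" */
#define my_tcslen  wcslen
#else
typedef char MY_TCHAR;
#define MY_TEXT(s) s         /* literal left as it is */
#define my_tcslen  strlen
#endif

/* The same source line is correct in both builds. */
static size_t title_length(void)
{
    const MY_TCHAR *title = MY_TEXT("Hello");
    return my_tcslen(title);
}

/* Storage needed for the title, including the terminator. The
   character count is the same in both builds, the byte count is not. */
static size_t title_bytes(void)
{
    return (title_length() + 1) * sizeof(MY_TCHAR);
}
```

Compiling with -DMY_UNICODE (or the equivalent compiler option) switches the whole file to the wide variety without touching the code.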
    zen

  10. #10
    the Corvetter
    Join Date
    Sep 2001
    Posts
    1,584
    I understand. So, I will always use TCHAR? Thanks.

    --Garfield
    1978 Silver Anniversary Corvette

  11. #11
    Banned Troll_King's Avatar
    Join Date
    Oct 2001
    Posts
    1,784
    C#'s char type is 16-bit unicode.

  12. #12
    the Corvetter
    Join Date
    Sep 2001
    Posts
    1,584
    > C#'s char type is 16-bit unicode.

    I know, but what I'm saying is that whether or not _UNICODE is defined, TCHAR will always be the proper data type, whether it is defined as char or wchar_t.

    Thanks.

    --Garfield
    1978 Silver Anniversary Corvette

  13. #13
    Banned Troll_King's Avatar
    Join Date
    Oct 2001
    Posts
    1,784
    In C# you just use char, and it is always unicode on all systems.
