Thread: <string> to LPCSTR? Also, character encoding: UNICODE vs ?

  1. #1
    Registered User
    Join Date
    Aug 2006
    Posts
    74

    Okay this is a two part question the first being a direct programming question and the second being a more vague programming question. Feel free to answer one; both; or none.
    -----------------------------------------------------------------------
    Question 1: I have always used char * when I need strings, but everyone seems to really like <string>, so I am giving it a try. However, when a function needs an LPCSTR, passing in a std::string object gives a compile-time error, and trying to cast won't work, which I can understand.

    Code:
    LoadLibrary(gameName);
    Code:
    error C2664: 'LoadLibraryA' : cannot convert parameter 1 from 'std::string' to 'LPCSTR'
    Any way to get the above to work, or should I just go back to char*?
    -------------------------------------------------------------------------
    Second question:

    I am a bit confused on this issue of character encoding. I never really looked into it before so I didn't know the difference between ANSI & Unicode. However, I recently decided to do a GOOGLE to learn what these are all about.

    By default Microsoft Visual Studio has unicode character encoding set and this caused my code not to compile as I had learned the language:

    i.e.
    Code:
     windowClassEx.lpszClassName = "Main";
    Would result in a compile time error of:
    Code:
    error C2440: '=' : cannot convert from 'const char [4]' to 'LPCWSTR'
    I was told that I needed to use the TEXT() macro to get it to work, or that a simple setting adjustment of not using Unicode would correct the issue. Of course, not being partial to change, I chose to change the character set away from UNICODE. However, I accidentally stumbled across this site in my attempt to understand character sets:

    http://www.csc.calpoly.edu/~bfriesen/software/builds.html

    It seems to imply that ANSI was meant for early versions of windows (95/98/ME) while UNICODE will actually perform faster on a newer OS (NT/2000/XP) than an ANSI program. This accurate?

    If so, am I hurting my programming practices by not using UNICODE?

    Right now I'm not sure what I'm using. I have the Character set option to "Not set", but if I set it to "Use Multi-Byte Character Set" my program will still compile as is. Uh, is Multi-Byte the same as UNICODE cause I would think ANSI is just stored as one byte (256 variations).. no? Err, I'm kinda confused on this.

    Should I have it set to UNICODE and be using the TEXT macro? Just wondering, because UNICODE is the default setting in Microsoft Visual Studio 2005 and I'm assuming it's that way for a reason. Also, if I choose UNICODE, will my program run under Windows 95/98/ME? What other advantages would UNICODE have? I understand it is designed to handle worldwide languages/characters, but what good would that do for my program?

  2. #2
    erstwhile
    Join Date
    Jan 2002
    Posts
    2,227
    cannot convert parameter 1 from 'std::string' to 'LPCSTR'
    Use the std::string::c_str member function eg.,
    Code:
    LoadLibrary(gameName.c_str());
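To illustrate the c_str() fix in isolation: the sketch below passes a std::string to a function that takes const char*, which is what LPCSTR is a typedef for. takes_c_string and pass_std_string are made-up names standing in for LoadLibraryA and the calling code, so that the example builds without <windows.h>.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for an API such as LoadLibraryA, which takes a
// NUL-terminated narrow string (LPCSTR is a typedef for const char*).
inline std::size_t takes_c_string(const char* name) {
    assert(name != nullptr);
    return std::strlen(name);
}

// Hypothetical caller holding the name in a std::string.
inline std::size_t pass_std_string(const std::string& gameName) {
    // No cast is needed: c_str() already yields a const char*.
    // The pointer is only valid while gameName is alive and unmodified.
    return takes_c_string(gameName.c_str());
}
```

The same pattern applies to any C-style API: keep the std::string around for as long as the callee needs the pointer.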
    Anyway to get the above to work or should I just go back to char*?
    You should be using TCHAR, which maps to char or wchar_t depending on whether UNICODE and _UNICODE are #defined, as is evidenced by:
    cannot convert from 'const char [4]' to 'LPCWSTR'
    So now you need to use both std::string and std::wstring and a typical way of providing an stl string for use with unicode is to create a suitable alias, eg:
    Code:
    typedef std::basic_string<TCHAR> unicode_string;
    So, where you previously may have used std::string you just use unicode_string which will resolve to the correct type depending on the #definition of UNICODE and _UNICODE as previously mentioned.
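A minimal sketch of the alias in use. On Windows it pulls TCHAR and TEXT() from <windows.h>; elsewhere it falls back to narrow stand-ins, which are assumptions for illustration only. class_name_length is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

#ifdef _WIN32
#include <windows.h>          // real TCHAR and TEXT()
#else
// Stand-ins so the sketch builds off Windows (assumption: narrow build).
typedef char TCHAR;
#define TEXT(s) s
#endif

// Resolves to std::string or std::wstring depending on the build.
typedef std::basic_string<TCHAR> unicode_string;

inline std::size_t class_name_length() {
    unicode_string name = TEXT("Main");
    // c_str() is NUL-terminated in either build.
    assert(name.c_str()[name.size()] == 0);
    return name.size();
}
```

Because the literal goes through TEXT(), the same source line compiles whether or not UNICODE is defined.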
    If so, am I hurting my programming practices by not using UNICODE?
    If all you're ever writing are applications targeting English speakers then it's probably not a huge issue, although winnt/2k/xp are unicode natively, so any code that doesn't use unicode will suffer a slight performance hit as strings are converted internally by the operating system. Given that it just requires a few small modifications to your coding habits to ensure unicode compatibility, it would seem to me to be the best option to go with sooner rather than later. Just remember, if you are building unicode applications, to #define both UNICODE and _UNICODE, preferably in your compiler settings; msvc2005 (including the express edition) compilers all define these macros by default, as you have already observed. Any string literals you use should be defined with the _T or TEXT macros (#include <tchar.h>, too); if you're building exclusively for unicode then prefix your string literals with 'L' instead and just use wide character strings (or std::wstring).

    Your unicode programs will not run under win9x unless those systems have the Microsoft Layer for Unicode installed; it's probably safer to assume they won't, so just use the UNICODE and _UNICODE definitions and the TEXT and _T macros in your code, and typedef a suitable string/wstring alias that will resolve to the correct type when you build for the target platform.

    Search the boards as there have been a number of discussions about unicode in the past which may be of interest to you.
    CProgramming FAQ
    Caution: this person may be a carrier of the misinformation virus.

  3. #3
    Registered User
    Join Date
    Aug 2006
    Posts
    74
    Okay! That's a lot of information to take in, but I'm gonna use it all.

    Also, to anyone reading this I also stumbled across this site by accident, which greatly cleared up my questions regarding the difference between Multi-byte & Unicode and expands the above poster's (Ken Fitlike) answer/advice. (how Unicode is represented in memory; valid string manipulation functions to use; etc.):

    http://www.codeproject.com/string/cppstringguide1.asp
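A rough illustration of the multi-byte vs. wide distinction, with UTF-8 used only as a convenient multi-byte encoding (Windows ANSI/MBCS builds use the system code page, not UTF-8): one character can occupy several char units in a multi-byte string but a single wchar_t in a wide string. narrow_units, wide_units and check_counts are hypothetical helper names.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

// "e with acute accent" encoded as UTF-8: two char units for one character.
inline std::size_t narrow_units() {
    return std::strlen("\xC3\xA9");
}

// The same character (U+00E9) as a wide string: one wchar_t unit.
inline std::size_t wide_units() {
    return std::wcslen(L"\u00E9");
}

inline void check_counts() {
    assert(narrow_units() == 2);
    assert(wide_units() == 1);
}
```

This is why byte-oriented functions such as strlen count storage units rather than characters once text leaves plain ASCII.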

    Thanks again mate! Very good.


    Edit: Oop, found this link too: http://www.flipcode.com/articles/art...trings01.shtml
    Edit#2: Ooh, and this one: http://msdn2.microsoft.com/en-us/library/c426s321.aspx
    Last edited by Kurisu33; 10-07-2006 at 07:23 PM.

  4. #4
    Registered User
    Join Date
    Aug 2006
    Posts
    74
    Okay, my program is now set up to support both Unicode & non-Unicode. (Only took a few minutes.. nice! No hassle at all.)

    Anyways I still have a few questions:

    1) Ex:
    Code:
    typedef std::basic_string<TCHAR> UnicodeString;
    
    UnicodeString name = TEXT("Bob");
    name.c_str();
    .c_str() I did not find this located in documentation. What is it doing exactly? Just returning a char* or wchar_t* from my String object depending on whether Unicode is set or not?

    2) I'm now getting a WinMain function cannot be overloaded error. What is the correct definition under UNICODE? Below is what I am trying to use:

    Code:
    int APIENTRY WinMain(HINSTANCE instance, HINSTANCE prevInstance, LPTSTR cmdLine, int cmdShow);
    3)
    Code:
      handleDrag = (handleDragFunction)GetProcAddress((HMODULE)gameLibInst, TEXT("handleDrag"));
    gives this error:
    Code:
    error C2664: 'GetProcAddress' : cannot convert parameter 2 from 'const wchar_t [11]' to 'LPCSTR'
    ? Hmm... why is GetProcAddress not a UNICODE function? (Note: if I remove TEXT() from the 2nd param, it works.)

    4) The <tchar.h> header... why do I need to include this? Is it required for TEXT macro and stuff? If so, my program uses the TEXT macro without having to include this file, am I to assume my compiler is automatically including this when set to UNICODE character set? (MSVC 2005 Express Edition)


    Edit: For question 2 on entry points replacing WinMain() with wWinMain() seems to work, but now would my program be ANSI compatible? Why does WinMain() not work like other functions where depending on the character set the appropriate function is called through typedefs? (either WinMain() or wWinMain())
    Last edited by Kurisu33; 10-07-2006 at 09:20 PM.

  5. #5
    Cat Lover
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    109
    1. I believe c_str just returns const char * rather than changing whether or not you're using unicode.

    http://msdn2.microsoft.com/en-us/library/3372cxcy.aspx

    2. Not a clue off the top of my head.

    3. On GetProcAddress, according to http://blog.voidnish.com/?p=70 unless you're on Windows CE there's no unicode version.

    4. Wouldn't tchar.h contain the definitions for the TCHAR datatype and similar?
    If you create a non-empty Windows application it's automatically included already in stdafx.h I think.
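For point 3, a sketch of the call that compiles in both ANSI and Unicode builds: the export name is passed as a plain narrow literal with no TEXT() around it. The handleDragFunction signature below is a guess (the thread never shows it), find_handle_drag is a hypothetical helper, and the non-Windows branch only provides stand-ins so the snippet builds elsewhere.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins so the sketch compiles off Windows; assumptions, not the real API.
typedef void* HMODULE;
typedef void (*FARPROC)();
inline FARPROC GetProcAddress(HMODULE, const char*) { return nullptr; }
#endif

// Hypothetical signature for the exported function.
typedef void (*handleDragFunction)();

inline handleDragFunction find_handle_drag(HMODULE gameLibInst) {
    // Export names are narrow strings, so the second parameter is LPCSTR
    // in every build configuration: do not wrap the literal in TEXT().
    return reinterpret_cast<handleDragFunction>(
        GetProcAddress(gameLibInst, "handleDrag"));
}
```

Always check the returned pointer for null before calling through it, since the lookup fails if the module does not export that name.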

  6. #6
    Registered User
    Join Date
    Aug 2006
    Posts
    74
    Quote Originally Posted by Dweia
    1. I believe c_str just returns const char * rather than changing whether or not you're using unicode.
    Oops, it just dawned on me to do a little testing via the debugger.. Don't know why I didn't think of this before.. sometimes I'm a little dumb. It turns out that under Unicode c_str() returned a const wchar_t *. Of course, my string has to be of TCHAR type to get this to work.
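The debugger result can also be checked at compile time. The sketch below (C++11 or later assumed) asserts that c_str() returns a pointer to the string's own character type, so a std::basic_string<TCHAR> gives const char* in an ANSI build and const wchar_t* in a Unicode build. wide_length is a hypothetical helper.

```cpp
#include <cassert>
#include <cwchar>
#include <string>
#include <type_traits>
#include <utility>

// c_str() returns a pointer to const elements of the string's character type.
static_assert(std::is_same<decltype(std::declval<const std::string&>().c_str()),
                           const char*>::value,
              "std::string::c_str() yields const char*");
static_assert(std::is_same<decltype(std::declval<const std::wstring&>().c_str()),
                           const wchar_t*>::value,
              "std::wstring::c_str() yields const wchar_t*");

inline std::size_t wide_length(const std::wstring& name) {
    const wchar_t* raw = name.c_str();   // the type a Unicode build sees
    assert(raw[name.size()] == L'\0');   // always NUL-terminated
    return std::wcslen(raw);
}
```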

    Quote Originally Posted by Dweia
    3. On GetProcAddress, according to http://blog.voidnish.com/?p=70 unless you're on Windows CE there's no unicode version.
    Answered my question perfectly :P

    Quote Originally Posted by Dweia
    4. Wouldn't tchar.h contain the definitions for the TCHAR datatype and similar?
    If you create a non-empty Windows application it's automatically included already in stdafx.h I think.
    Hmm.. makes sense. I'm using TCHAR without <tchar.h> and an empty project, so no <stdafx.h>, so I guess <tchar.h> is automatically included in MSVC 2005 Express...


    Hmm.. just need to know about WinMain() vs wWinMain() now.. can I just use wWinMain() for both Unicode and ANSI? The 'w' stands for wide character set, no? So I'm a little unsure whether I can use it for ANSI or not... What I don't understand is that the two functions look identical!?! I see no changes from, say, (LPCSTR cmdLine -> LPWSTR cmdLine), so I do not know what sets the two functions apart.. do I need to try and write something like this for my code?:

    Code:
    #ifdef UNICODE
    #define WinMain  wWinMain
    #else
    #define WinMain  WinMain
    #endif
    Last edited by Kurisu33; 10-08-2006 at 10:55 AM.

  7. #7
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    <windows.h> contains TCHAR, LPTSTR, LPCTSTR, and the TEXT() macro. It switches on presence of the UNICODE macro.
    <tchar.h> contains _TCHAR, the _TEXT() and _T() macros (they are equivalent) and the macros _tmain and _tWinMain. It switches on the presence of the _UNICODE macro.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  8. #8
    Registered User
    Join Date
    Aug 2006
    Posts
    74
    Quote Originally Posted by CornedBee
    <windows.h> contains TCHAR, LPTSTR, LPCTSTR, and the TEXT() macro. It switches on presence of the UNICODE macro.
    <tchar.h> contains _TCHAR, the _TEXT() and _T() macros (they are equivalent) and the macros _tmain and _tWinMain. It switches on the presence of the _UNICODE macro.
    Ah thanks that clears things up... I was able to look in <tchar.h> and basically their macro was:

    Code:
    #ifdef _UNICODE
    #define _tWinMain wWinMain
    #else
    #define _tWinMain WinMain
    #endif
    So indeed Unicode and ANSI use different entry points.
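Putting that macro to use, here is a hedged sketch of a single entry point definition that builds as either WinMain or wWinMain. On Windows it relies on <tchar.h>; the #else branch holds stand-ins (assumptions mimicking an ANSI build) so the snippet also compiles elsewhere. The body is only a placeholder.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#include <tchar.h>            // _tWinMain
#else
// Stand-ins so the sketch compiles off Windows; these mimic an ANSI build
// and are assumptions, not the real header contents.
typedef void* HINSTANCE;
typedef char TCHAR;
typedef TCHAR* LPTSTR;
#define APIENTRY
#define _tWinMain WinMain
#endif

// Expands to WinMain in an ANSI build and wWinMain in a Unicode build,
// so the LPTSTR parameter always matches the chosen entry point.
int APIENTRY _tWinMain(HINSTANCE instance, HINSTANCE prevInstance,
                       LPTSTR cmdLine, int cmdShow)
{
    (void)instance; (void)prevInstance; (void)cmdShow;
    assert(cmdLine != nullptr);
    // Placeholder body: report whether any command-line text was passed.
    return cmdLine[0] != 0 ? 1 : 0;
}
```

Since _tWinMain keys off _UNICODE while LPTSTR keys off UNICODE, both macros need to be defined (or left undefined) together for the signature to line up.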

    Hmm.. looks like all my questions are fully answered.. thanks for all the replies
