Thread: VC++ not Unicode friendly? C2001: newline in constant

  1. #1
    Registered User
    Join Date
    Mar 2009
    Posts
    46

    VC++ not Unicode friendly? C2001: newline in constant

    I've got this little bit of simple code, and it produces lots of "C2001: newline in constant" errors.

    The odd thing is that the source code file is in Unicode (UTF-8) and the program is being compiled in Unicode mode (Character Set: Use Unicode Character Set). So why is it complaining?

    Code:
    #define ARROW_DOWN	138 /* '\x8A' */
    #define ARROW_LEFT	139 /* '\x8B' */
    #define ARROW_RIGHT	140 /* '\x8C' */
    #define ARROW_UP	141 /* '\x8D' */
    
    #ifdef WINDOWS
    #  define ARROW_N __T('↑')
    #  define ARROW_E __T('→')
    #  define ARROW_S __T('↓')
    #  define ARROW_W __T('←')
    #  define ARROW_NE __T('↗')
    #  define ARROW_SE __T('↘')
    #  define ARROW_SW __T('↙')
    #  define ARROW_NW __T('↖')
    #  define ARROW_WAIT __T('↻')
    
    /* Analogous to isdigit() etc. in <ctype.h> */
    #  define isarrow(c)	( \
    	(((int)(c) >= ARROW_DOWN) && ((int)(c) <= ARROW_UP)) \
    	|| (((int)(c) >= ARROW_W) && ((int)(c) <= ARROW_S)) \
    	|| (((int)(c) >= ARROW_NW) && ((int)(c) <= ARROW_SW)) \
    	|| ((int)(c) >= ARROW_WAIT) \
    	)
    
    #else
    
    #  define isarrow(c)	( \
    	(((int)(c) >= ARROW_DOWN) && ((int)(c) <= ARROW_UP)) \
    	)
    
    #endif /* WINDOWS */
    I need to have lots of Unicode text in my project or I will be put to a great deal of additional effort. If it's going to do this all the time I'll be really annoyed.

  2. #2
    Registered User
    Join Date
    Mar 2009
    Posts
    46
    Never mind, folks - I've found the problem.

    For future reference these errors occurred because I was saving with

    Unicode (UTF-8 without signature) - Codepage 65001

    as soon as I switched to

    Unicode (UTF-8 with signature) - Codepage 65001

    they disappeared.

    This is not nice behaviour on the part of VC++, but at least I can live with it (until I want to support Linux ;-).
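
    For anyone checking their own files, here is a minimal sketch of testing for that signature, which is the three bytes EF BB BF at the start of the file. The helper names are made up for illustration. My guess (not verified) is that without the signature the compiler decodes the file in the system ANSI code page, so the bytes of a multi-byte arrow can swallow the closing quote of the character literal.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: true if the buffer starts with the UTF-8 signature
// (byte order mark), i.e. the bytes EF BB BF.
bool has_utf8_signature(const std::string& bytes) {
    return bytes.size() >= 3 &&
           static_cast<unsigned char>(bytes[0]) == 0xEF &&
           static_cast<unsigned char>(bytes[1]) == 0xBB &&
           static_cast<unsigned char>(bytes[2]) == 0xBF;
}

// Hypothetical helper: reads the first three bytes of a file and checks them.
// Returns false if the file cannot be opened or is shorter than three bytes.
bool file_has_utf8_signature(const char* path) {
    assert(path != nullptr);
    std::ifstream in(path, std::ios::binary);
    char head[3] = {0, 0, 0};
    in.read(head, 3);
    return in.gcount() == 3 && has_utf8_signature(std::string(head, 3));
}
```

    Later MSVC releases (Visual Studio 2015 Update 2 onward) also accept a /utf-8 compiler switch that treats source files without a signature as UTF-8, which may be an alternative to re-saving every file.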

  3. #3
    Registered User
    Join Date
    Oct 2008
    Posts
    1,262
    To be honest, I've found Unicode support to be lacking in general across the C++ world (though C++0x is going to fix that, at least to some extent). If you only need it to run under VC++, there may be some easy method that works. However, that would completely kill all portability.

    If you need it to be portable, I'd advise you to do this: pick one of UTF-8, UTF-16 or UTF-32 to store all data in. UTF-32 is a lot easier to work with, UTF-8 is generally a lot smaller, and UTF-16 is in the middle (and commonly the worst of both worlds instead of the best).
    Now, either get or write a string class that supports your choice. For UTF-32 you can just use std::basic_string<uint32_t> or something similar. You'll probably also have to write some conversion routines between the UTFs (at least to and from UTF-8), but the algorithms described on the Wikipedia pages are trivial to implement.
    Then write all the text to a text file, stored in the UTF you picked (or a different one). When loading, decode it appropriately, convert it if required, and load it into the string class.
    It's even more fun when you actually want to display it, because that is completely OS-dependent. Want to show it in a Windows message box? That's UTF-16. Want to show it in a Linux console? UTF-8 (well, in my case it was). Some other Linux window manager? Probably depends on which one.

    All in all, a mess to use if you need it to be portable.
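
    To give an idea of what one of those conversion routines looks like, here is a minimal sketch of UTF-32 to UTF-8. It assumes a C++11 compiler (for std::u32string); the function names are my own, and replacing invalid code points with U+FFFD is just one possible policy.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Appends the UTF-8 encoding of one code point to `out`.
// Returns false (and appends nothing) for surrogates and values > U+10FFFF.
bool append_utf8(std::uint32_t cp, std::string& out) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx then three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return true;
}

// Converts a whole UTF-32 string; invalid code points become U+FFFD.
std::string utf32_to_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t c : in) {
        if (!append_utf8(static_cast<std::uint32_t>(c), out)) {
            const bool ok = append_utf8(0xFFFD, out);  // replacement character
            assert(ok);
            (void)ok;
        }
    }
    return out;
}
```

    For example, utf32_to_utf8(U"\u2191") should give the three bytes E2 86 91. The reverse direction needs more validation (truncated and overlong sequences), which is where most of the work is.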

  4. #4
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> # define ARROW_N __T('↑')
    Two issues:
    1) There's no point in using __T(). TCHARs are for code that needs to support SBCS, MBCS, and Unicode at the same time (depending on the build target). What you want to use is the Unicode character U+2191. So you could do something like this:
    Code:
    const wchar_t ARROW_N = L'\u2191';
    2) Don't use extended characters within your source code. There's no telling what you'll get after the compiler has translated from the "source character set" to the "execution character set", since that translation is implementation-defined.
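
    Putting both points together, the whole arrow set could be sketched like this, with the source file staying pure ASCII. The constant names come from the original post; is_unicode_arrow is a made-up name, and unlike the original macro it matches the wait arrow exactly rather than everything at or above it, which is my assumption about the intent.

```cpp
#include <cassert>

// Arrow code points written as universal character names.
const wchar_t ARROW_W    = L'\u2190';  // LEFTWARDS ARROW
const wchar_t ARROW_N    = L'\u2191';  // UPWARDS ARROW
const wchar_t ARROW_E    = L'\u2192';  // RIGHTWARDS ARROW
const wchar_t ARROW_S    = L'\u2193';  // DOWNWARDS ARROW
const wchar_t ARROW_NW   = L'\u2196';  // NORTH WEST ARROW
const wchar_t ARROW_NE   = L'\u2197';  // NORTH EAST ARROW
const wchar_t ARROW_SE   = L'\u2198';  // SOUTH EAST ARROW
const wchar_t ARROW_SW   = L'\u2199';  // SOUTH WEST ARROW
const wchar_t ARROW_WAIT = L'\u21BB';  // CLOCKWISE OPEN CIRCLE ARROW

// Hypothetical replacement for the Unicode half of the isarrow() macro.
inline bool is_unicode_arrow(wchar_t c) {
    return (c >= ARROW_W && c <= ARROW_S) ||
           (c >= ARROW_NW && c <= ARROW_SW) ||
           c == ARROW_WAIT;
}
```

    A function also avoids the macro's pitfalls of evaluating its argument several times and casting only part of an expression.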

    gg

  5. #5
    Registered User
    Join Date
    Mar 2009
    Posts
    46
    Quote Originally Posted by Codeplug View Post
    >> # define ARROW_N __T('↑')
    Two issues:
    1) There's no point in using __T(). TCHARs are for code that needs to support SBCS, MBCS, and Unicode at the same time (depending on the build target). What you want to use is the Unicode character U+2191. So you could do something like this:
    Code:
    const wchar_t ARROW_N = L'\u2191';
    I'm not using __T() for its MBCS / SBCS functionality; I'm using it as a handy-to-remember macro name that I can redefine or search/replace at a later point.

  6. #6
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    Then I would suggest making and using your own macro. __T() has an implied purpose when used (at least when others are reading your code).
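
    Something along these lines would do, as a rough sketch. WIDE_LIT is a hypothetical name; pick whatever suits your project.

```cpp
#include <cassert>

// Hypothetical project-specific macro: for now it simply produces a wide
// literal, but it can be redefined in one place later. The two-level
// definition lets the argument be macro-expanded before L is pasted on.
#define WIDE_LIT_(x) L##x
#define WIDE_LIT(x)  WIDE_LIT_(x)

const wchar_t ARROW_N = WIDE_LIT('\u2191');
```

    That keeps the search/replace convenience without suggesting to readers that the code is meant to build as both ANSI and Unicode.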

    gg

  7. #7
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by PaulBlay View Post
    Never mind, folks - I've found the problem.

    For future reference these errors occurred because I was saving with

    Unicode (UTF-8 without signature) - Codepage 65001

    as soon as I switched to

    Unicode (UTF-8 with signature) - Codepage 65001

    they disappeared.

    This is not nice behaviour on the part of VC++, but at least I can live with it (until I want to support Linux ;-).
    This is not nice of you.
    How is VC++ supposed to know what format the file is saved in without a signature?

  8. #8
    Registered User
    Join Date
    Mar 2009
    Posts
    46
    Quote Originally Posted by Elysia View Post
    This is not nice of you.
    How is VC++ supposed to know what format the file is saved in without a signature?
    By looking at it. Visual Studio does, after all, display the file correctly in the IDE. (As set with the "Auto-detect UTF-8 encoding without signature." option) So the left hand knows what it's doing - it didn't occur to me that the right hand hadn't been informed.
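
    I don't know how the IDE's auto-detect option is actually implemented, but "looking at it" can be as simple as checking that the bytes form well-formed UTF-8, since text in a legacy code page with accented or symbol characters rarely does. A minimal sketch of that heuristic, with function names of my own choosing:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if `s` is well-formed UTF-8: no stray or missing continuation bytes,
// no overlong forms, no surrogates, nothing above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;   // number of continuation bytes expected
        unsigned long cp = 0;    // decoded code point
        if (b < 0x80)                    { extra = 0; cp = b; }
        else if (b >= 0xC2 && b <= 0xDF) { extra = 1; cp = b & 0x1F; }
        else if (b >= 0xE0 && b <= 0xEF) { extra = 2; cp = b & 0x0F; }
        else if (b >= 0xF0 && b <= 0xF4) { extra = 3; cp = b & 0x07; }
        else return false;               // stray continuation or invalid lead byte
        if (i + extra >= n) return false;  // sequence runs past the end
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a 10xxxxxx byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (extra == 2 && cp < 0x800) return false;    // overlong 3-byte form
        if (extra == 3 && cp < 0x10000) return false;  // overlong 4-byte form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        if (cp > 0x10FFFF) return false;
        i += extra + 1;
    }
    return true;
}

// Heuristic: call it UTF-8 only if it validates and has at least one
// non-ASCII byte, because pure ASCII looks the same in either encoding.
bool looks_like_utf8(const std::string& s) {
    bool has_high = false;
    for (char ch : s) {
        if (static_cast<unsigned char>(ch) >= 0x80) has_high = true;
    }
    return has_high && is_valid_utf8(s);
}
```

    It is only a guess, though: a short file in a legacy code page can happen to validate, which is presumably why a compiler would rather have the signature.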
