winapi: how to add unicode character according to its unicode-number?

  1. #1
    Registered User
    Join Date
    Dec 2013
    Posts
    108

    For example, if I have an LPWSTR, an LPTSTR, or even a single wchar_t variable, and I would like to add to the string (or assign to the single variable) the Unicode character whose code is U+1F37A (beer mug), how do I do it? I tried looking for a code-to-character function in the WinAPI but did not find any.
    Any ideas, guys?
    Thanks in advance.

  2. #2
    Registered User whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    7,744
    In C99 you can encode the data as a universal character name.
    Code:
    wchar_t beer_mug[] = L"\U0001f37a";
    That creates a string containing the beer mug character, followed by the terminating 0.
    Last edited by whiteflags; 08-23-2014 at 03:21 PM.
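A minimal sketch of the same idea for readers without a Windows toolchain, assuming a C11 compiler that provides <uchar.h> and encodes u"..." literals as UTF-16 (GCC and Clang do). wchar_t is 16 bits on Windows but often 32 bits elsewhere, so char16_t is used here as a stand-in to show what the literal holds in UTF-16. The name beer_mug16 is illustrative only.

```c
#include <stddef.h>
#include <uchar.h>   /* char16_t (C11) */

/* u"..." produces UTF-16 code units on compilers that define __STDC_UTF_16__,
   which matches what a wchar_t string holds on Windows. */
static const char16_t beer_mug16[] = u"\U0001f37a";

/* Number of array elements, including the terminating 0. */
static const size_t beer_mug16_count = sizeof beer_mug16 / sizeof beer_mug16[0];
```

The array has three elements: the two surrogate code units 0xD83C and 0xDF7A, then the terminating 0.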

  3. #3
    Registered User
    Join Date
    Dec 2013
    Posts
    108
    I'm trying to compile the line you gave as C++11 code, and the compiler gives the error:
    [Error] int-array initialized from non-wide string
    Trying to use the line as
    Code:
    wchar_t beer_mug = '\U0001f37a';
    only makes it appear as a Chinese character rather than a beer mug.

    Edit:
    I will also add that my code defines the Unicode macros:
    Code:
    #ifndef UNICODE
    #define UNICODE
    #endif
    
    #ifndef _UNICODE
    #define _UNICODE
    #endif
    Last edited by Dave11; 08-23-2014 at 03:24 PM.
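One plausible explanation for the Chinese character, offered as an assumption rather than a diagnosis: the error text looks like GCC's, and GCC treats a narrow '\U0001f37a' as a multi-character constant built from the four UTF-8 bytes of U+1F37A. Storing that int in a 16-bit wchar_t keeps only the low 16 bits. The sketch below reproduces the arithmetic portably; pack_multichar and truncate_to_16 are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Combine up to four bytes the way a multi-character constant is commonly
   evaluated: each new byte shifts the previous ones left by 8 bits. */
uint32_t pack_multichar(const unsigned char *bytes, int n)
{
    uint32_t value = 0;
    assert(n >= 0 && n <= 4);
    for (int i = 0; i < n; ++i)
        value = (value << 8) | bytes[i];
    return value;
}

/* Keep only the low 16 bits, as a conversion to a 16-bit wchar_t would. */
uint16_t truncate_to_16(uint32_t value)
{
    return (uint16_t)(value & 0xFFFFu);
}
```

With the UTF-8 bytes F0 9F 8D BA the packed value is 0xF09F8DBA, and the low 16 bits are 0x8DBA, which lies in the CJK Unified Ideographs block. The value of a multi-character constant is implementation-defined, so other compilers may behave differently.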

  4. #4
    Registered User whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    7,744
    A couple of things:

    First, not every Unicode character fits in a single wchar_t. To be specific, Unicode unambiguously identifies characters by code point, and different encodings need different amounts of memory for the same code point. In UTF-32 the code point takes 4 bytes. In UTF-16, which is what wchar_t holds on Windows, this one takes 2 wchar_t code units (a surrogate pair). The data was meant to be encoded as a string like I showed you. That said, I did make a mistake: I forgot to prefix the string literal with an L. You can see that I have corrected it above.

    You can rest assured that \U0001f37a unambiguously refers to beer.

    Second, you will have to use a font that supports the beer mug glyph to actually see it. You can see a list here. When I try it myself, it prints nothing like a beer mug, but not all fonts have that glyph in the first place. Actually displaying the glyph would require more WinAPI code, and unfortunately I'm not prepared to give you that yet. I need time to do more research and see what works.
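For code points that are only known at run time, the two 16-bit units that UTF-16 needs for this character can be computed by hand. This is a minimal sketch of the standard UTF-16 encoding rule; utf16_encode is a hypothetical helper, not a WinAPI function, and uint16_t stands in for the 16-bit wchar_t used on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16.
   Writes 1 or 2 code units to out and returns how many were written,
   or 0 if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {              /* Basic Multilingual Plane: one unit */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                   /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

For U+1F37A this yields 0xD83C followed by 0xDF7A; on Windows the units can be copied into a wchar_t buffer and followed by a terminating 0.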

  5. #5
    Registered User
    Join Date
    Dec 2013
    Posts
    108
    The rendering of the glyphs was solved when I created a font handle for "Symbola" and set it with WM_SETFONT, which made all the glyphs render perfectly!
    Thank you very much!

  6. #6
    Registered User Alpo's Avatar
    Join Date
    Apr 2014
    Posts
    469
    I can't really add anything, but I had a question for y'all. Do you specifically use the wchar_t type when using Unicode, or do you use TCHAR?

    I ask because I'm reading "Programming Windows" (5th ed.), and all the programs use TCHAR (and the TEXT token-pasting macro), but it's a bit old, so I wasn't sure whether people nowadays just use Unicode exclusively.

  7. #7
    Registered User whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    7,744
    TCHAR just resolves to wchar_t when you define _UNICODE and build. I was specifically talking about wchar_t so I didn't bother. In a real program, its only use is porting programs to non-Unicode supporting platforms.
    Last edited by whiteflags; 08-24-2014 at 12:34 AM.
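Roughly, the mechanism looks like the sketch below. It uses hypothetical demo_ names instead of the real TCHAR and _tcslen so that it compiles without the Windows headers; DEMO_UNICODE stands in for UNICODE / _UNICODE.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

#define DEMO_UNICODE 1   /* hypothetical stand-in for UNICODE / _UNICODE */

#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;      /* plays the role of TCHAR in a Unicode build */
#define demo_tcslen wcslen       /* plays the role of _tcslen */
#else
typedef char demo_tchar;         /* narrow build: plain char */
#define demo_tcslen strlen
#endif
```

With DEMO_UNICODE defined, demo_tchar is wchar_t and demo_tcslen resolves to wcslen; removing the define switches both to the narrow versions without touching the code that uses them.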

  8. #8
    Registered User Alpo's Avatar
    Join Date
    Apr 2014
    Posts
    469
    Quote: Originally posted by whiteflags
    TCHAR just resolves to wchar_t when you define _UNICODE and build. I was specifically talking about wchar_t so I didn't bother. In a real program, its only use is porting programs to non-Unicode supporting platforms.
    I see, thanks. Yeah, the book actually has a whole chapter on the whole _UNICODE definition thing. The idea seemed pretty clever to me, although I didn't really get why they needed to make redefinitions of the symbolic constants they were using.

    For instance the TEXT macro starts off as __T, and then later gets a redefinition for convenience to _T, then in another place to TEXT. I wonder why they just didn't call it TEXT to begin? lol.

  9. #9
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    4,344
    For instance the TEXT macro starts off as __T, and then later gets a redefinition for convenience to _T, then in another place to TEXT. I wonder why they just didn't call it TEXT to begin?
    O_o

    Because each was created with a distinct usage in mind, as a form of documentation.

    At least, that is what some Microsoft employees have said. I'd find it more likely that programmer Fred created `_T' due to being unaware that programmer Bob had made `TEXT' for the same purposes.

    Soma
    “Often out of periods of losing come the greatest strivings toward a new winning streak.” -- Fred Rogers
    “Salem Was Wrong!” -- Pedant Necromancer

  10. #10
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,680
    I understand it as: one is for the CRT, and one is for the Win32 API.

    "TEXT" (keyed on "UNICODE") is in WinNT.h.
    "_T" and "_TEXT" (keyed on "_UNICODE") are in tchar.h.

    [edit]Ah, "leading underscore followed by uppercase is reserved for the implementation" - I'm guessing[/edit]

    gg
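On the layering of the macros: as far as I know, both headers define the pasting macro in two steps (__TEXT then TEXT in WinNT.h, __T then _T in tchar.h) so that an argument which is itself a macro gets expanded before the L prefix is pasted on. A minimal sketch with hypothetical DEMO_ names, not the real header contents:

```c
#include <wchar.h>

#define DEMO_WIDEN_RAW(x) L##x               /* like __T / __TEXT: pastes immediately */
#define DEMO_WIDEN(x)     DEMO_WIDEN_RAW(x)  /* like _T / TEXT: expands x first */

#define GREETING "beer"

/* DEMO_WIDEN(GREETING) first expands GREETING to "beer", then pastes: L"beer".
   DEMO_WIDEN_RAW(GREETING) would paste first and produce the token LGREETING,
   which is not defined anywhere. */
static const wchar_t wide_greeting[] = DEMO_WIDEN(GREETING);
```

That accounts for the double-underscore helper; the separate TEXT and _T families come from the two headers listed above.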
