Thread: Is this the right way to convert from utf16 to utf32?

  1. #1
    Registered User awsdert
    Join Date
    Jan 2015
    Posts
    1,737

    I'm writing a library for converting to/from utf32, partly because iconv() offers no way to determine how much memory is needed before the conversion.

    The other reason is that MultiByteToWideChar()/WideCharToMultiByte() are awkward to work with. I need at least char, utf8, utf16, utf32 and wchar_t support by default, so I'm writing the LE variants first and will move on to the BE variants once I have the LE ones to base them on.

    This is what I have for UTF16-LE so far:
    Code:
    int64_t libpawmbe_getc( void const *src, size_t lim, size_t *did )
    {
    	char16_t const *txt = src;
    	char16_t c = txt[0];
    	if ( lim < sizeof(char16_t) )
    		return -PAWMSGID_INCOMPLETE;
    	if ( PAWINTU_BEWTEEN(0xDC00,c,0xDFFF) )
    		return -PAWMSGID_INVALIDPOS;
    	if ( PAWINTU_BEWTEEN(0xD800,c,0xDBFF) )
    	{
    		if ( lim < sizeof(char32_t) )
    			return -PAWMSGID_INCOMPLETE;
    		*did = sizeof(char32_t);
    		return ((char32_t)(c & 0x3FF) << 10) | (txt[1] & 0x3FF);
    	}
    	*did = sizeof(char16_t);
    	return  (c >= 0xE000) ? (c - 0xE000) + 0xD800 : c;
    }
    I'm confident I've understood the other formats correctly, but not this one. wchar_t will be done the same way I did char, with a temporary "hack" that uses the mbstate_t related stuff.

  2. #2
    Registered User awsdert
    Join Date
    Jan 2015
    Posts
    1,737
    Never mind, I also asked on reddit and was told the last return statement is wrong; I should just return c as is.

  3. #3
    Registered User
    Join Date
    Dec 2017
    Posts
    1,664
    Did they also mention that you need to add 0x10000 to the 20 bits you've extracted from a surrogate pair?

    Also, your code assumes that the hardware is LE, but what if the hardware is BE?
    All truths are half-truths. - A.N. Whitehead
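
For readers following along, here is a minimal sketch of how the points raised in the replies might fit together: the 0x10000 offset is added to the 20 bits taken from a surrogate pair, non-surrogate units are returned as is, and the input is read byte by byte so the result does not depend on host endianness. The names utf16le_getc, load_u16le, U16_INCOMPLETE and U16_INVALIDPOS are hypothetical stand-ins, not part of the original poster's library, and the check that the second unit really is a low surrogate is an extra assumption beyond what the thread discusses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error codes standing in for the thread's PAWMSGID_* values. */
enum { U16_INCOMPLETE = 1, U16_INVALIDPOS = 2 };

/* Assemble one little-endian UTF-16 code unit from two bytes.
   Reading bytes explicitly gives the same result on LE and BE hosts. */
static uint16_t load_u16le(unsigned char const *p)
{
	return (uint16_t)(p[0] | ((unsigned)p[1] << 8));
}

/* Decode one code point from UTF-16LE input.
   lim is the number of bytes available at src.
   On success, *did receives the number of bytes consumed and the
   code point is returned; on failure a negative error code is returned. */
int64_t utf16le_getc(void const *src, size_t lim, size_t *did)
{
	unsigned char const *bytes = src;
	uint16_t hi, lo;

	/* Check the length before touching the buffer. */
	if (lim < 2)
		return -U16_INCOMPLETE;
	hi = load_u16le(bytes);

	/* A low surrogate cannot start a sequence. */
	if (hi >= 0xDC00 && hi <= 0xDFFF)
		return -U16_INVALIDPOS;

	/* Anything outside the high-surrogate range is the code point itself. */
	if (hi < 0xD800 || hi > 0xDBFF) {
		*did = 2;
		return hi;
	}

	/* High surrogate: a second code unit must follow. */
	if (lim < 4)
		return -U16_INCOMPLETE;
	lo = load_u16le(bytes + 2);
	if (lo < 0xDC00 || lo > 0xDFFF)
		return -U16_INVALIDPOS;

	*did = 4;
	/* 10 bits from each surrogate, plus the 0x10000 offset. */
	return 0x10000 + (((int64_t)(hi & 0x3FF) << 10) | (lo & 0x3FF));
}
```

With this sketch, the bytes 3D D8 00 DE (the surrogate pair D83D DE00 in little-endian order) decode to U+1F600 with 4 bytes consumed, while the same buffer with lim set to 2 reports an incomplete sequence. A BE variant would presumably only need a different load helper.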
