Unicode file I/O

**cunnus88** · 05-04-2006

Well, I just discovered that, for the novice, implementing Unicode I/O in C++ is about as fun as having your teeth pulled out.

After scouring the web for hours, I found how to do it right here:

http://cboard.cprogramming.com/showt...hlight=wchar_t

Code:

wchar_t BOM = 0xFEFF;

What I don't understand about the solution is why the byte order mark looks inverted (i.e. why it is 0xFEFF and not 0xFFFE), and why it gets flipped into the normal byte order when it is written.

Inferring from this, I thought all wide characters were flipped when written or read. So if 'b' were stored in a text file as 0x6200, I would have to manually flip it to 0x0062 before appending it to my internal wstring.
Well, I was wrong.

So what is it? Is Unicode inherently big-endian or little-endian (I'd assumed the former)? And can I go on using code like the following (which would mean Unicode is little-endian):

Code:

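wstring wstr;
char ch[2];
// Test the read itself, so a failed read at end of file appends nothing.
while (ifilestream.read(ch, 2))
{
	// *ch is only the first of the two bytes that were read.
	wstr += *ch;
}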

**Salem** · 05-05-2006

> And can I go on using code like the following (which would mean Unicode is little-endian):

I would suggest you use the wide character types.
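
For reference, Unicode itself has no byte order. The BOM is always the code point U+FEFF, and the encoding decides how its two bytes are laid out: a little-endian UTF-16 writer stores the low byte first, so the value 0xFEFF appears on disk as FF FE, while a big-endian writer stores FE FF. Because U+FFFE is a noncharacter, a reader can tell the two apart from the first two bytes. The following is a minimal sketch of that idea, not code from the thread: `decode_utf16` and `read_utf16_file` are hypothetical helper names, the input is assumed to be UTF-16, and a file without a BOM is assumed to be little-endian.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not from the thread): decodes raw UTF-16 bytes into a
// std::wstring, one 16-bit code unit per wchar_t. Surrogate pairs are copied
// through unchanged, not combined into a single code point.
std::wstring decode_utf16(const std::string& bytes) {
    std::wstring out;
    bool little_endian = true;  // assumption: no BOM means little-endian
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const unsigned b0 = static_cast<unsigned char>(bytes[i]);
        const unsigned b1 = static_cast<unsigned char>(bytes[i + 1]);
        // A leading BOM selects the byte order and is not part of the text.
        if (i == 0 && b0 == 0xFF && b1 == 0xFE) { little_endian = true; continue; }
        if (i == 0 && b0 == 0xFE && b1 == 0xFF) { little_endian = false; continue; }
        // Assemble the code unit from both bytes in the detected order.
        const unsigned unit = little_endian ? (b0 | (b1 << 8)) : ((b0 << 8) | b1);
        out += static_cast<wchar_t>(unit);
    }
    return out;
}

// Reads the whole file in binary mode, so no newline translation touches the
// bytes, then decodes it.
std::wstring read_utf16_file(const char* path) {
    std::ifstream in(path, std::ios::binary);
    const std::string bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    return decode_utf16(bytes);
}
```

With this sketch, the bytes 62 00 after an FF FE mark decode to 0x0062 ('b'), and so do the bytes 00 62 after an FE FF mark. The flipping depends on the file's byte order as announced by the BOM, not on Unicode itself.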
