    Well, I just discovered that, for a novice, implementing Unicode I/O in C++ is about as fun as having your teeth pulled.

    After scouring the web for hours, I found how to do it right here:

    wchar_t BOM = 0xFEFF;
    What I don't understand about the solution is why the byte order mark looks inverted (i.e. why it is not 0xFFFE), and why it is flipped into the normal byte order when it is written.

    Inferring from this, I thought all wide characters were flipped when written or read. So if 'b' is represented in a text file as 0x6200, I would have to manually flip it to 0x0062 before appending it to my internal wstring.
    Well, I was wrong.
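    For what it's worth, the apparent "flip" can be seen by dumping the bytes of a 16-bit value as the machine stores them. This is only a minimal sketch: bytes_in_memory and host_is_little_endian are names made up for illustration, and a 16-bit code unit stands in for wchar_t, whose size varies between platforms.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy a 16-bit code unit into raw bytes exactly as
// the host stores it in memory (no conversion is applied).
std::array<unsigned char, 2> bytes_in_memory(std::uint16_t unit) {
    std::array<unsigned char, 2> out{};
    std::memcpy(out.data(), &unit, sizeof unit);
    return out;
}

// Hypothetical helper: true when the host stores the least significant
// byte first (little-endian).
bool host_is_little_endian() {
    return bytes_in_memory(0x0001)[0] == 0x01;
}
```

    On a little-endian machine, bytes_in_memory(0xFEFF) gives FF then FE, which is the order a hex editor shows at the start of the file. The value itself is never changed; only its byte layout depends on the machine.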

    So what is it? Is Unicode inherently big or little endian (I'd assumed it was the former)? And can I go on using code like the following (which would mean Unicode is little endian):
    wstring wstr;
    char ch[2];
    {, 2);
    	wstr += *ch;

    > And can I go on using code like the following (which would mean Unicode is little endian):
    I would suggest you use the wide character types.
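    Unicode itself is neither big nor little endian; byte order only comes into play once code units are serialised as UTF-16 bytes. As a rough sketch of handling that explicitly rather than relying on how your machine happens to lay out memory, something like the following could work. decode_utf16 is a made-up name, std::u16string is used instead of wstring because wchar_t is 16 bits on Windows but usually 32 bits elsewhere, and the big-endian default for input without a BOM is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turn raw UTF-16 bytes into 16-bit code units.
// A leading BOM (FF FE or FE FF) selects the byte order and is skipped.
// Assumption: input without a BOM is treated as big-endian.
std::u16string decode_utf16(const std::vector<unsigned char>& bytes) {
    bool little_endian = false;
    std::size_t pos = 0;
    if (bytes.size() >= 2) {
        if (bytes[0] == 0xFF && bytes[1] == 0xFE) {
            little_endian = true;  // BOM 0xFEFF stored low byte first
            pos = 2;
        } else if (bytes[0] == 0xFE && bytes[1] == 0xFF) {
            pos = 2;               // BOM 0xFEFF stored high byte first
        }
    }

    std::u16string out;
    // Combine each byte pair into one code unit; a trailing odd byte is ignored.
    for (; pos + 1 < bytes.size(); pos += 2) {
        unsigned hi = little_endian ? bytes[pos + 1] : bytes[pos];
        unsigned lo = little_endian ? bytes[pos] : bytes[pos + 1];
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;
}
```

    With this, both FF FE 62 00 and FE FF 00 62 decode to the single code unit 0x0062 ('b'), so the calling code never has to swap anything by hand. Surrogate pairs are left as two code units; this sketch does not combine them.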

