wchar_t, i18n, l10n and other oddities
Greetings all.
I'm writing myself a little GTK app and I'd like to have proper internationalization/localization support. I've done a lot of reading on the subject, but a couple of things still baffle me:
I've decided to default to UTF-8, which seems to be the norm for most programs now. I understand that UTF-8 data can be stored in plain char arrays; however, I've also read that I should use wchar_t throughout my program in case I ever need to switch to UTF-16 or another encoding. Is this true, or should I simply use char?
It's my understanding that wchar_t both increases a program's memory usage rather dramatically and is platform dependent (2 bytes on Win32, 4 on Linux, etc.), so that's my most pressing question...
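To make the memory concern concrete, here is a minimal sketch of how I'd compare the two storage schemes for the same text. The helper names narrow_bytes and wide_bytes are my own invention for illustration, and the actual numbers depend on what sizeof(wchar_t) is on the platform in question.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Bytes used by a char string (e.g. UTF-8 data), not counting the terminator. */
size_t narrow_bytes(const char *s)
{
    return strlen(s);
}

/* Bytes used by the same text stored one wchar_t per element,
   not counting the terminator. sizeof(wchar_t) is platform dependent. */
size_t wide_bytes(const wchar_t *s)
{
    return wcslen(s) * sizeof(wchar_t);
}
```

For plain ASCII text, narrow_bytes("hello") is 5, while wide_bytes(L"hello") is 5 * sizeof(wchar_t), so the overhead scales directly with whatever size the platform picks.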
Should I be using wchar_t or std::wstring?
What really is the difference between char/std::string and wchar_t/std::wstring besides basic encapsulation? Does it merely exist to make die-hard C++ people warm and fuzzy, or is there a real technical reason why the [w]string class is superior to simple [w]char arrays? (Sorry, I come from a C background.)
How does encoded (UTF-8,UTF-16,etc) data work across a network? Is it basically the same as using char or are there "weird" things I need to know/look out for?
Along those lines, how does data look to a human if saved to a file? I'd like people in the US to be able to see the data in a regular text editor, but I'm afraid that if everything is UTF-8 encoded, it'll look like gobbledygook....
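From what I've read, plain ASCII text is byte-for-byte identical in UTF-8, and only characters outside ASCII take more than one byte. A rough sketch of how I'd check a buffer for that is below; the function names are hypothetical, and utf8_code_points assumes its input is already valid UTF-8.

```c
#include <stddef.h>

/* Returns 1 if every byte is below 0x80, i.e. the UTF-8 data is also
   plain ASCII and should look normal in any text editor. */
int is_plain_ascii(const char *s)
{
    for (; *s != '\0'; ++s) {
        if ((unsigned char)*s >= 0x80)
            return 0;
    }
    return 1;
}

/* Counts code points by skipping UTF-8 continuation bytes (10xxxxxx).
   Assumes the input is valid UTF-8; no validation is done here. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

With these, "hello" is plain ASCII with 5 code points in 5 bytes, while "caf\xC3\xA9" (the word "café" as UTF-8 bytes) is not plain ASCII and holds 4 code points in 5 bytes.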
Sorry for all the questions and I really appreciate any insight you can give me.
j