Hello everyone,
Does skipws work for both char-based and wchar_t-based string streams? I have not found a formal clarification on MSDN.
http://msdn2.microsoft.com/en-us/library/98bsd5x4.aspx
thanks in advance,
George
Yes.
All the buzzt!
CornedBee
"There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
- Flon's Law
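To illustrate the "Yes" above, here is a minimal sketch. It assumes a standard-conforming library; `read_chars` is a hypothetical helper written for this example and is not part of any API. Because the string stream classes are instantiations of `std::basic_istringstream<CharT>`, the same manipulator applies to both character types:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: extracts characters one at a time with formatted
// input, with skipws either set or cleared on the stream.
// Templated on the character type, so it covers char and wchar_t alike.
template <typename CharT>
std::basic_string<CharT> read_chars(const std::basic_string<CharT>& text, bool skip)
{
    std::basic_istringstream<CharT> in(text);
    if (skip)
        in >> std::skipws;    // the default state for a new stream
    else
        in >> std::noskipws;

    std::basic_string<CharT> out;
    CharT c;
    while (in >> c)           // formatted extraction honours the skipws flag
        out += c;
    return out;
}
```

Calling `read_chars(std::string(" a b\tc\n"), true)` and `read_chars(std::wstring(L" a b\tc\n"), true)` should both drop the whitespace, while passing `false` should return the text unchanged.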
Don't mistake narrow strings for ANSI or wide strings for UNICODE. First, both terms are stupid Microsoftisms with little connection to proper terminology. Second, nothing in the C++ standard says what encodings the strings have to be in.
But '\n' and L'\n' and the other pairs are indeed supposed to have the same semantic meaning when printed.
All the buzzt!
CornedBee
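A small sketch of the point about '\n' and L'\n', assuming the classic "C" locale; `is_space_in_classic_locale` is a hypothetical name used only for this example. The locale-aware `std::isspace` from `<locale>` is a template over the character type, so the narrow and wide newline are classified through the same interface:

```cpp
#include <locale>

// Hypothetical helper: asks the classic "C" locale whether a character is
// whitespace. Works for char and wchar_t because std::isspace(c, loc) is a
// template that looks up the ctype facet for the given character type.
template <typename CharT>
bool is_space_in_classic_locale(CharT c)
{
    return std::isspace(c, std::locale::classic());
}
```

Both `is_space_in_classic_locale('\n')` and `is_space_in_classic_locale(L'\n')` should report true, whatever bytes the implementation uses to store them.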
Compilers can produce warnings - make the compiler programmers happy: Use them!
Please don't PM me for help - and no, I don't do help over instant messengers.
\n is a newline. But the encoding of this newline is implementation-defined.
ANSI and UNICODE are the stupid terms. ANSI is the American National Standards Institute. Win32 came to misuse the term to mean "an encoding specified by ANSI", like ASCII, but this usage is extremely misleading. The default narrow character set on US or Western European Windows installations is Windows-1252, an adaptation of ISO-8859-1 that is standardized by no one. There are various other Windows-* codepages, all called ANSI, but very few, if any, are standardized. Even Microsoft says it's stupid. I quote Wikipedia:
Microsoft has stated that "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community"
As for UNICODE, the term is derived from the macro name that switches the Win32 API generic names between the multibyte and wide character variants (A and W suffixes). The real Unicode document and consortium are not spelled all-uppercase. Also, Unicode is a character set and a set of algorithms for properly handling international character data. It also defines a number of encodings for this character set, the most important of which are UTF-8, UTF-16 and UTF-32. What Windows programmers refer to as UNICODE is really UTF-16, or (in Windows NT) even the crippled UCS-2.
All the buzzt!
CornedBee
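To make the difference between the Unicode character set and its encodings concrete, here is a minimal sketch. `to_utf8` and `to_utf16` are hypothetical helpers written for this example (C++11 or later), and they assume the input is a valid code point. The same code point produces different code units in each encoding; for comparison, Windows-1252 stores the euro sign U+20AC as the single byte 0x80.

```cpp
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8 bytes.
// Assumes cp is a valid scalar value (at most 0x10FFFF, not a surrogate).
std::string to_utf8(char32_t cp)
{
    std::string out;
    if (cp < 0x80) {                      // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {              // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {            // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                              // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: encodes one Unicode code point as UTF-16 code units.
// Code points above 0xFFFF need a surrogate pair, which UCS-2 cannot express.
std::u16string to_utf16(char32_t cp)
{
    std::u16string out;
    if (cp < 0x10000) {
        out += static_cast<char16_t>(cp);
    } else {
        const char32_t v = cp - 0x10000;
        out += static_cast<char16_t>(0xD800 | (v >> 10));    // high surrogate
        out += static_cast<char16_t>(0xDC00 | (v & 0x3FF));  // low surrogate
    }
    return out;
}
```

For U+20AC, `to_utf8` yields the three bytes E2 82 AC and `to_utf16` yields the single unit 0x20AC. For U+1F600, `to_utf16` yields the surrogate pair D83D DE00.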
Hi CornedBee,
I am confused about what ANSI means. Previously I thought it meant the code page of the current locale, which differs from locale to locale. For example, ANSI on a Western system and ANSI on a Japanese system are two different code pages, i.e. ANSI means one specific code page on a platform with a specific locale.
But your words, "There are various other Windows-* codepages, all called ANSI", seem to say that ANSI means all of the code pages?
Could you help clarify, please?
regards,
George
ANSI can refer to the code pages or the mode where Windows uses the code pages. Does it matter that much?
All the buzzt!
CornedBee
Thanks CornedBee,
1.
I have done some self-study.
http://en.wikipedia.org/wiki/Code_page
It looks like "ANSI code page" refers to a set of code pages, not one specific code page.
2.
Are ANSI code pages all multi-byte encodings, as opposed to wide character encodings?
regards,
George
2) Single-byte or multi-byte, but not wide.
All the buzzt!
CornedBee
I think CornedBee already answered that one. But I'll have a go at doing it differently:
Neither ANSI nor UNICODE is a precise term. ANSI actually means "one of a number of different variants of 8-bit character sets that are based on ASCII but extended to 8 bits".
UNICODE refers to a standard that supports several different formats, including 8-bit, 16-bit and 32-bit versions. So again, it's not a precise definition of what is being used.
--
Mats
Thanks Mats,
Can I understand it this way?
1. UNICODE is a character-to-value mapping: each specific character has exactly one specific value in the UNICODE table;
2. A code page is how a character/UNICODE value is represented and encoded in memory/storage/...; different code pages will (or may) represent the same character with different encoded values.
Are both correct?
regards,
George