Thanks.
>> Both ifs evaluate as true.
But that code is not relevant to the question. The question is whether the compare will work as intended in all cases. Can you say for sure that there isn't any other wchar_t value that, when compared against the char '0', will evaluate to true? That's the question.
I don't think so.
The compiler treats the wchar_t as a word so it will compare a word against '0'.
If it compared only the high or low byte of the value, then we could get false positives, but from looking at the assembly, it doesn't appear to be so.
But I can do a test.
UPDATE:
Code:
wchar_t w = 0;
for (int i = 0; i <= 0xFFFF; i++, w++) {
    if (w == '0')
        cout << "YES! w == '0'!\n";
}
Generates one output saying the if is true. So it's safe.
Assuming that the wide char encoding and the non-wide char encoding are equivalent enough to directly compare character values is probably unsafe.
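If you do want to rely on the two encodings agreeing, you can at least make the compiler check it. This is only a sketch, assuming a C++11 compiler for static_assert; digitsAgree is a made-up helper name, and passing the check says nothing about characters outside the ten digits.

```cpp
// Refuses to compile on an implementation where the narrow and wide
// encodings of the digit zero differ.
static_assert('0' == L'0', "narrow and wide '0' have different values");

// Hypothetical helper: returns true when every decimal digit has the same
// value in the narrow and the wide execution character set.
bool digitsAgree() {
    const char    narrow[] = "0123456789";
    const wchar_t wide[]   = L"0123456789";
    for (int i = 0; i < 10; ++i) {
        if (narrow[i] != wide[i])
            return false;
    }
    return true;
}
```

The static_assert turns a silent encoding mismatch into a build error, which is about as far as a portable check can go without consulting the implementation's documentation.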
>> The question is. Has the multibyte char L'0' the same value as the char '0' ?
That is not my question. My question is whether there are any other wchar_t values that, when compared to '0', will return true. Will the wchar_t be converted to char, lose some of its information and become '0'?
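As far as the language rules go, both operands of == undergo integral promotion before the compare, so the wchar_t is widened (or left alone), never cut down to a char. Here is a minimal sketch of that, assuming a C++11 compiler and a wchar_t of at least 16 bits; countMatchesOfNarrowZero is a made-up name for illustration.

```cpp
// Unary plus applies the same integral promotion that == applies to its
// operands, so this checks at compile time that the promoted type is at
// least as wide as wchar_t itself.
static_assert(sizeof(+wchar_t()) >= sizeof(wchar_t),
              "promotion never narrows wchar_t");

// Hypothetical helper: counts how many values in [0, 0xFFFF], stored in a
// wchar_t, compare equal to the narrow literal '0'.
int countMatchesOfNarrowZero() {
    int matches = 0;
    for (long v = 0; v <= 0xFFFF; ++v) {
        wchar_t w = static_cast<wchar_t>(v);
        if (w == '0')   // both sides are promoted; neither is truncated
            ++matches;
    }
    return matches;
}
```

If the wchar_t were being truncated to a char, every value whose low byte equals that of '0' would match and the count would be far above 1. With promotion, only the single value equal to '0' matches.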
Just don't try comparing chars to wchar_ts. Create a traits class with explicit specialisations for char and wchar_t.
Then do:
Code:
if (w == specialCharTraits<T>::zero())
My homepage
Advice: Take only as directed - If symptoms persist, please see your debugger
Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"
So for a new test I tried,
Code:
wchar_t w = 0;
for (int i = 0; i <= 0xFFFF; i++, w++) {
    if (w == '0' && w == L'0')
        cout << "YES! w == '0' && w == L'0'!\n";
    else if (w == '0')
        cout << "YES! w == '0'!\n";
    else if (w == L'0')
        cout << "AND YES! w == L'0'!\n";
}
unsigned char c = 0;
for (int i = 0; i < 0xFF; i++, c++) {
    // Note: this reads w, not c; w is not L'0' after the first loop,
    // so this branch is never taken.
    if (c == '0' && w == L'0')
        cout << "YES! w == '0' && w == L'0'!\n";
    else if (c == '0')
        cout << "YES! w == '0'!\n";
    else if (c == L'0')
        cout << "AND YES! w == L'0'!\n";
}
And it outputs
YES! w == '0' && w == L'0'!
YES! w == '0'!
And you don't need to worry about the wchar_t getting implicitly cast to a char. It's not, according to the assembly:
Code:
00401DB0 cmp di,30h
00401DB4 jne wmain+8Ah (401DCAh)
00401DB6 mov ecx,dword ptr [__imp_std::cout (403050h)]
00401DBC push 4040C0h
00401DC1 push ecx
00401DC2 call std::operator<<<std::char_traits<char> > (401000h)
00401DC7 add esp,8
00401DCA inc edi
00401DCB sub ebx,1
00401DCE jne wmain+70h (401DB0h)
So basically, it stores the data in edi and uses the low-order word di to compare.
The char loop is even easier:
Code:
00401DD7 cmp bl,30h
00401DDA jne wmain+0C3h (401E03h)
00401DDC cmp di,30h
00401DE0 jne wmain+0B0h (401DF0h)
00401DE2 mov edx,dword ptr [__imp_std::cout (403050h)]
00401DE8 push 404108h
00401DED push edx
00401DEE jmp wmain+0BBh (401DFBh)
    else if (c == '0') cout << "YES! w == '0'!\n";
00401DF0 mov eax,dword ptr [__imp_std::cout (403050h)]
00401DF5 push 404128h
00401DFA push eax
00401DFB call std::operator<<<std::char_traits<char> > (401000h)
00401E00 add esp,8
00401E03 inc bl
00401E05 sub ebp,1
00401E08 jne wmain+97h (401DD7h)
>> And you don't need to worry about the wchar_t getting implicitly cast to a char. It's not, according to the assembly
How does the assembly output of one compiler on a specific platform answer the general question? The whole point is whether you can guarantee it will be safe and there won't be any false positives, so the fact that it seems to work in one instance doesn't help much.
>> So basically, it stores the data in edi and uses the low-order word di to compare.
Isn't that the same as converting to char? What if '0' has different data in the high-order word than the wchar_t character you are comparing? They would compare as equal when they are not.
You're free to do your own tests, of course.
My tests show it's safe on Visual Studio 2008. More than that I cannot guarantee unless it's mentioned in the standard.
>> So basically, it stores the data in edi and uses the low-order word di to compare.
>> Isn't that the same as converting to char? What if '0' has different data in the high-order word than the wchar_t character you are comparing? They would compare as equal when they are not.
wchar_t is 2 bytes (so a word) on Visual Studio, so the assembly is perfectly fine.
I believe it uses the full edi for the increment for performance reasons, as incrementing an int would be faster than incrementing only a word, though I'm not sure. I'm no expert.
There's no need to simply rely on observed behaviour. Just write it properly to begin with. Here is how to do it, which I meant to write this morning when I was in a mad rush:
Code:
template<typename T> struct specialCharTraits {};
template<> struct specialCharTraits<char> { static char zero() { return '0'; } };
template<> struct specialCharTraits<wchar_t> { static wchar_t zero() { return L'0'; } };
Then do this where you use it:
Code:
if (w == specialCharTraits<T>::zero())
    cout << "YES! w == '0'!\n";
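To show how this fits into templated code, here is a small self-contained sketch. It repeats the traits from above with one variation: the primary template is only declared, so using an unsupported character type fails at compile time instead of silently compiling. countZeros is a made-up helper for illustration, not part of any library.

```cpp
#include <cstddef>
#include <string>

// Primary template is declared but not defined: only char and wchar_t
// are supported, anything else is a compile error.
template<typename T> struct specialCharTraits;

template<> struct specialCharTraits<char> {
    static char zero() { return '0'; }
};
template<> struct specialCharTraits<wchar_t> {
    static wchar_t zero() { return L'0'; }
};

// Hypothetical helper: counts the '0' digits in a narrow or wide string.
// Each character is compared against a literal of its own type, so char
// and wchar_t values are never mixed in one comparison.
template<typename T>
std::size_t countZeros(const std::basic_string<T>& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == specialCharTraits<T>::zero())
            ++n;
    }
    return n;
}
```

Calling countZeros(std::string("10203")) or countZeros(std::wstring(L"0x00")) picks the matching specialisation automatically, so the calling code never has to choose between '0' and L'0'.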