(from 4.1.5 Common definitions <stddef.h>) Quote:
which is an integral type whose range of values can represent distinct codes for all members of the largest extended character set specified among the supported locales
The standard clearly says that a single wchar_t can represent all character codes in a character set. It doesn't say that multiple wchar_t's can be used to represent a single character code. UTF-8 is a variable-length encoding that requires between 1 and 4 octets to represent a single character code, so if wchar_t were in UTF-8 it would likely be defined as an 8-bit type (any wider would be wasteful), and multiple wchar_t's would be required to contain a single character in some cases. That is a direct violation of the standard.
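To make the "between 1 and 4 octets" point concrete, here is a minimal sketch that computes how many octets UTF-8 needs for a given code point. The function name utf8_length is my own for illustration, not anything from the standard or a library, and it assumes the modern definition of UTF-8, which stops at 0x10FFFF and excludes the surrogate range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of octets UTF-8 needs to encode one
 * code point. Returns 0 if cp is not encodable (a surrogate code
 * or a value above 0x10FFFF). */
size_t utf8_length(uint32_t cp)
{
    if (cp <= 0x7F)
        return 1;                      /* ASCII range: one octet */
    if (cp <= 0x7FF)
        return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                      /* surrogates are not characters */
    if (cp <= 0xFFFF)
        return 3;
    if (cp <= 0x10FFFF)
        return 4;
    return 0;                          /* beyond the Unicode code space */
}
```

Only codes up to 0x7F fit in a single octet, so an 8-bit wchar_t holding UTF-8 could represent no more than ASCII in one element.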
UTF-16 is also a variable-length encoding: codes above 0xFFFF take two 16-bit units (a surrogate pair). Systems that store UTF-16 values in their wchar_t are also violating the standard in spirit, if not in letter, if they support surrogate pairs to contain a single character code. If a system uses UCS-2 or UTF-16 for wchar_t but does not support surrogate pairs (so that only codes 0x0000-0xD7FF and 0xE000-0xFFFF are defined), then it can support only a subset of Unicode, but that is fine as far as the C standard is concerned. Likewise, a system that uses UTF-8 for wchar_t but supports only single-byte encoding sequences has a limited character set (essentially ASCII), but that too is fine by the standard.
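The surrogate pair problem can be sketched as follows. The function utf16_encode is a hypothetical name of mine, not a standard or library call; it shows that any code above 0xFFFF comes out as two 16-bit units, which is exactly the case a 16-bit wchar_t cannot hold in one element.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-16.
 * Writes 1 or 2 units to out and returns the count, or 0 if cp
 * is a surrogate code or lies above 0x10FFFF. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                      /* reserved for surrogates */
    if (cp <= 0xFFFF) {
        out[0] = (uint16_t)cp;         /* fits in a single unit */
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;                      /* beyond the Unicode code space */

    cp -= 0x10000;                     /* leaves a 20-bit value */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For instance, code 0x1F600 encodes as the two units 0xD83D and 0xDE00, while 0x20AC stays a single unit.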
The highest Unicode code is 0x10FFFF, which takes 21 bits to represent. Any conforming implementation that supports the full Unicode character set must therefore have a wchar_t type at least 21 bits wide. Full stop.
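The 21-bit figure is easy to check, and so is whether a given implementation's wchar_t has the range for it. This is only a sketch: bits_needed and wchar_fits_unicode are names I made up, and the check relies on WCHAR_MAX from <wchar.h>, whose value is implementation-defined.

```c
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: minimum number of bits needed to store value. */
unsigned bits_needed(uint32_t value)
{
    unsigned bits = 0;
    while (value != 0) {
        bits++;
        value >>= 1;
    }
    return bits;
}

/* Hypothetical helper: returns 1 if a single wchar_t can hold every
 * Unicode code (0 through 0x10FFFF) on this implementation, else 0. */
int wchar_fits_unicode(void)
{
    return WCHAR_MAX >= 0x10FFFF;
}
```

bits_needed(0x10FFFF) comes out as 21, and bits_needed(0xFFFF) as 16. The result of wchar_fits_unicode() depends on the platform: it is 0 wherever wchar_t is a 16-bit type.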