You are confused. UTF-16 is a variable-width encoding, just like UTF-8: a character can take more than one code unit when necessary. A single 16-bit value can only represent code points up to U+FFFF, but Unicode defines code points all the way up to U+10FFFF (over a million possible values, with well over 100,000 characters actually assigned), so UTF-16 has to be variable width. The mechanism is different from UTF-8, though. UTF-8 works in 8-bit units, and the lead byte tells you how many bytes the character takes (1 to 4). UTF-16 works in 16-bit units: a character in the Basic Multilingual Plane takes one unit (2 bytes), and anything above U+FFFF takes two units (4 bytes), called a surrogate pair. The first unit of a pair always falls in a reserved range (0xD800–0xDBFF), so software processing the string can tell from that unit alone whether it needs to read one more unit to get the full code point.
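If it helps, here's a rough sketch of the surrogate-pair arithmetic in Python. This is just the math from the UTF-16 spec, not any particular library's implementation, and the function name is made up:

```python
# Sketch: encode a single code point into UTF-16 code units.
def to_utf16_code_units(code_point: int) -> list[int]:
    if code_point <= 0xFFFF:
        # BMP character: fits in one 16-bit unit.
        return [code_point]
    # Supplementary character: subtract 0x10000, leaving a 20-bit value,
    # then split it across a high and a low surrogate.
    cp = code_point - 0x10000
    high = 0xD800 + (cp >> 10)    # top 10 bits
    low = 0xDC00 + (cp & 0x3FF)   # bottom 10 bits
    return [high, low]

print([hex(u) for u in to_utf16_code_units(ord("€"))])   # ['0x20ac'] -> one unit
print([hex(u) for u in to_utf16_code_units(ord("𝄞"))])   # ['0xd834', '0xdd1e'] -> surrogate pair
```

You can cross-check the second result with Python's built-in codec: `"𝄞".encode("utf-16-be")` yields the bytes D8 34 DD 1E.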
Does that make any sense?