    Why does a one-byte character print as two characters on screen?


    I would like to know why printing a single byte to the screen shows two characters.
    For example, in French, printing the one-byte "oe" ligature (the two letters joined together, ASCII code 0x9c) gives me two characters on screen.
    Is there a function other than strlen() that counts exactly how many characters will be printed?

    Thank you

    Can you show us your code?
    The oe ligature doesn't have an ASCII code. ASCII is a 7-bit encoding that only encodes 128 characters, quite a few of them special control characters. It only contains the basic English alphabet, the digits 0 through 9, and a few punctuation and whitespace characters. It does not contain any accented letters or other non-English characters at all.

    The code you're referring to, 0x9c, is the code this character has in the Windows-1252 codepage (the standard Windows "ANSI" codepage for Western European languages). It is not the code from the old IBM PC codepage (now referred to as OEM in Windows), where 0x9c is the pound sign. It is not ISO-8859-1 either: that encoding is very similar to Windows-1252 and very common on Linux systems, but it has no oe ligature at all, and 0x9c there is a control code.
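
    If you are not sure which bytes your string really holds, dumping them in hex settles the question. This is only a rough sketch: hex_bytes is a name I made up, not a library function, and it assumes the caller provides a large enough buffer.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write each byte of s into out as two hex digits
   followed by a space. The caller must make sure out can hold at least
   3 * strlen(s) + 1 chars. */
void hex_bytes(const char *s, char *out)
{
    size_t pos = 0;
    for (; *s != '\0'; ++s) {
        /* Cast to unsigned char so bytes above 0x7F are not sign-extended. */
        pos += (size_t)sprintf(out + pos, "%02X ", (unsigned char)*s);
    }
    out[pos] = '\0';
}
```

    A single 9C for the ligature means the text is Windows-1252. The pair C5 93 means it is already UTF-8.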

    What is probably happening, though, is that your source file is actually in the UTF-8 encoding, where the oe character needs 2 bytes (0xC5 0x93). The runtime still interprets the output as something else (probably ISO-8859-1, as I've just noticed this is the Linux forum) and assumes that each byte is a single character, so it writes two characters.
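
    As for counting: strlen() counts bytes, not characters. If the data really is UTF-8, you can count code points by skipping continuation bytes. A minimal sketch, assuming the input is valid UTF-8 (utf8_length is my own name, not a standard function):

```c
#include <stddef.h>

/* Hypothetical helper: count the code points in a UTF-8 string.
   Continuation bytes have the bit pattern 10xxxxxx; every other byte
   starts a new code point. Assumes s is valid UTF-8. */
size_t utf8_length(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

    For the string "c\xC5\x93ur" (the French word for heart, with the ligature written as its two UTF-8 bytes), strlen() returns 5 while utf8_length() returns 4. Note that this counts code points, which is not always the number of glyphs on screen (combining accents, for instance, are separate code points). The standard library route would be setlocale() plus mbstowcs(), but that only works if the active locale actually uses UTF-8.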

    There's no really good solution. Character encodings are one area of C/C++ that I find truly lacking.
    I cannot post my code here. In fact, my strings are extracted
    from a text file on disk.
    The text is raw (only \r and \n as line breaks) and was typed on MS Windows
    in French. The typeface is Courier New.

    I tried to work with UTF-8 (for output only), but I have problems with accents.
    I will post another article about French accents in raw text:
    "How to convert raw text with accent to UTF8"

