utf8 en-de coding


  1. #1
    Registered User
    Join Date
    Mar 2008
    Location
    France
    Posts
    9

    utf8 en-de coding

    Here's what I want to do:
    1) decode a UTF-8 stream to Latin-1
    2) remove all the accents
    3) look for a town in a file
    4) once the town is found (if it is found), its name is Latin-1 encoded, so encode it into UTF-8.

    Any idea where I should look, and how I might implement (1) and (4)?
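
For steps (1) and (4), a minimal sketch in C could look like the following. It relies on the fact that Latin-1 byte values are the first 256 Unicode code points, so every Latin-1 byte above 0x7F becomes a two-byte UTF-8 sequence starting with 0xC2 or 0xC3. The function names `latin1_to_utf8` and `utf8_to_latin1` are made up for this example, and the caller is assumed to supply large enough buffers.

```c
#include <stddef.h>

/* Hypothetical helper: encode a NUL-terminated Latin-1 string as UTF-8.
 * `out` must have room for 2 * strlen(in) + 1 bytes.
 * Returns the number of bytes written, not counting the terminator. */
size_t latin1_to_utf8(const unsigned char *in, unsigned char *out)
{
    size_t n = 0;

    for (; *in; in++) {
        if (*in < 0x80) {
            out[n++] = *in;                                  /* ASCII: unchanged */
        } else {
            out[n++] = (unsigned char)(0xC0 | (*in >> 6));   /* lead byte: 0xC2 or 0xC3 */
            out[n++] = (unsigned char)(0x80 | (*in & 0x3F)); /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}

/* Hypothetical helper: decode NUL-terminated UTF-8 into Latin-1.
 * `out` must have room for strlen(in) + 1 bytes.
 * Returns the number of bytes written, or (size_t)-1 if the input is
 * malformed or contains a code point above U+00FF. */
size_t utf8_to_latin1(const unsigned char *in, unsigned char *out)
{
    size_t n = 0;

    while (*in) {
        if (*in < 0x80) {
            out[n++] = *in++;
        } else if ((*in == 0xC2 || *in == 0xC3) && (in[1] & 0xC0) == 0x80) {
            out[n++] = (unsigned char)(((*in & 0x03) << 6) | (in[1] & 0x3F));
            in += 2;
        } else {
            return (size_t)-1;   /* not representable in Latin-1 */
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, the Latin-1 byte 0xE9 (é) encodes to the two bytes 0xC3 0xA9, and decoding those two bytes gives 0xE9 back. On Linux, the POSIX `iconv` interface is an alternative that handles many more encodings.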

  2. #2
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,669
    What OS?

    Why do you need to decode the entire stream to Latin-1?
    Is the town text that you are looking for in Latin-1?
    If so, why not just convert that text to Unicode (UTF-8) and search the stream as it is?

    gg

  3. #3
    Registered User
    Join Date
    Apr 2008
    Posts
    395
    The specification for decoding UTF-8 sequences is RFC 3629:
    http://www.faqs.org/rfcs/rfc3629.html
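
As a rough illustration of what that specification asks for, here is a sketch of a decoder for a single UTF-8 sequence. The function name `utf8_decode` and its signature are assumptions made for this example. It checks the points RFC 3629 calls out: continuation bytes must have the form 10xxxxxx, overlong encodings are rejected, and so are surrogates (U+D800..U+DFFF) and values above U+10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (at most len bytes).
 * On success stores the code point in *cp and returns the number of bytes
 * consumed (1 to 4). Returns 0 on malformed or truncated input. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t need, i;
    uint32_t c, min;

    if (len == 0)
        return 0;

    if (s[0] < 0x80) {                      /* 0xxxxxxx: ASCII */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 110xxxxx: 2-byte sequence */
        need = 1; c = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 1110xxxx: 3-byte sequence */
        need = 2; c = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 11110xxx: 4-byte sequence */
        need = 3; c = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                           /* stray continuation or invalid lead byte */
    }

    if (len < need + 1)
        return 0;                           /* truncated sequence */

    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                       /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }

    if (c < min)
        return 0;                           /* overlong encoding */
    if (c >= 0xD800 && c <= 0xDFFF)
        return 0;                           /* surrogate code points are not allowed */
    if (c > 0x10FFFF)
        return 0;                           /* beyond the Unicode range */

    *cp = c;
    return need + 1;
}
```

Calling it in a loop and advancing by the return value walks a buffer one code point at a time. A return value of 0 means the input is not valid UTF-8 at that position.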

  4. #4
    Registered User
    Join Date
    Mar 2008
    Location
    France
    Posts
    9
    Originally posted by Codeplug:
    >> What OS?
    >>
    >> Why do you need to decode the entire stream to Latin-1?
    >> Is the town text that you are looking for in Latin-1?
    >> If so, why not just convert that text to Unicode (UTF-8) and search the stream as it is?
    >>
    >> gg

    You're perfectly right, so here's my new sequence:
    1) read the UTF-8 encoded file that lists all the towns
    2) take the UTF-8 encoded input stream and remove all the accents
    3) check whether the resulting UTF-8 string is a town or not

    Thanks for your advice!

    BTW: the system is Linux (the more I learn Linux, the more I find Windows sucks)

  5. #5
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,669
    >> ...and remove all the accents
    This thread may help: Getting ASCII equivalent of UNICODE string
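
One possible way to remove accents directly on the UTF-8 bytes is sketched below. This is not necessarily the approach from the linked thread. It only handles the precomposed letters in U+00C0..U+00FF, which in UTF-8 are all encoded as 0xC3 followed by one continuation byte. The function name `utf8_strip_accents` and the folding table are assumptions for this example. Letters with no single unaccented equivalent (Æ, Ø, ß and so on) are deliberately left unchanged.

```c
#include <stddef.h>

/* Hypothetical helper: strip accents in place from a NUL-terminated UTF-8
 * string. Only precomposed letters in U+00C0..U+00FF are folded; all other
 * bytes are copied unchanged. Returns the new length in bytes. */
size_t utf8_strip_accents(unsigned char *s)
{
    /* One entry per code point; '.' means "no folding, keep as is". */
    static const char fold[] =
        "AAAAAA.CEEEEIIII.NOOOOO..UUUUY.."   /* U+00C0..U+00DF */
        "aaaaaa.ceeeeiiii.nooooo..uuuuy.y";  /* U+00E0..U+00FF */
    size_t r = 0, w = 0;

    while (s[r]) {
        /* 0xC3 followed by 0x80..0xBF encodes U+00C0..U+00FF. */
        if (s[r] == 0xC3 && (s[r + 1] & 0xC0) == 0x80 &&
            fold[s[r + 1] - 0x80] != '.') {
            s[w++] = (unsigned char)fold[s[r + 1] - 0x80];
            r += 2;                          /* two bytes in, one byte out */
        } else {
            s[w++] = s[r++];                 /* copy everything else as is */
        }
    }
    s[w] = '\0';
    return w;
}
```

With this sketch, the UTF-8 bytes for "Orléans" become "Orleans". Applying the same folding to both the town list and the search string keeps the comparison consistent. Text that uses combining accents (a base letter followed by U+0301, for instance) would need Unicode normalization first, which this sketch does not attempt.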

    gg
