UTF-8 encoding/decoding
Here's what I want to do:
1) decode a UTF-8 stream to Latin-1
2) remove all the accents
3) look for a town in a file
4) if the town is found, its name is Latin-1 encoded, so encode it back into UTF-8.
Any idea where I should look, and how I might implement (1) and (4)?
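For steps (1) and (4), here is a minimal sketch in C. It relies on the fact that Latin-1 bytes map one-to-one onto Unicode code points U+0000 to U+00FF, so each Latin-1 byte becomes either one or two UTF-8 bytes. The function names `latin1_to_utf8` and `utf8_to_latin1` are made up for illustration, and the decoder simply rejects anything that does not fit in Latin-1.

```c
#include <stddef.h>

/* Encode a NUL-terminated Latin-1 string as UTF-8.
   out must have room for 2 * strlen(in) + 1 bytes.
   Returns the number of bytes written, not counting the NUL. */
size_t latin1_to_utf8(const unsigned char *in, unsigned char *out)
{
    size_t n = 0;
    for (; *in; ++in) {
        if (*in < 0x80) {
            out[n++] = *in;                    /* ASCII: copied unchanged */
        } else {
            out[n++] = 0xC0 | (*in >> 6);      /* lead byte: 0xC2 or 0xC3 */
            out[n++] = 0x80 | (*in & 0x3F);    /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}

/* Decode a NUL-terminated UTF-8 string to Latin-1.
   out must have room for strlen(in) + 1 bytes.
   Returns 0 on success, -1 if the input is malformed or contains
   a character above U+00FF (which Latin-1 cannot represent). */
int utf8_to_latin1(const unsigned char *in, unsigned char *out)
{
    while (*in) {
        if (*in < 0x80) {
            *out++ = *in++;
        } else if ((*in == 0xC2 || *in == 0xC3) && (in[1] & 0xC0) == 0x80) {
            *out++ = (unsigned char)(((in[0] & 0x03) << 6) | (in[1] & 0x3F));
            in += 2;
        } else {
            return -1;
        }
    }
    *out = '\0';
    return 0;
}
```

For example, the Latin-1 byte 0xE9 (é) comes out as the two UTF-8 bytes 0xC3 0xA9, and decoding those two bytes gives 0xE9 back.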
Why do you need to decode the entire stream to Latin-1?
Is the town text that you are looking for in Latin-1?
If so, why not just convert that text to Unicode (UTF-8) and search the stream as it is?
The specification to decode utf8 sequences:
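As a rough illustration of what decoding a UTF-8 sequence involves, here is a sketch that turns one sequence into a code point. The function name `utf8_decode` is hypothetical. It assumes the usual rules: a lead byte announces a length of 1 to 4 bytes, every following byte has the form 10xxxxxx, and overlong forms, surrogates and values above U+10FFFF are rejected.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence starting at s, reading at most len bytes.
   On success stores the code point in *cp and returns the number of
   bytes consumed (1 to 4). Returns 0 if the sequence is invalid. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    /* Smallest code point that legitimately needs n bytes. */
    static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t n, i;
    uint32_t c;

    if (len == 0)
        return 0;
    if (s[0] < 0x80) {                      /* 0xxxxxxx: plain ASCII */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 110xxxxx: 2 bytes */
        n = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 1110xxxx: 3 bytes */
        n = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 11110xxx: 4 bytes */
        n = 4; c = s[0] & 0x07;
    } else {
        return 0;                           /* stray continuation or bad lead byte */
    }
    if (len < n)
        return 0;                           /* truncated sequence */
    for (i = 1; i < n; ++i) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                       /* not a 10xxxxxx continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[n] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;                           /* overlong, out of range, or surrogate */
    *cp = c;
    return n;
}
```

Calling it in a loop and advancing by the return value walks a buffer one character at a time.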
You're perfectly right, so here's my new sequence:
Originally Posted by Codeplug
1) read the UTF-8-encoded file where all the towns are
2) get the UTF-8-encoded stream and remove all the accents
3) check whether the UTF-8 string is a town or not
Thanks for your advice!
BTW: the system is Linux (the more I learn Linux, the more I find Windows sucks)
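The lookup in step 3 can be done with plain byte comparison, since two UTF-8 strings are equal exactly when their bytes are equal (as long as both sides were normalized the same way, e.g. accents stripped on both). A minimal sketch, assuming the town file has one town per line; the function name `town_in_stream` and the 256-byte line limit are assumptions for illustration only.

```c
#include <stdio.h>
#include <string.h>

/* Scan an open text stream, one town per line, for an exact match.
   Returns 1 if town is found, 0 otherwise.
   Assumption: no line is longer than 255 bytes; longer lines would be
   read in pieces and never match. */
int town_in_stream(FILE *fp, const char *town)
{
    char line[256];

    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending */
        if (strcmp(line, town) == 0)
            return 1;
    }
    return 0;
}
```

Typical use would be to `fopen` the town file, call `town_in_stream(fp, name)`, then `fclose` it. For many lookups it is probably better to load the file once into a sorted array and use `bsearch`.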
>> ...and remove all the accents
This thread may help: http://cboard.cprogramming.com/showthread.php?t=97917
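One possible way to strip accents directly on the UTF-8 bytes, without going through Latin-1, is a small lookup table. This is only a sketch under a strong assumption: it handles just the precomposed letters U+00C0 to U+00FF (which all start with the lead byte 0xC3 in UTF-8) and leaves everything else untouched, including combining marks and letters such as œ that live outside that range. The name `strip_accents_utf8` and the table are made up for this example.

```c
#include <stddef.h>

/* Base letter for each code point U+00C0..U+00FF, indexed by (cp - 0xC0).
   '*' marks characters with no single-letter base (Æ, Ð, ×, Þ, ß, æ, ð, ÷, þ);
   those are copied through unchanged. */
static const char base_letter[65] =
    "AAAAAA*CEEEEIIII"
    "*NOOOOO*OUUUUY**"
    "aaaaaa*ceeeeiiii"
    "*nooooo*ouuuuy*y";

/* Copy the UTF-8 string in to out, replacing accented letters in
   U+00C0..U+00FF by their unaccented ASCII base letter.
   out needs strlen(in) + 1 bytes; the output is never longer than the input. */
void strip_accents_utf8(const char *in, char *out)
{
    const unsigned char *s = (const unsigned char *)in;

    while (*s) {
        /* 0xC3 followed by a continuation byte encodes U+00C0..U+00FF;
           the low 6 bits of the second byte are the table index. */
        if (s[0] == 0xC3 && (s[1] & 0xC0) == 0x80
                && base_letter[s[1] & 0x3F] != '*') {
            *out++ = base_letter[s[1] & 0x3F];
            s += 2;
        } else {
            *out++ = (char)*s++;    /* anything else is copied byte for byte */
        }
    }
    *out = '\0';
}
```

With this, the UTF-8 bytes for "Orléans" become "Orleans", so the town can be compared against an accent-free list.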