Thread: Reading .Txt file problem

  1. #1
    Registered User
    Join Date
    Dec 2010
    Posts
    6

    Reading .Txt file problem

    Hey, I'm new to C but I'm learning fast. I understand file handling and can read from and write to files.

    I'm working with .ama files. Don't worry, these files are just like .txt files but have a different extension.

    The encoding of these files is UTF-8 (when saving in Notepad you can choose between UTF-8, Unicode and ANSI).

    The problem I'm having is that C is not reading in some ASCII characters correctly. For example, "TM" is not read in correctly, I mean the trademark symbol. If I save the file with ANSI encoding it works fine, but I was wondering if there is a way around this inside C?

  2. #2
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,738
    Converting UTF8 to ASCII would be a good idea!
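    If losing the symbol is acceptable, one rough way to do this inside C is to copy the text and replace every non-ASCII character with '?'. This is only a minimal sketch, assuming the input is valid UTF-8; the function name utf8_to_ascii_lossy is made up for illustration and is not a standard library call:

```c
#include <assert.h>
#include <stddef.h>

/* Copies src to dst, replacing each non-ASCII UTF-8 character with '?'.
   Assumes src is valid UTF-8 and dst holds at least strlen(src) + 1 bytes.
   Returns the number of bytes written, excluding the terminator. */
size_t utf8_to_ascii_lossy(const char *src, char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        unsigned char c = (unsigned char)*src++;

        if (c < 0x80) {
            dst[n++] = (char)c;   /* plain ASCII passes through unchanged */
        } else if ((c & 0xC0) == 0xC0) {
            dst[n++] = '?';       /* lead byte (11xxxxxx): one '?' per character */
        }
        /* continuation bytes (10xxxxxx) are skipped */
    }
    dst[n] = '\0';
    return n;
}
```

    Calling it on each line after reading would turn the three bytes of the trademark sign into a single '?', so the information is lost rather than converted.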
    Devoted my life to programming...

  3. #3
    Registered User
    Join Date
    Dec 2010
    Posts
    6
    Yes, but how would I do that inside C? I know I could do this manually, but I would rather have something that does it inside my program.

  4. #4
    Registered User
    Join Date
    Sep 2010
    Posts
    69
    "The problem I'm having is that C is not reading in some ASCII characters correctly,"
    I don't have a chart in front of me, but what is the code for "TM"?
    I don't think "TM" is an ASCII code.
    If it's a multi-byte code, then (I think) you will need to store the entire file as multi-byte characters.
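
    For reference, the trademark sign is code point U+2122, which is outside the ASCII range (0 to 127); in UTF-8 it is stored as the three bytes E2 84 A2. A minimal sketch of how one might check for this in C follows. The function names are hypothetical, and the length check looks only at the bit pattern of the first byte without validating the rest of the sequence:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the byte length of a UTF-8 sequence given its first byte,
   or 0 if the byte cannot start a sequence (for example a continuation byte).
   Sketch only: it does not reject overlong or out-of-range encodings. */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;            /* 0xxxxxxx: plain ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx, e.g. 0xE2 for U+2122 */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;                             /* 10xxxxxx or otherwise invalid */
}

/* Returns 1 if every byte of the NUL-terminated string is 7-bit ASCII. */
int is_all_ascii(const char *s)
{
    assert(s != NULL);

    for (; *s != '\0'; s++) {
        if ((unsigned char)*s >= 0x80) return 0;
    }
    return 1;
}
```

    A line for which is_all_ascii returns 0 contains at least one multi-byte character and cannot be treated as one char per character.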

  5. #5
    Registered User
    Join Date
    Dec 2010
    Posts
    6
    OK, how do I save it that way? I only know the standard method.

  6. #6
    Registered User
    Join Date
    Sep 2010
    Posts
    69
    Quote Originally Posted by Dan05011991 View Post
    OK, how do I save it that way? I only know the standard method.
    The character you refer to appears to be an ANSI character:
    ANSI character set and equivalent Unicode and HTML characters

    You haven't provided any code showing how you are currently attempting to save the text file, but, you might review this thread:
    Reading a unicode file

    Also, if that does not provide the answer you need, you might try using the forum's Search function: [ read unicode ], or [ write unicode ]

  7. #7
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Assuming "Notepad" = "on Windows", you may want to look at MultiByteToWideChar Function (Windows)

  8. #8
    Registered User
    Join Date
    Dec 2010
    Posts
    6
    Hey, thanks for the replies. I have read a lot of text and searched many times.
    I believe the file is in UTF-8 format, and it contains mostly ASCII characters with the odd ANSI character.

    I am reading the file in this way:

    Code:
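    /* fgets returns NULL at end of file or on error, so test its result
       directly instead of calling feof() after the read */
    while (fgets(Test[a], 1000, Read_STD_AMAFile) != NULL)
    {
        a++;
    }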
    It is being read into a char array. I have tried using wchar_t, but it does not give the correct result.

    Inside the file there will be a line like
    Code:
    <Title ID="454107E6" Name="2006 FIFA World Cup™" LastPlayed="129138768000000000">
    As you can see, after the game's name it has "™". If I read this using the current code, it gives "â„¢".

    I'm just not sure what I have done wrong.

  9. #9
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    wchar_t is for an encoding where all characters are the same size. However, as you see, UTF-8 is not that kind of encoding, so you can't read the file in using wchar_t. You can write your own UTF-8 parser, or you can use the one that is provided (which would turn your output into wchar_t).
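
    If you go the write-your-own route, decoding one character at a time could look roughly like the following. It is a minimal sketch, not a full validator: utf8_decode is a made-up name, the buffer is assumed to be NUL-terminated, and overlong or out-of-range encodings are not rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one UTF-8 sequence starting at s (NUL-terminated buffer).
   Stores the code point in *cp and returns the number of bytes consumed,
   or 0 if the sequence is malformed or truncated. */
size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
    size_t len, i;
    uint32_t value;

    assert(s != NULL && cp != NULL);

    if (s[0] < 0x80)                { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; value = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; value = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; value = s[0] & 0x07; }
    else return 0;  /* stray continuation byte or invalid lead byte */

    for (i = 1; i < len; i++) {
        /* each following byte must look like 10xxxxxx;
           this also stops at the NUL terminator */
        if ((s[i] & 0xC0) != 0x80) return 0;
        value = (value << 6) | (s[i] & 0x3F);
    }
    *cp = value;
    return len;
}
```

    Fed the three bytes E2 84 A2 from the file, this yields the single code point U+2122, which is the trademark sign the char array was showing as three separate characters.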

  10. #10
    Registered User
    Join Date
    Dec 2010
    Posts
    6
    Thanks for the advice. How do I use the parser provided? Sorry, I'm a bit of an amateur. I'm learning at uni at the moment, and I set myself this extra task to learn more.

  11. #11
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by Dan05011991 View Post
    Thanks for the advice. How do I use the parser provided? Sorry, I'm a bit of an amateur. I'm learning at uni at the moment, and I set myself this extra task to learn more.
    You click on the link and you read the page.

  12. #12
    Registered User
    Join Date
    Dec 2010
    Posts
    6
    Hey, sorry to be a pest, but which link? Or should I google some of it? I have been, but I can't seem to find anything.

  13. #13
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by Dan05011991 View Post
    Hey, sorry to be a pest, but which link? Or should I google some of it? I have been, but I can't seem to find anything.
    Quote Originally Posted by tabstop View Post
    Assuming "Notepad" = "on Windows", you may want to look at MultiByteToWideChar Function (Windows)
    Red things are links.
