Thread: TEXT FILE TO ASCII - newbie

  1. #1
    Registered User
    Join Date
    Oct 2004
    Posts
    8

    TEXT FILE TO ASCII - newbie

    I need a kick start on developing an algorithm for a program that reads and extracts certain information from a text file and converts the information to ASCII format. The greatest challenge is reading the file and extracting the relevant data. I don't know whether the best approach is to do a string search. Help please.

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    An awful lot of text files are also ASCII files, so what's the question?

    Perhaps posting some examples of input data and output results, and some description of the relationship would be useful
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    and the Hat of Clumsiness GanglyLamb's Avatar
    Join Date
    Oct 2002
    Location
    between photons and phonons
    Posts
    1,110
    I'm guessing you want to do something like this, since you probably want something like HTML entities as output.

    input.txt
    Code:
    Hey, I need help
    output.txt: see the attachment, since I was not able to put it here. It seems that entities inside code tags and quote tags are still displayed like normal characters.

    Note: I was not sure about the spaces, whether to use non-breaking spaces or just normal spaces. Anyhow, I hope you mean something like this.

    Greets,

    Ganglylamb.

    ::edit::

    If this is what you want to do, then the best thing is to read in one line at a time and go through the string, replacing all known characters with their entities. It is best to work with arrays, or else you will have a lot of if statements to write.
    When using arrays you will have two of them, one with all the characters and the other with all the entities, so that for instance
    characters[i] will have its counterpart entity in entities[i]...

    Something like that, I guess. I'm not sure, since I don't know what you're trying to accomplish.
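
    A minimal sketch of the two-array idea in C. The table holds only four illustrative entries, and the function name encode_entities is hypothetical; a real table would list whatever characters the output format needs.

```c
#include <string.h>

/* Parallel arrays: characters[i] is replaced by entities[i].
   The table contents are an illustrative assumption, not a full list. */
static const char characters[] = { '&', '<', '>', '"' };
static const char *const entities[] = { "&amp;", "&lt;", "&gt;", "&quot;" };
#define TABLE_SIZE (sizeof characters / sizeof characters[0])

/* Copy src to dst, replacing every character found in the table.
   Returns 0 on success, -1 if dst (dst_size bytes) is too small. */
int encode_entities(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;

    if (dst_size == 0)
        return -1;

    for (; *src != '\0'; src++) {
        const char *piece = NULL;   /* entity text, if *src is in the table */
        size_t piece_len = 1;       /* an unlisted character copies as itself */
        size_t i;

        for (i = 0; i < TABLE_SIZE; i++) {
            if (*src == characters[i]) {
                piece = entities[i];
                piece_len = strlen(piece);
                break;
            }
        }

        /* Keep one byte in reserve for the terminating '\0'. */
        if (used + piece_len + 1 > dst_size)
            return -1;

        if (piece != NULL)
            memcpy(dst + used, piece, piece_len);
        else
            dst[used] = *src;
        used += piece_len;
    }

    dst[used] = '\0';
    return 0;
}
```

    Calling it once per line read from the input file, and writing dst to the output file, gives the line-at-a-time loop described above.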
    Last edited by GanglyLamb; 10-18-2004 at 07:08 AM.

  4. #4
    Code:
    char myString[30];
    char myString2[50];
    
    fstream file;
    file.open("input.txt", ios::in);
    file.getline(myString, 30, '=');
    file.getline(myString2, 50, ';');
    will read the first line from a file that
    is formatted like this:

    Code:
    color=red;
    myString will hold color, which you can discard if
    you don't need it, and myString2 will hold red, their color
    choice.
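
    The snippet above uses C++ streams. A rough plain-C counterpart, assuming the line has already been read into a string (for example with fgets), could look like this. The function name parse_setting is hypothetical, and the buffer sizes simply mirror the 30 and 50 used above.

```c
#include <stdio.h>

/* Split a line of the form "key=value;" into its two parts.
   key must hold at least 30 bytes and value at least 50 bytes.
   Returns 1 on success, 0 if the line does not have that shape. */
int parse_setting(const char *line, char key[30], char value[50])
{
    /* %29[^=] reads up to 29 characters that are not '=',
       %49[^;] reads up to 49 characters that are not ';'. */
    return sscanf(line, "%29[^=]=%49[^;]", key, value) == 2;
}
```

    For the line color=red; this leaves "color" in key and "red" in value.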

  5. #5
    End Of Line Hammer's Avatar
    Join Date
    Apr 2002
    Posts
    6,231
    JJB, this is the C forum.
    When all else fails, read the instructions.
    If you're posting code, use code tags: [code] /* insert code here */ [/code]

  6. #6
    Registered User
    Join Date
    Oct 2004
    Posts
    8
    Thanks everyone for your responses and criticism. I guess I should have elaborated more, and my apologies for not phrasing my questions right; it probably means I am still on the steep learning curve. But here goes:

    1. I am extracting data from a text file – and as you may well observe – the file characters are already in ASCII format. I have attached a sample of the original input file (Orig_Text_File – Figure 1. )

    2. Relevant data extracted from the file is formatted and placed in records of six types: type 01 – 06. Record type 1 - which I will use as an example - is formatted in accordance with the specifications outlined in the attached document (Rec_01_Format: Figure 2.)

    3. In accordance with specifications of the original file:

    a. Column 01 – 02 - will indicate record type => 01

    b. Column 03-08 - will indicate clearance period
    => September 2002 [yyyymm = 200209]

    c. Column 09-10 – designator of transmitting member, who is
    GMG airlines => Dm = Z5

    d. Col. 11 => blank

    e. Col. 12-14 => numeric code of transmitter - GMG airlines
    => 009

    f. Col. 15-16 => Designator of creditor member, GMG airlines
    =>Z5

    g. Col. 17 => Blank

    h. Col. 18-20 => numeric code of creditor member => 009


    The output will be as indicated in the output document attached (ASCII_output: Figure 1. )

    An idea on how to approach this would go a long way in assisting, especially on how to extract the information from the file. I believe the output data can be represented using the struct type.
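
    One way the struct idea could be sketched in C, covering only columns 01 to 20 as listed above. The struct and function names are hypothetical, and whatever the attached specification defines beyond column 20 is not shown here.

```c
#include <stdio.h>

/* Fields of record type 01, following the column layout listed above.
   Each array has one extra byte for the terminating '\0'. */
struct record01 {
    char period[7];             /* columns 03-08: clearance period, yyyymm */
    char transmitter_desig[3];  /* columns 09-10: designator of transmitter */
    char transmitter_code[4];   /* columns 12-14: numeric code of transmitter */
    char creditor_desig[3];     /* columns 15-16: designator of creditor */
    char creditor_code[4];      /* columns 18-20: numeric code of creditor */
};

/* Write columns 01-20 into out, which needs at least 21 bytes.
   Columns 11 and 17 are the blanks in the format string.
   Returns 0 on success, -1 if out is too small. */
int format_record01(const struct record01 *r, char *out, size_t out_size)
{
    /* %-6.6s pads short fields with spaces and truncates long ones,
       so every field occupies exactly its column width. */
    int n = snprintf(out, out_size, "01%-6.6s%-2.2s %-3.3s%-2.2s %-3.3s",
                     r->period, r->transmitter_desig, r->transmitter_code,
                     r->creditor_desig, r->creditor_code);

    return (n == 20 && (size_t)n < out_size) ? 0 : -1;
}
```

    With the example values from the list (200209, Z5, 009, Z5, 009) the formatted line is 01200209Z5 009Z5 009.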

  7. #7
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    1. Don't use a document format which only a subset of the world can use, no matter how large you imagine that subset to be.

    2. The way you extract information from such a line is by using strncpy
    Code:
    char date[7];  // 6 chars + 1 for the \0
    
    // copy columns 3 to 8
    strncpy( date, &buff[2], 6 );
    date[6] = '\0';
    The FAQ explains how to read a file line by line
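
    A sketch that combines the two steps, reading line by line with fgets and copying a column range out of each line. The helper names copy_columns and print_dates and the 256-byte line buffer are assumptions made for illustration; memcpy is used in place of strncpy because the length is checked first.

```c
#include <stdio.h>
#include <string.h>

/* Copy columns first..last (1-based, inclusive) of line into dst.
   Returns 0 on success, -1 if the line is too short or dst too small. */
int copy_columns(const char *line, size_t first, size_t last,
                 char *dst, size_t dst_size)
{
    size_t width;

    if (first < 1 || last < first)
        return -1;
    width = last - first + 1;
    if (strlen(line) < last || dst_size < width + 1)
        return -1;

    memcpy(dst, line + (first - 1), width);
    dst[width] = '\0';
    return 0;
}

/* Read fp line by line and print columns 3 to 8 of every line that is
   long enough. Returns the number of fgets reads; a line longer than
   the buffer is seen as more than one read. */
int print_dates(FILE *fp)
{
    char buff[256];   /* assumed maximum line length */
    char date[7];     /* 6 chars + 1 for the \0 */
    int lines = 0;

    while (fgets(buff, sizeof buff, fp) != NULL) {
        buff[strcspn(buff, "\n")] = '\0';   /* strip the newline, if any */
        if (copy_columns(buff, 3, 8, date, sizeof date) == 0)
            printf("%s\n", date);
        lines++;
    }
    return lines;
}
```

    print_dates would be called with a FILE pointer returned by fopen, after checking that it is not NULL.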

  8. #8
    Registered User
    Join Date
    Oct 2004
    Posts
    8
    Thanks for the tip Salem. I don't understand your first point about the document format. Are you referring to the documents I attached? And I am not sure how the function would work as you have it. The procedure is this: in the input file I am searching for a string parameter called clearance month (it appears in the file as "CLEARANCE MONTH September 2002", see Orig_TXT_File.doc). I go into the file, search for "September 2002" and convert it to the format 200209, which is what is written to the record. Do you not reckon the best way to go about it is to define a string array, say char Month9[] = "September 2002", and if the search returns TRUE, write out 200209 to the record file?
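
    Instead of one array per month and year, a table of the twelve month names could handle any clearance month. This is only a sketch: the function name month_to_period is hypothetical, and it assumes the text is an English month name, a space, and a four-digit year, matched case-sensitively.

```c
#include <stdio.h>
#include <string.h>

static const char *const month_names[12] = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December"
};

/* Convert text such as "September 2002" to "200209".
   out must hold at least 7 bytes. Returns 0 on success, -1 otherwise. */
int month_to_period(const char *text, char out[7])
{
    char name[16];
    int year;
    int i;

    /* Split the text into a month name and a year. */
    if (sscanf(text, "%15s %d", name, &year) != 2)
        return -1;
    if (year < 0 || year > 9999)
        return -1;

    /* The position in the table gives the month number. */
    for (i = 0; i < 12; i++) {
        if (strcmp(name, month_names[i]) == 0) {
            snprintf(out, 7, "%04d%02d", year, i + 1);
            return 0;
        }
    }
    return -1;   /* not a known month name */
}
```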

  9. #9
    Registered User
    Join Date
    Oct 2004
    Posts
    8
    Thanks for the tip Salem. I don't understand your first point about the document format. Are you referring to the documents I attached? And I am not sure how the function would work as you have it. The procedure is this: in the input file I am searching for a string parameter called clearance month (it appears in the file as "CLEARANCE MONTH September 2002", see Orig_TXT_File.doc). I go into the file, search for "September 2002" and convert it to the format 200209, which is what is written to the record. Do you not reckon the best way to go about it is to define a string array, say char Month9[] = "September 2002", and if the search returns TRUE, write out 200209 to the record file?

  10. #10
    ---
    Join Date
    May 2004
    Posts
    1,379
    Thanks, skwattakamp, for posting that twice. I didn't read it the first time.
    .doc formats use special formatting. Different programs use and read the formatting differently. Try opening a .doc with formatting in notepad and you will see what he meant.

  11. #11
    Registered User
    Join Date
    Oct 2004
    Posts
    8
    Ooops! I'm on a slow connection redsand, double clicked by mistake. I get his point: attach .txt files instead. I am trying to do a string search in a .txt file. strrchr doesn't seem to work for me.
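
    For reference, strrchr looks for the last occurrence of a single character, while strstr looks for a whole substring, which is what a text search needs. A small sketch, assuming the line has already been read into a string; the function name value_after_label is hypothetical.

```c
#include <string.h>

/* Return a pointer to the text that follows label in line, with leading
   spaces and tabs skipped, or NULL if label does not occur in line. */
const char *value_after_label(const char *line, const char *label)
{
    const char *p = strstr(line, label);   /* substring search */

    if (p == NULL)
        return NULL;

    p += strlen(label);                    /* step past the label itself */
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}
```

    Called with the label "CLEARANCE MONTH" on a line containing it, the result points at the month text that follows.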

  12. #12
    Registered User
    Join Date
    Jun 2004
    Posts
    722
    Quote Originally Posted by sand_man
    Thanks, skwattakamp, for posting that twice. I didn't read it the first time.
    .doc formats use special formatting. Different programs use and read the formatting differently. Try opening a .doc with formatting in notepad and you will see what he meant.

    Try opening it with WordPad...
    See the difference? Notepad reads the whole file byte by byte and presents it byte by byte. WordPad, or almost any other document editor (not just a text editor), would read the file and parse the text formatting. MS .doc files have existed for a very long time; therefore, I think it would be very hard for someone not to have software compatible with them.

  13. #13
    ATH0 quzah's Avatar
    Join Date
    Oct 2001
    Posts
    14,826
    Quote Originally Posted by xErath
    MS .doc files have existed for a very long time; therefore, I think it would be very hard for someone not to have software compatible with them.
    Yes, because it's so very hard to simply save as a plain text file.

    For that matter, let's take your analogy further: "MS DOS has been around for a very long time. Therefore, I think it would be very hard to have software not compatible with it." Or perhaps: ".PCX has been around for a very long time, therefore, I think it would be very hard to have software not compatible with it." ".ARJ has..." ".ARC has..."

    See the point? There are tons of file formats that have been around for a very long time. You will soon learn that not everyone here uses Windows. As a matter of fact, had you been paying attention to this thread, you would have already learned that.

    vim does not like .doc files. I use vim. Thantos uses vim. Lots of people that are likely to post here more than the average windows user use vim or some other pure-text editor.

    Quzah.
    Hope is the first step on the road to disappointment.

  14. #14
    Registered User
    Join Date
    Oct 2004
    Posts
    8
    I have attached the file readfile.txt to show the code I have developed so far for searching for a particular text in the file form1.txt (previously attached). I have defined two functions within main. The first one, "srchstr", invokes the next, "compare", which does a character-by-character comparison of two strings and returns a FLAG. The code does not seem to work. The file does open, but I don't know why it does not execute the string search. Please assist!!!!

  15. #15
    Gawking at stupidity
    Join Date
    Jul 2004
    Location
    Oregon, USA
    Posts
    3,218
    The code you attached shouldn't even compile considering you're calling srchstr() like this:
    Code:
       srchstr(cntrlmonth, month9);
    when there is no cntrlmonth declared in your program.

    Not to mention that STRING should be a char * instead of a char.
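
    The attached code is not reproduced in the thread, so the following is only a guess at the intended shape: srchstr and compare are the names used in the post, but their signatures and the argument order here are assumptions. The point it illustrates is that both parameters are pointers to strings (const char *), not single chars.

```c
/* Return 1 if the text at s begins with every character of word, else 0.
   If s ends first, its '\0' differs from *word and the loop stops. */
static int compare(const char *s, const char *word)
{
    while (*word != '\0') {
        if (*s != *word)
            return 0;
        s++;
        word++;
    }
    return 1;
}

/* Return 1 if word occurs anywhere in text, else 0. */
int srchstr(const char *text, const char *word)
{
    /* Try a character-by-character comparison at each position. */
    for (; *text != '\0'; text++) {
        if (compare(text, word))
            return 1;
    }
    return word[0] == '\0';   /* an empty word matches even empty text */
}
```

    The standard library's strstr does the same job and is usually preferable to a hand-written search.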
    Last edited by itsme86; 11-13-2004 at 08:17 AM.
    If you understand what you're doing, you're not learning anything.
