Thread: Parsing a text file using C program

  1. #16
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Yes, I know you have a lot of lines of data. Don't worry about the number of lines of data now, however. That's a detail!

    Right now, get ONE line of test data working correctly, with all the types of data you have, including negative numbers. While you're developing your program, it's best to put off ALL the non-critical details (one of the key parts of top-down design, btw).

    Eliminate all the user input that you reasonably can for now, and add it back later. We need to get the overall logic, and the flow of that logic (which part goes first, second, etc.), worked out.

    This is what I would do for a char by char design:

    Use fgets() to get a full line of data into a data[] array.

    Copy each number, char by char, from data[] into a small char array, dataOne[], that is big enough to hold one number, maybe 30 chars.

    You will start with data[0], and each number stops when you reach either a comma or a newline ('\n'). (You won't see the newline, but it will be on the end of every line of text, and so in your data[] array.)

    Now walk through the dataOne[] array to see if it has a decimal point.

    If it has NO decimal point, write out the dataOne number, followed by a comma and a space, into the new file. If it's the last number in the row (you've reached the '\n' char), do not write out the comma, just the newline char.

    Otherwise it has a decimal point, so follow the logic I posted just above this to round off the 6th digit after the decimal point. When it's done rounding off that number, write it out to the file, just like all the other numbers.

    For right now, just write it out to the screen, because it's WAY faster to see whether it's right or wrong. That speeds up everything, and that's VERY important. Writing code can take a LONG time if you don't take every speed-up in the process that you can find.

    These are the lines of test data that I've been working with for preliminary testing:

    0, -2.3999997, 1.0000004, -1.0000001, 28, 17.1234567, -17.1234567, 555.9999990, 555.9999995, -555.9999990, -555.9999998
    -2.3999997, 1.0000004, -1.0000001, 28, 17.1234567, -17.1234567, 555.9999990, 555.9999995, -555.9999990, -555.9999998, 0
    1.0000004, -1.0000001, 28, 17.1234567, -17.1234567, 555.9999990, 555.9999995, -555.9999990, -555.9999998,0, -2.3999997
    28, 17.1234567, -17.1234567, 555.9999990, 555.9999995, -555.9999990, -555.9999998
    17.1234567, -17.1234567, 555.9999990, 555.9999995, -555.9999990, -555.9999998, 0, -2.3999997, 1.0000004, -1.0000001, 28
    -17.1234567, 555.9999990, 555.9999995, -555.9999990, -555.9999998, 0, -2.3999997, 1.0000004, -1.0000001, 28, 17.1234567 3
    555.9999990, 555.9999995, -555.9999990, -555.9999998, 0, -2.3999997, 1.0000004, -1.0000001, 28, 17.1234567 3, -17.1234567
    555.9999995, -555.9999990, -555.9999998, 0, -2.3999997, 1.0000004, -1.0000001, 28, 17.1234567 3, -17.1234567, 555.9999990
    -555.9999990, -555.9999998, 0, -2.3999997, 1.0000004, -1.0000001, 28, 17.1234567 3, -17.1234567, 555.9999990, 555.9999995
    -555.9999998, 0, -2.3999997, 1.0000004, -1.0000001, 28, 17.1234567 3, -17.1234567, 555.9999990, 555.9999995, -555.9999990

    Lots of decimal numbers, both positive and negative, and in different positions. Easy to tell if they do/don't round off.

    Dig in here! You should be tossing out very specific questions, and you aren't.

  2. #17
    Registered User
    Join Date
    Aug 2012
    Posts
    10
    Quote Originally Posted by Adak
    Yes, I know you have a lot of lines of data. Don't worry about the number of lines of data now, however. That's a detail!

    Thanks for the help. I do the same kind of testing too, so I have data like that as well, but mine is even worse, so I need to look at each character carefully.

    Thanks

  3. #18
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Please don't keep quoting what I just posted. I know what I posted, and it takes up a lot of space.

    Thanks, and you're welcome.

  4. #19
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    My final version of this used just the two include files of stdio.h and string.h.

    variables included:
    char data[128], small[30]
    int i,j,lendata
    double real
    FILE *fp

    and a few miscellaneous ones.

    Input from the file uses the standard while((fgets(data, sizeof(data),fp)) != NULL) loop to bring each row of data into the data[] array. lendata remembers how long this row of data is, using strlen().

    From there, it's copied char by char, into the small[] array, until the char is a comma, or a newline. This is a nested for loop, so each number will be dealt with, one at a time.

    When one number has been put into the small[], it's checked by strchr() to see if it has a decimal point anywhere in it.

    If there is no decimal point, the data is written out to the output file unchanged. Otherwise it's sscanf()'d into the real variable and then written out using %.6f, so the printf()-family function rounds the real number to six decimal places automatically, with no further help.

    When the last line of data has been read in the file, the while loop exits, and the file is closed. Zero is returned to the Operating system.

    The whole program is pretty compact at only 43 lines.

  5. #20
    Registered User
    Join Date
    May 2012
    Location
    Italy
    Posts
    53
    A little tip: always use:
    Code:
    int main(void)
    neither
    Code:
    main()
    nor
    Code:
    int main()

  6. #21
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,412
    Quote Originally Posted by polslinux
    A little tip: always use:
    Code:
    int main(void)
    If you need to access the command line arguments, then I would suggest not following this tip.

    Quote Originally Posted by polslinux
    nor
    Code:
    int main()
    Unless the feature really has been removed in C11, this one is actually fine in the sense that although the feature is obsolescent, it is still valid and I find it improbable that they would remove it.
