Parsing a text file using C program

Forum: C Programming (General Programming Boards category)

  1. #1
    Registered User
    Join Date
    Aug 2012
    Posts
    10

    Parsing a text file using C program

    Hi All,

    I am a newbie in C programming. I am having trouble reading a text file and writing it back out unchanged, except that some of the floating-point numbers must be rounded to six decimal digits. I am posting my code as it is, along with an example of what my text file looks like. As I am a beginner, I need some suggestions to guide me in the right direction.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    
    /* data analysis for TI to sort out data by Integers, Floating Points,
       Binary numbers, strings, fields */
    
    main()
    {
        char ch, frst_100l[1024]; /* pointer ch character for file*/
        FILE *fp;  /* file pointer*/
    
      printf("frst_100l"); /* name of the file */
      gets(file_name);
     
       
      fp = fopen("frst_100l","r+"); /* open the file name */
    
    
    
    /*  if file is error , then excute error and exit failure */
      if(fp == NULL)
       {
         printf("\n Error while opening the file:frst_100l\n");
            exit (EXIT_FAILURE);
       } 
    
    
    
      
    while( (Ch=getchar(fp))!=EOF) /* pointer ch to read the files 
    {
     
        fputc(ch,fp);
        
        if ( fp == '.')
        {
           fputc(ch,fp)   if(fp==','of
           fputc(ch,fp)
           fputc(ch,fp)
           fputc(ch,fp)
           fputc(ch,fp)
           fputc(ch,fp)
        }
        if( fp == ',')
        {
          fputc(ch,fp);
        }
    fclose(fp)
    One example line of the text file is below:

    04,-150,-75,-37,-18,-8,-4,6,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,660.344421386719,,334.063598632813,
    I have also heard about the atof function. Can anyone explain how it works?
    Thanks,
    Adoosa.
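
    On the atof question: atof() from <stdlib.h> converts the leading part of a string to a double and returns 0.0 when nothing can be converted, so it cannot tell an empty field from a literal 0. A minimal sketch using strtod(), which also reports where parsing stopped. The helper name parse_double is hypothetical and not from this thread.

```c
#include <stdlib.h>

/* Hypothetical helper: convert text to a double.
   Returns 1 and stores the value in *out when text starts with a number,
   returns 0 when no conversion could be performed. */
int parse_double(const char *text, double *out)
{
    char *end;
    double value = strtod(text, &end);   /* end points just past the parsed part */

    if (end == text)
        return 0;                        /* nothing was converted */
    *out = value;
    return 1;
}
```

    For a field such as 660.344421386719 the helper returns 1, while for an empty field between two commas it returns 0, so the caller can copy that field unchanged.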

  2. #2
    SAMARAS (std10093)
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,678

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    When I have a lot of parsing to do, I like to use fgets(myCharArray, sizeof(myCharArray), filePointer) to put each line of text into a char array, and then do the parsing in the char array. That is (usually) much easier than parsing straight from the file.

    Are you using the commas, and whether the string of digits that makes up a number has a leading 0 or a decimal point, to assist the parsing logic?

    If you can be more specific with your questions, we can be more specific about answers.
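
    A minimal sketch of the line-at-a-time approach described above, assuming both files are already open. The function name copy_lines and the 1024-byte buffer are illustrative choices, not something from this thread.

```c
#include <stdio.h>
#include <string.h>

/* Copy in to out one line at a time and return the number of lines read.
   A line longer than the buffer arrives in several pieces; it is counted
   once, when the piece ending in '\n' (or the end of the file) is reached. */
long copy_lines(FILE *in, FILE *out)
{
    char line[1024];
    long count = 0;
    int complete = 1;   /* 1 when the last piece read ended a line */

    while (fgets(line, sizeof line, in) != NULL) {
        size_t len = strlen(line);

        /* this is the place to parse or modify line before writing it */
        fputs(line, out);

        complete = (len > 0 && line[len - 1] == '\n');
        if (complete)
            count++;
    }
    if (!complete)
        count++;        /* last line had no trailing newline */
    return count;
}
```

    The point of the design is that each line sits in an ordinary char array between the fgets() and the fputs(), so any change to the numbers can be made there.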

  4. #4
    Registered User
    Join Date
    Aug 2012
    Posts
    10
    To be more specific, here are bullet points, which should be clearer:
    • The commas, integers, binary numbers, and whole numbers should all be copied exactly as they are in the file.
    • The only change happens when the program reaches a decimal (floating-point) number: that number should be rounded to six significant figures.
    • Everything else should be copied exactly as it is in the file.
    • So a new file is created that copies everything from the original file and changes only the floating-point numbers, rounding them to six significant figures.
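
    The thread uses "six decimal digits" and "six significant figures" for the same requirement, but printf-style formatting treats them differently. A small sketch of the difference, using a value from the sample line; the two helper names are hypothetical.

```c
#include <stdio.h>

/* Format value with six digits after the decimal point (%.6f). */
void format_six_decimals(double value, char *buf, size_t size)
{
    snprintf(buf, size, "%.6f", value);
}

/* Format value with six significant figures (%.6g). */
void format_six_significant(double value, char *buf, size_t size)
{
    snprintf(buf, size, "%.6g", value);
}
```

    With these helpers, 660.344421386719 becomes 660.344421 under %.6f and 660.344 under %.6g, so it is worth settling which of the two the output file really needs.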

    Please let me know if this helps or if there are any more questions.

    Thank You
    Ruthy

  5. #5
    Registered User
    Join Date
    Aug 2012
    Posts
    10
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <conio.h>
    #include <string.h>
    
    /* data analysis for TI to sort out data by Integers, Floating Points,
       Binary numbers, strings, fields */
    
    main()
    {
        char ch, frst_100l[1024]; /* pointer ch character for file*/
        FILE *fp;  /* file pointer*/
    
      printf("frst_100l"); /* name of the file */
      gets("frst_100l");
     
       
      fp = fopen("c:\\frst_100l.txt","r+"); /* open the file name */
    
    
    
    /*  if file is error , then excute error and exit failure */
    
    
    if((fp = fopen("frst_100l.txt","r+")) == NULL)
     printf("File open no successful.\n");
    else
    {
      while (fscanf(fp,"%f",&ch)!= EOF)
         fprintf(fp,"%.6lf",ch);
      
     if (fclose(fp) == EOF)
       printf("File close not successful.\n");
    }
    return 0;
    }
    I made some improvements to this.

  6. #6
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    OK, every line of data that has no number with a decimal point can be written out exactly as it is. fgets() and fputs() will do that for you.

    Every line WITH a decimal point in it needs to be rounded off at the sixth significant digit. That's also easy to do once fgets() has put the line into a char array.

    Say I have this line of data:

    10, -180, 3.14159265358979, 11.899

    resulting in a data array of:
    Code:
    char data[124] = "10, -180, 3.14159265358979, 11.899";
    How would you round off the third number, as needed?

    Clearly, if you use a for loop, you can check each char, locate the decimal point, and get its index. Also clearly, you can then use that index + 6 to locate the last digit you keep.
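
    A minimal sketch of that for loop, assuming the line is already in a char array. The function name find_decimal_point is hypothetical.

```c
/* Return the index of the first '.' in text, or -1 if there is none. */
int find_decimal_point(const char *text)
{
    int i;

    for (i = 0; text[i] != '\0'; i++) {
        if (text[i] == '.')
            return i;
    }
    return -1;
}
```

    For the sample line above the function returns 11, so data[17] is the sixth digit after the decimal point and data[18] is the digit that decides whether it gets rounded up.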

    I can't emphasize this enough: I would assume that your newest code is an improvement, but that is NO help to us. What I need in order to help you is WHAT PROBLEM YOU ARE HAVING WITH THAT CURRENT CODE - SPECIFICALLY!

    I'm not going to run your code against an assortment of possible data files to figure out what you should be telling me. We need to be efficient here, and not waste time.
    Last edited by Adak; 08-27-2012 at 10:52 PM.

  7. #7
    Registered User
    Join Date
    Aug 2012
    Posts
    10
    Thank you. Coming to my new code: the build is fine, but when I debug it, it gives me a run-time error and nothing else. The program just freezes; I guess it doesn't get through the code. I'm fairly sure I'm not doing the rounding right, or not scanning the file correctly. Can you please tell me what I need to add to it?
    My file has about 100 lines of data, so I need the program to go through every line and change only the decimal numbers, rounding them to 6 significant digits.

    Can you explain in a little more detail what you described?

    Thank you

  8. #8
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    I will show you mine, if you will show me yours!

    It's a little touchy, which is why C has rounding built in. You CAN do it yourself, and it's a great exercise to do so, but I'm not clear whether you are supposed to be doing the rounding of the number manually, or should be using the built-in functions.

    My post presumed that you had to round the real number to six significant figures manually. Is that correct?

    A better description might be:

    I used fgets() to get the whole line of text into a single string, then "walked" the string using a for loop. If one of the chars in the string is a '.', then you have a real number, so back up the index using a while loop until you get back to the last space char before the number. (Don't use a pointer, they're just more difficult for beginners; use array[index] with a different index variable. I used i to go forward and j to back up.)

    Now sscanf(array+j) with a real number format (I used double, so %lf). Note that sscanf() takes a maximum field width but no precision, so a %.6 format only works on the printf() side: read the whole number and limit the digits when you print it. I divided the real number up into its whole-number part and its fractional part (what I called the ordinal and mantissa), but that appears unnecessary. I'll look at it again when I get some time later today.

    Then round the number. I used modulo 10 to get the difference between the last digit and 0, and if the sixth significant digit was > 4, I rounded upward by adding (10 - difference). If it was less, I subtracted the modulo 10 number.
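
    One way to round by hand, sketched under stated assumptions. This is not the modulo method described above: it works directly on the digit characters, rounds half up to a given number of digits after the decimal point, and assumes a plain decimal string with no exponent. The function name round_decimal_string is hypothetical.

```c
#include <string.h>

/* Round the decimal string num in place to `places` (> 0) digits after
   the '.', rounding half up on the written digits. num may start with '-'.
   The buffer needs room for one extra character in case of a carry such
   as "9.9999999" -> "10.000000". */
void round_decimal_string(char *num, int places)
{
    char *dot = strchr(num, '.');
    int first = (num[0] == '-') ? 1 : 0;    /* index of the first digit */
    int cut, i;

    if (dot == NULL || (int)strlen(dot + 1) <= places)
        return;                              /* nothing to round */

    cut = (int)(dot - num) + 1 + places;     /* index of the first dropped digit */
    if (num[cut] >= '5') {
        /* propagate the carry leftwards, skipping the '.' */
        for (i = cut - 1; i >= first; i--) {
            if (num[i] == '.')
                continue;
            if (num[i] != '9') {
                num[i]++;
                break;
            }
            num[i] = '0';
        }
        if (i < first) {                     /* carry ran off the front */
            num[cut] = '\0';
            memmove(num + first + 1, num + first, strlen(num + first) + 1);
            num[first] = '1';                /* prepend the carried 1 */
            return;
        }
    }
    num[cut] = '\0';                         /* drop the remaining digits */
}
```

    Working on the characters avoids any binary floating-point surprises, at the cost of handling the carry by hand.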

    Then I "peeled" the digits off the real number into a second string, real[40], using while (number > 0) with digit = number % 10 + '0', putting each digit into real[] and then dividing the number by 10 to get rid of that digit.

    After I had all the digits from the real number in the real[] char array, I added an end of string char, '\0', to the end of it, and then strcpy()'d it back into the original array, right back where it came from, since I still had the "backup" index j.

    Then I overwrote the end of string char that strcpy() needed with a comma, to restore the original string of the whole line of text. You have to do this "blind", since you can't see the '\0' char when the string is displayed, but it's there. If you don't remove the extra '\0' you added and try to print the string, you will see that the line of data has been truncated at that point.

    Then the string is ready to be written out to the output file. If there was no real number, nothing in the line of text will have changed, but it all has to be copied into a second file. Afterwards the original file needs to be deleted, and the second file renamed with the original file's name.
    Last edited by Adak; 08-28-2012 at 01:46 PM.
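
    The last step (delete the original, rename the second file) can be sketched with the standard remove() and rename() functions. The function name replace_file is hypothetical, and both files are assumed to be closed before it is called.

```c
#include <stdio.h>

/* Replace original_name with the finished temp_name.
   Returns 0 on success, -1 on failure. */
int replace_file(const char *original_name, const char *temp_name)
{
    if (remove(original_name) != 0)              /* delete the old file */
        return -1;
    if (rename(temp_name, original_name) != 0)   /* give the new file its name */
        return -1;
    return 0;
}
```

    Writing to a second file first means the original data is still intact if anything goes wrong while the lines are being processed.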

  9. #9
    Registered User
    Join Date
    Aug 2012
    Posts
    10
    We can use built-in functions; it doesn't have to be manual. We can use atof, but I still can't understand what you're trying to say.
    And only the decimal numbers need to be rounded off, nothing else.

    thank you

  10. #10
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    OK, well crap! I thought you had to do the rounding manually - which is more involved of course - so you will not use the info I posted.

    Use fgets() to put the entire line of text into a char array like
    Code:
    char data[128];
    FILE *fp;
    fgets(data, sizeof(data), fp);
    printf("data: %s \n",data);   //verify the data was stored ok.
    Now see if any data[i] is a decimal point. If one is, then back up until you reach the space or comma between numbers, and sscanf() that ONE number's digits into a separate variable. This is simply "walking" through the char data array with an index, looking for a '.' char.

    Google "rounding off real numbers in C to N significant numbers", (if that doesn't work, delete the "to N significant numbers" part), to get some background on the subject.

    Post up your code that opens the file in text mode, and reads the data in, one line at a time, into the char data array. Be sure you can read through every line of text in the file (the tutorial on this board shows how to do that. Link to it is on the top of the forum).

    Try and be specific about anything that has you stumped, but you need to post some code to get this thread a-rolling.

  11. #11
    Registered User
    Join Date
    Aug 2012
    Posts
    10
    So, after going through it a bit more, I completely changed my code and started by creating a new file and copying the data into it. It compiles and works fine.
    The remaining question is where and how to do the rounding to six significant digits.
    Below is the code:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    // Program that copies file frst_100l to a new file new_data.
    
    int main(void)
    {
       int ch;                               // int, so that EOF can be told apart from data
       char data_in[1024], data_out[1024];   // buffers for the two file names
       FILE *fp, *tp;                        // input and output file pointers
     
       printf("Enter name of file to copy\n");
       if (fgets(data_in, sizeof data_in, stdin) == NULL)   // fgets() instead of the unsafe gets()
          exit(EXIT_FAILURE);
       data_in[strcspn(data_in, "\n")] = '\0';   // strip the newline fgets() keeps
     
       fp = fopen(data_in, "r"); // open file to read
     
       // if the file cannot be opened, exit
       if( fp == NULL )
       {
          printf("Press any key to exit...\n");
          exit(EXIT_FAILURE);
       }
    
       // Enter the name of the file for the data to copy to
       printf("Enter name of file data should be changed to \n");
       if (fgets(data_out, sizeof data_out, stdin) == NULL)
       {
          fclose(fp);
          exit(EXIT_FAILURE);
       }
       data_out[strcspn(data_out, "\n")] = '\0';
     
       tp = fopen(data_out, "w"); // open the file to write
     
       // close the input file and exit if the output file cannot be created
       if( tp == NULL ) 
       {
          fclose(fp);
          printf("Press any key to exit...\n");
          exit(EXIT_FAILURE);
       }
     
       // copy the data to the new file, one character at a time, until end of file
       while( ( ch = fgetc(fp) ) != EOF )
          fputc(ch, tp);
     
       printf("File copied successfully.\n");
     
       fclose(fp);
       fclose(tp);
     
       return 0;
    }
    Thank You
    Ruthy

  12. #12
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    OK, (that was fast!), but you don't want to just copy the data, before the data has been changed, if a number needs changing.

    Before you write out the line of text, check if it has a decimal point.

    I have some running around to do, but will be back in 3 hours.

  13. #13
    Registered User
    Join Date
    Aug 2012
    Posts
    10
    Ok thank you

  14. #14
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    As it turns out, printf() in standard C rounds the last printed digit of a real number (when appropriate), so simply using it leads to a much simpler program.

    An example (this works with only the FIRST real number in the array).
    I was rotating numbers around all over for some quick testing of the code.

    Code:
    #include <stdio.h>
    #include <string.h>
    
    int main(void) {
       char data[128]={"0, -12.0000003, 28.999999, 101.999999, 1.000001, 20.000001, 100.000001"};
       int i,j,lendata;
       double realn;
       
       lendata=(int)strlen(data);
       printf("%s\n\n",data);
       for(i=0;i<lendata;i++) {
          if(data[i]=='.') {
             j=i;
             //back up to the start of the real number: stop at the comma
             //or space in front of it (a leading '-' stays part of the number)
             while(j > 0 && data[j-1] != ',' && data[j-1] != ' ') {
                --j;
             }
             
             //sscanf the real number, into realn, giving it
             //the starting location that was just backed up to.
             printf("j: %d\n",j);
             // &data[j] is the address of the data[j] element in the data array 
             if(sscanf(&data[j],"%lf",&realn) == 1) {
                //print out only six digits after the decimal point
                //rounding is automatic if there is a 7th digit
                printf("realn: %.6f \n",realn);
             }
             return 0;
          }
       }
       return 0;
    }
    Last edited by Adak; 08-28-2012 at 07:27 PM.

  15. #15
    Registered User
    Join Date
    Aug 2012
    Posts
    10
    Can you please explain more clearly how to go through the file character by character? My file has hundreds of lines of data containing commas, strings, integers, decimal numbers, and binary numbers. How can I find the string between two commas that is a decimal number, and then use the atof function to convert it to a floating-point number with six significant figures?

    This is what I have come up with so far:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <process.h>
    #include <conio.h>
    
     // Program that copies file frst_100l to a new file new_data.
    
       int main()
    {
       char ch, data_in[1024], data_out[1024]; // initialize character for files
       char line[1024];
       FILE *fp, *tp;  // file pointers
       int x,k,n,counter;
       printf("Enter name of file to copy\n"); // enter the name of the file for data
       gets(data_in); // get data to character pointer
     
       fp = fopen(data_in, "r"); // open file to read
     
       // if file has error then exit the debug
       if( fp == NULL )
       {
          printf("Press any key to exit...\n");
          exit(EXIT_FAILURE);
       }
    
      while( fgets(line,1023,fp) != NULL)       
      {
      fputc(ch,fp);
         if ( ch == ".")
         {
          counter = counter;
         else
             counter = counter+1;
         for(x = counter;x<1024;x++)
             if(ch == ",")
              {
                  k = k;
             else
             k = k+1;
         for(y=counter;y>0;y--)
             if(ch == ',')
             {
                 n = n;
             else 
                n = n+1;
             }
             }
         }
    
    
      double y;
      char *s;
      s = [n+1,k-1];
     x = atof(s);
     fprintf (fp,"%.6f",x);
      }
      
    
       
     
       // Enter the name of the file for the data to copy to
     
       printf("Enter name of file data should be changed to \n");
       gets(data_out); // get data_out pointer
     
       tp = fopen(data_out, "w"); // open the file to write
     
       // close if error in creating or writing the file
       if( tp == NULL ) 
       {
          fclose(fp);
          printf("Press any key to exit...\n");
          exit(EXIT_FAILURE);
       }
     
       // while character is for data_out file is not equal to end of file
       // copy the data to new file 
       while( ( ch = fgetc(fp) ) != EOF )
          fputc(ch, tp);
     
    
    
       // print file copied successfully and close the debug window
       printf("File copied successfully.\n");
     
       fclose(fp);
       fclose(tp);
     
       return 0;
    }
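
    A minimal sketch of one way to do the field-by-field step asked about above, offered as an illustration rather than the thread's own solution. It assumes fields are separated by commas, that a decimal number is a field containing a '.' that strtod() can parse completely, and that "%.6g" (six significant figures) is the wanted format. The function name round_fields is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy line into out, rewriting every comma-separated field that is a
   decimal number (contains '.' and parses completely with strtod) with
   six significant figures. All other fields are copied unchanged; a
   rewritten field loses any leading blanks.
   Returns 0 on success, -1 if out is too small. */
int round_fields(const char *line, char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;

    while (*line != '\0') {
        size_t len = strcspn(line, ",\n");       /* length of this field */
        char field[64];
        int written = -1;

        if (len > 0 && len < sizeof field && memchr(line, '.', len) != NULL) {
            char *end;
            double value;

            memcpy(field, line, len);
            field[len] = '\0';
            value = strtod(field, &end);
            if (end != field && *end == '\0')    /* whole field is a number */
                written = snprintf(out + used, outsz - used, "%.6g", value);
        }
        if (written < 0) {                       /* not a decimal: copy as is */
            if (len >= outsz - used)
                return -1;
            memcpy(out + used, line, len);
            written = (int)len;
        } else if ((size_t)written >= outsz - used) {
            return -1;                           /* rounded text did not fit */
        }
        used += (size_t)written;
        line += len;

        if (*line != '\0') {                     /* copy the ',' or '\n' */
            if (used + 1 >= outsz)
                return -1;
            out[used++] = *line++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

    Called on each line read with fgets() and followed by fputs() to the output file, this turns "660.344421386719,,334.063598632813," into "660.344,,334.064,". Note that "%.6g" switches to exponent notation for values of 1,000,000 or more; "%.6f" would give six digits after the decimal point instead.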
