C Board  

Old 03-26-2003, 07:09 PM   #1
Registered User
 
daluu's Avatar
 
Join Date: Dec 2002
Posts: 42
Need help fixing bugs in data parsing program

Hi,

I'm working on a lab project that parses raw data into its proper format and writes the result to both the screen & a file. However, I've run into some bugs whose origin I can't determine. I've tested & reviewed my parsing functions individually & they seem OK. I'll review the output functions the same way, but for now I'm assuming they are correct.

The lab project info is at http://www.engr.sjsu.edu/~mrobins/ce130lab2.html
Source/Raw data file is at http://www.engr.sjsu.edu/~mrobins/lab2sampleu.txt
Source code listed below & also available at http://www.engr.sjsu.edu/daluu/showdata.txt

I need to find where my bugs are coming from so I can fix them. There are 2 major bugs: 1) the output differs from the intended output, and 2) the output is only sent to the screen, not to the output file, even though my code looks correct to me. The output file is created successfully, so why won't my program write to it?

Here's the output bug
--------------------
//Intended Output:
1: N87BC |Prel |2/28/03 |Deer Valley |AZ |Creitz, Robe|Nonfatal|Part 91 |General Avia
2: CP-188|Prel |2/28/03 |COLCHANI |Bolivia |Cessna 411 |Fatal(1)|NSCH Non-U.S|Commercial T
3: HC-BMD|Fact |1/17/03 |Quito |Ecuador |Fokker F28 |Nonfatal|SCHD Non-U.S|Commercial T

//Actual Output:
1: N87BC |Prel |2/28/03 |Deer Valley |AZ |Creitz, Robe|Nonfatal|Part 91 |General Avia

2: N87BC |Prel |2/28/03 |Deer Valley |AZ |Creitz, Robe|Nonfatal|Part 91 |General Avia
CP-188|Prel |2/28/03 |COLCHANI |Bolivia |Cessna 411 |Fatal(1)|NSCH Non-U.S|Commercial T

3: N87BC |Prel |2/28/03 |Deer Valley |AZ |Creitz, Robe|Nonfatal|Part 91 |General Avia
CP-188|Prel |2/28/03 |COLCHANI |Bolivia |Cessna 411 |Fatal(1)|NSCH Non-U.S|Commercial T
HC-BMD|Fact |1/17/03 |Quito |Ecuador |Fokker F28 |Nonfatal|SCHD Non-U.S|Commercial T
------------------------

It seems the previous output is kept in memory & new output is appended to it, which shouldn't be the case. But I can't find where that happens in my code, since the previous output should have been cleared before the new output is built.

---------
Code:
#include <stdio.h>
#include <string.h>

//3 Function prototypes
void parse_input(unsigned char ch);
//Input: one byte/char extracted from file
//Output: fields stored in memory for output
//Passes input along to subfunctions to parse data into fields
void parse_output();
//Input: fields stored in memory
//Output: fields combined into one long char. string for output to screen & file
void outscr();
//Input: Record char string
//Output: Outputs 1 data record per line with RRN & "|" delimiter per field to screen
//Assumes user's screen width set to accept 120 characters wide
void outstor();
//Input: Record char string
//Output: Outputs 1 data record per line in output file, "data.txt"
void ini_rec();
//Initializes/resets the data extraction record structure for next record/set of fields
long FileSize (FILE *stream);
//Input: input file stream
//Output: returns input file size

//Subfunctions of parse_input() -> Individual field parsing functions
//Takes input from parent & parses it to appropriate field to store in memory
void parse_f1(unsigned char ch);
void parse_f2(unsigned char ch);
void parse_f3(unsigned char ch);
void parse_f4(unsigned char ch);
void parse_f5(unsigned char ch);
void parse_f6(unsigned char ch);
void parse_f7(unsigned char ch);
void parse_f8(unsigned char ch);
void parse_f9(unsigned char ch);


//Declare global variables
//Global variables used to avoid passing variables into functions
FILE *infile; //file handle for input file
FILE *outfile; //file handle for output file
long flength; //var used to calculate filesize of input file
//var "byte" indicates byte offset for comparison with filesize
int byte = 0, rrn = 0; //var "rrn" self explanatory
//positioning variables
//var "k" for array position to hold field char
// var "fnum" indicates field number to parse, default is 1
//var "atrec" indicates when one set of fields or a record has been reached for output
int k = 0, fnum = 1, atrec = 0; 

//Data extraction record structure
typedef struct record{ // 1 record contains these 9 fields
     char status[6];
     char date[9];
     char location[13];
     char state_country[13];
     char make_model[13];
     char arn[7];
     char severity[9];
     char op_type[13];
     char car_name[13];
}Record;

Record the_record; //initialize a record struct for use

//Data output record structure
//One long char string, each, to hold 1 record
char outdata[97]; //for output to file
char scrdata[97]; //for output to screen

int main(int argc,char *argv[])
{
     //initialize certain variables when 1st run program.
     unsigned char ch;

     if (argc<2)
     {
          puts("No input file specified. Please type command again with input filename.\n");
          return 0;//Self explanatory
     }
     
     //Open file handle to input file
     infile = fopen(argv[1],"rb");
     
     //Abort program & display error if file open fails
     if(infile==NULL)
     {
          printf("Error opening %s\n",argv[1]);
          return 1;
     }

     //Create file handle to output file
     outfile = fopen("datafile.txt", "wb");
     
     //Abort program & display error if file open fails
     if(outfile==NULL)
     {
          printf("Error creating datafile.txt\n");
          return 1;
     }
     
     //Calculate filesize
     flength = FileSize(infile);

     //Initialize record struct for input
     ini_rec();

     //Extract data until EOF reached
     while((ch=fgetc(infile))!= EOF){
          if(atrec){ //output when 1 record reached
               parse_output();
               outscr();
               outstor();
               ini_rec();
               atrec = 0;
          }
          byte++; //increment byte offset indicator
          parse_input(ch);
     }
     //output last record
     parse_output();
     outscr();
     outstor();

     //close file handles when done reading & writing file
     fclose(infile);
     fclose(outfile);
     
     return 0;
}

void parse_input(unsigned char ch){
     switch(fnum){ //Route to subfunction based on fnum
     case 1:
          parse_f1(ch);
          break;
     case 2:
          parse_f2(ch);
          break;
     case 3:
          parse_f3(ch);
          break;
     case 4:
          parse_f4(ch);
          break;
     case 5:
          parse_f5(ch);
          break;
     case 6:
          parse_f6(ch);
          break;
     case 7:
          parse_f7(ch);
          break;
     case 8:
          parse_f8(ch);
          break;
     case 9:
          parse_f9(ch);
          break;
     default: 
          parse_f1(ch); //default is route to parse field 1
          break;
     }
     return;
}

void parse_f1(unsigned char ch){
     if(byte == flength) return; //discard last byte due to use of unsigned char
     if(ch == 13) return;
     if(ch == 10){
          fnum = 2;
          k = 0;
          return;}
     if(k < 5) the_record.status[k++] = ch;
     return;
}

void parse_f2(unsigned char ch){
     if(ch == 13) return;
     if(ch == 10){
          fnum = 3;
          k = 0;
          return;}
     if(k < 8) the_record.date[k++] = ch;
     return;
}

void parse_f3(unsigned char ch){
     if(ch == 44){
          fnum = 4;
          k = 0;
          return;}
     if(k < 12) the_record.location[k++] = ch;
     return;
}

void parse_f4(unsigned char ch){
     if(!k){
          if(ch == ' ') return;}
     if(ch == 13) return;
     if(ch == 10){
          fnum = 5;
          k = 0;
          return;}
     if(k < 12) the_record.state_country[k++] = ch;
     return;
}

void parse_f5(unsigned char ch){
     if(ch == 13) return;
     if(ch == 10){
          fnum = 6;
          k = 0;
          return;}
     if(k < 12) the_record.make_model[k++] = ch;
     return;
}

void parse_f6(unsigned char ch){
     if(ch == 13) return;
     if(ch == 10){
          fnum = 7;
          k = 0;
          return;}
     if(k < 6) the_record.arn[k++] = ch;
     return;
}

void parse_f7(unsigned char ch){
     if(ch == 13) return;
     if(ch == 10){
          fnum = 8;
          k = 0;
          return;}
     if(k < 8) the_record.severity[k++] = ch;
     return;
}

void parse_f8(unsigned char ch){
     if(ch == 13) return;
     if((ch == 44) || (ch == 58) || (ch == 10)){
          fnum = 9;
          k = 0;
          return;}
     if(k < 12) the_record.op_type[k++] = ch;
     return;
}

void parse_f9(unsigned char ch){
     if(!k){
          if(ch == ' ') return;}
     if(ch == 13) return;
     if(ch == 10){
          fnum = 1;
          k = 0;
          atrec = 1;
          rrn++;
          return;}
     if(k < 12) the_record.car_name[k++] = ch;
     return;
}

void parse_output(){
     strcat(scrdata, the_record.arn);
     strcat(scrdata, "|");
     strcat(scrdata, the_record.status);
     strcat(scrdata, "|");
     strcat(scrdata, the_record.date);
     strcat(scrdata, "|");
     strcat(scrdata, the_record.location);
     strcat(scrdata, "|");
     strcat(scrdata, the_record.state_country);
     strcat(scrdata, "|");
     strcat(scrdata, the_record.make_model);
     strcat(scrdata, "|");
     strcat(scrdata, the_record.severity);
     strcat(scrdata, "|");
     strcat(scrdata, the_record.op_type);
     strcat(scrdata, "|");
     strcat(scrdata, the_record.car_name);
     strcat(scrdata, "\n");
     for(int i = 0; i < 96; i++){
          if(scrdata[i] == '|') outdata[i] = ' ';
          else outdata[i] = scrdata[i];
     }
     outdata[96] = '\0';
     return;
}

/* old output parsing method
void parse_output(){
     int i;
     for(i = 0; i < 6; i++) scrdata[i] = the_record.arn[i];
     scrdata[6] = '|';
     for(i = 7; i < 12; i++) scrdata[i] = the_record.status[i];
     scrdata[12] = '|';
     for(i = 13; i < 21; i++) scrdata[i] = the_record.date[i];
     scrdata[21] = '|';
     for(i = 22; i < 34; i++) scrdata[i] = the_record.location[i];
     scrdata[34] = '|';
     for(i = 35; i < 47; i++) scrdata[i] = the_record.state_country[i];
     scrdata[47] = '|';
     for(i = 48; i < 60; i++) scrdata[i] = the_record.make_model[i];
     scrdata[60] = '|';
     for(i = 61; i < 69; i++) scrdata[i] = the_record.severity[i];
     scrdata[69] = '|';
     for(i = 70; i < 82; i++) scrdata[i] = the_record.op_type[i];
     scrdata[82] = '|';
     for(i = 83; i < 95; i++) scrdata[i] = the_record.car_name[i];
     scrdata[95] = 10;
     scrdata[96] = '\0';
     for(i = 0; i < 96; i++){
          if(scrdata[i] == 124) outdata[i] = ' ';
          else outdata[i] = scrdata[i];
     }
     outdata[96] = '\0';
     return;
}
*/

void outscr(){
     printf("%d: ",rrn);
     puts(scrdata);
     return;
}

void outstor(){
     fputs(outdata,outfile);
     return;
}

void ini_rec(){
     int i;
     for(i = 0; i < 5; i++) the_record.status[i] = ' ';
     the_record.status[5] = '\0';
     for(i = 0; i < 8; i++) the_record.date[i] = ' ';
     the_record.date[8] = '\0';
     for(i = 0; i < 12; i++) the_record.location[i] = ' ';
     the_record.location[12] = '\0';
     for(i = 0; i < 12; i++) the_record.state_country[i] = ' ';
     the_record.state_country[12] = '\0';
     for(i = 0; i < 12; i++) the_record.make_model[i] = ' ';
     the_record.make_model[12] = '\0';
     for(i = 0; i < 6; i++) the_record.arn[i] = ' ';
     the_record.arn[6] = '\0';
     for(i = 0; i < 8; i++) the_record.severity[i] = ' ';
     the_record.severity[8] = '\0';
     for(i = 0; i < 12; i++) the_record.op_type[i] = ' ';
     the_record.op_type[12] = '\0';
     for(i = 0; i < 12; i++) the_record.car_name[i] = ' ';
     the_record.car_name[12] = '\0';
     return;
}

long FileSize (FILE *stream)
{
     long length;//temp variable
     fseek (stream, 0L, SEEK_END);//seek to EOF
     length = ftell(stream);//store EOF byte position which is filesize
     //reset file for future reading by seeking to origin
     fseek (stream, 0L, SEEK_SET);
     return length;//return filesize
}
---------
I know debugging is one of the worst parts of programming BUT I can't do it all myself, so I really need your help here to find my bugs. All help appreciated.

NOTE: The source code is fairly long, but I've commented about half of it. If you've looked over the project info: my input parsing subfunctions check for delimiters by their decimal ASCII values (44 for ",", 58 for ":", 10 for line feed/new line & 13 for carriage return). Data is truncated to each field's width while parsing input. If you have any questions about my code, let me know.
Old 03-26-2003, 09:33 PM   #2
+++ OK NO CARRIER
 
quzah's Avatar
 
Join Date: Oct 2001
Posts: 10,634
Not to sound harsh, but this is really horrible code.

You should avoid the globals; they detract from readability.

Additionally, unless you have a C99 compiler, this is invalid:
Code:
     for(int i = 0; i < 96; i++){
Declaring the loop variable inside the for statement is legal in C99 (and in C++), but not in C89. The chances are your compiler is not running as a C99 compiler, so unless you're compiling this as C++, this won't compile.
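
For reference, a minimal sketch of the same kind of loop written so that it also compiles as C89: the counter is declared at the top of the block rather than inside the for statement. The helper name copy_without_pipes is hypothetical, and unlike the loop in parse_output() it stops at the terminating '\0' instead of always copying 96 characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the loop in parse_output(): copies src to
   dst, turning each '|' into a space. dst must hold at least n + 1 bytes. */
void copy_without_pipes(char *dst, const char *src, size_t n)
{
    size_t i;   /* declared at the top of the block, as C89 requires */

    assert(dst != NULL && src != NULL);
    for (i = 0; i < n && src[i] != '\0'; i++)
        dst[i] = (src[i] == '|') ? ' ' : src[i];
    dst[i] = '\0';   /* always terminate the copy */
}
```

With GCC, compiling with -std=c99 is another way to get the original declaration accepted in a C source file.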

You might want to try flushing your output streams.
Code:
fflush( outfile );
Other than that, I didn't feel up to rewriting it so it was readable. Your single-indenting is really unreadable.
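
On the flushing suggestion: one possible place for the call, as a sketch, is right after each record is written, for example inside outstor(). The helper write_record below is hypothetical. It flushes after every line so that records already written reach the file even if the program later hangs or is killed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes one record line and flushes it immediately.
   Returns 0 on success, -1 if either the write or the flush fails. */
int write_record(FILE *out, const char *line)
{
    assert(out != NULL && line != NULL);
    if (fputs(line, out) == EOF)
        return -1;
    if (fflush(out) == EOF)   /* push the buffered bytes out to the file */
        return -1;
    return 0;
}
```

In the posted program, outstor() could call write_record(outfile, outdata). Flushing per record costs some speed, which is unlikely to matter for a file this small.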

Quzah.
__________________
Hundreds of thousands of dipshits can't be wrong.


Are you up for the suck?
Old 03-27-2003, 06:29 AM   #3
Registered User
 
Join Date: Mar 2003
Posts: 169
You need to initialise scrdata before using it, otherwise strcat will append to the end of the last record.
Code:
void parse_output(){
    scrdata[0] = 0;
    strcat(scrdata, the_record.arn);
    ...
    ...
    ...
}
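
A small sketch of why the reset matters, using a hypothetical 32-byte buffer named line in place of scrdata. strcat() always starts writing at the first '\0' it finds, so without the reset each record lands after the previous one. In the posted program a full record already fills all 97 bytes of scrdata, so the second record also writes past the end of the array.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the global scrdata buffer. */
char line[32];

/* Buggy variant: appends to whatever is already stored in line. */
void build_without_reset(const char *arn, const char *status)
{
    /* guard for this sketch only: refuse to overrun the small buffer */
    assert(strlen(line) + strlen(arn) + 1 + strlen(status) < sizeof line);
    strcat(line, arn);
    strcat(line, "|");
    strcat(line, status);
}

/* Fixed variant: empties the buffer first, then appends. */
void build_with_reset(const char *arn, const char *status)
{
    line[0] = '\0';
    build_without_reset(arn, status);
}
```

Calling build_without_reset("N87BC", "Prel") and then build_without_reset("CP-188", "Prel") leaves "N87BC|PrelCP-188|Prel" in line, which matches the doubled-up screen output in the first post. A following build_with_reset("HC-BMD", "Fact") leaves only "HC-BMD|Fact".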

Last edited by Scarlet7; 03-27-2003 at 07:01 AM.
Old 03-27-2003, 03:17 PM   #4
Registered User
 
daluu's Avatar
 
Join Date: Dec 2002
Posts: 42
thanks, I will try your suggestions.

Quzah,

If I flush the output file, where should I place that code?

And a few other comments/questions:

Would it be simpler/better to eliminate the record structure & just parse the input into the scrdata array using several int type indexes & then make a modified copy to outdata array?

I learned C++ first, so I was used to declaring int i = 0 within for loops. I didn't know that doesn't work in C. Oh, & I'm using the GCC compiler on Solaris. The source file is named *.cpp

My lousy indenting was to save space; otherwise, the code would be pages longer.
Old 03-27-2003, 03:48 PM   #5
Registered User
 
daluu's Avatar
 
Join Date: Dec 2002
Posts: 42
I forgot to mention that my program hangs when outputting data: it stops before outputting the last record or line. So I would guess the file write was never performed, because the program terminated abnormally & the file output sat in the memory buffer the whole time.

I fixed the screen output error but still need help finding out why the program gets stuck outputting the last record.
Old 03-27-2003, 04:00 PM   #6
End Of Line
 
Hammer's Avatar
 
Join Date: Apr 2002
Posts: 6,240
>>my program hangs
This normally means an infinite loop of some sort. Time for you to get debugging, just add a few printf()'s in appropriate places to help you see where the program is at.
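
As a sketch of what such a debug statement might look like: the hypothetical helper trace_byte below prints the byte offset, the character code and the current field number. Writing to stderr is a reasonable choice, since stderr is typically unbuffered and the line shows up immediately even if the program then hangs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical trace helper: writes one progress line to the given stream
   and returns the number of characters printed (negative on error). */
int trace_byte(FILE *log, long offset, int ch, int fnum)
{
    assert(log != NULL);
    return fprintf(log, "offset=%ld ch=%d field=%d\n", offset, ch, fnum);
}
```

Calling trace_byte(stderr, byte, ch, fnum) as the first statement inside the while loop in main() would show whether the loop keeps running after the last byte of the file.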
__________________
When all else fails, read the instructions.
If you're posting code, use code tags: [code] /* insert code here */ [/code]
Old 03-27-2003, 05:30 PM   #7
Registered User
 
daluu's Avatar
 
Join Date: Dec 2002
Posts: 42
ok, Hammer, I will try that. Though I don't know where I should put the debug statements exactly. Trial & error I guess.

If it is looping infinitely, that's interesting, as I only have a few for loops, which are self-contained, like
Code:
int i;
for(i = 0; i < 96; i++){...}
so the only other loop would be the while loop:
Code:
while((ch=fgetc(infile))!= EOF){...}
I don't suppose the while loop is the problem, is it?
Old 03-27-2003, 05:58 PM   #8
Registered User
 
daluu's Avatar
 
Join Date: Dec 2002
Posts: 42
Many thanks for your hint, Hammer: it was the while loop that looped infinitely. The compiler warned me about that too, but I didn't believe it.

Here's the thing though: the while loop works OK when the byte is retrieved as an int, as I did in a hexdump program. So why does it loop infinitely when the type is changed to unsigned char? Is it the unsigned or the char that causes the problem? I would like to know why.

I fixed the problem by doing this instead:

Code:
unsigned char ch;
while(!feof(infile)){
     ch = fgetc(infile);
     ...
}
Old 03-27-2003, 06:02 PM   #9
End Of Line
 
Hammer's Avatar
 
Join Date: Apr 2002
Posts: 6,240
>>while(!feof(infile))
Nooooooooo
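
The trouble with looping on feof() is that the end-of-file flag is only set after a read has already failed, so the loop body runs one extra time with ch holding EOF rather than real data. A sketch with a hypothetical helper, count_with_feof, that makes the extra pass visible:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts how many times the body of a
   while(!feof(...)) loop runs for the rest of the stream. */
long count_with_feof(FILE *in)
{
    long passes = 0;

    assert(in != NULL);
    while (!feof(in)) {
        int ch = fgetc(in);   /* the last call returns EOF but is still counted */
        (void)ch;
        passes++;
    }
    return passes;
}
```

For a 3-byte file this returns 4, and for an empty file it returns 1, so the parser would be fed one bogus character at the end of the input.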


Change ch to be an int, not an unsigned char, and go back to your original while loop.
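
Why this works, as a sketch: fgetc() returns an int so that it can report every byte value from 0 to 255 plus a separate, negative EOF. Assigning the result to an unsigned char turns EOF into an ordinary byte value (255 on typical systems), which never compares equal to EOF, so the original loop could not end. The helper count_bytes below is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in an open stream. */
long count_bytes(FILE *in)
{
    int ch;        /* int, so EOF stays distinct from the byte 0xFF */
    long n = 0;

    assert(in != NULL);
    while ((ch = fgetc(in)) != EOF)
        n++;
    return n;
}
```

A byte of 0xFF in the file is counted like any other. In the posted program, ch can still be passed to parse_input() as before, since the conversion to unsigned char then happens at the call, after the EOF test.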