You may just want to post your parser. Ideas abound, but it really depends on the design of your parser.
Well, I have a function that grabs the next token (and skips comments). Next_Float uses this function to get a token to process.
Alternatively, why not open the stream in binary mode and use a hand-coded "whitespace" cruncher for newlines that sets a variable to track how many consecutive whitespace characters it consumed? (You could then call fseek, adjusting by that known value.)
I am very interested to know more about this. Do you have some links to resources that deal with this kind of thing?
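For what it's worth, a minimal sketch of the quoted idea might look like the following. It assumes the stream was opened in binary mode ("rb"), where seeking by a byte count relative to SEEK_CUR is well defined. Skip_Whitespace is a hypothetical name used only for illustration, not something from the posted parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper (assumption, not from the posted code).
   Consumes consecutive whitespace bytes from a binary-mode stream,
   counts the '\n' bytes in *lines, and leaves the file position on
   the first non-whitespace byte.
   Returns the number of whitespace bytes consumed, or -1 on error. */
long Skip_Whitespace(FILE *in, int *lines)
{
    long skipped = 0;
    int c; /* int, so EOF can be told apart from a real byte */

    assert(in != NULL && lines != NULL);
    *lines = 0;

    while ((c = fgetc(in)) != EOF && isspace(c)) {
        skipped++;
        if (c == '\n')
            (*lines)++;
    }

    /* Step back over the one non-whitespace byte that ended the loop. */
    if (c != EOF && fseek(in, -1L, SEEK_CUR) != 0)
        return -1L;

    return skipped;
}
```

Because the caller knows exactly how many bytes were skipped, it can later undo the skip with fseek(in, -skipped, SEEK_CUR), plus however many bytes it has read since. In text mode the C standard only guarantees fseek with an offset of zero or a value previously returned by ftell, which is presumably why binary mode was suggested.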
For example, if your "floating point value" element parser isn't supposed to read values that can't be interpreted as a valid floating point value, why does it read into "Test" string in the first place?
Because Next_Token finds the next spaced value and if I use Next_Float when the next token is something like '123.123*', Next_Float would return FAILURE. This is the reason I thought a simple solution would be to move back the characters that are invalid.
Here is the code for Next_Float:
Code:
int Next_Float
(
FILE *Input,
const char *Comment_Prefix,
char *Buffer,
const int Buffer_Size,
float *Target,
int *Skipped_Lines
){
/* Get the next token */
if(Target == NULL || !Next_Token(Input, 1, Comment_Prefix, Buffer, Buffer_Size, Skipped_Lines))
return FAILURE;
/* Temporary character (used during fseek); int so that it can hold EOF */
int Temp;
/* Loop through the token stored in Buffer */
for(int k = strlen(Buffer), i = 0, j = 0;i < k;i++)
/* Check if the characters in the token are digits with up to
one occurrence of a decimal character and one dash or plus character at the start */
if(Buffer[i] < 45 || Buffer[i] > 57 || Buffer[i] == 47 || j > 1 || ((Buffer[i] == '-' || Buffer[i] == '+') && i != 0)){
/* Test for the following cases (where X is any non digit) */
if(i == 0 /* 'X' */
||(i == 1 && (Buffer[0] == '-' || Buffer[0] == '+' || Buffer[0] == '.')) /* '-X', '+X', '.X' */
||(i == 2 && ((Buffer[0] == '-' || Buffer[0] == '+') && Buffer[1] == '.'))){ /* '-.X', '+.X' */
/* Move the file pointer back, the whole token is useless as number */
if((Temp = fgetc(Input)) != EOF && Temp == '\n')
fseek(Input, -k + 1, SEEK_CUR);
else
fseek(Input, -k, SEEK_CUR);
/* Make Buffer null */
Buffer[0] = '\0';
/* Return FAILURE because the 0 through i tokens in Buffer are not numbers */
return FAILURE;
}
/* Move the file pointer back because a portion of the read token is not part of the number */
if((Temp = fgetc(Input)) != EOF && Temp == '\n')
fseek(Input, -k + (i - 1), SEEK_CUR);
else
fseek(Input, -k + (i - 2), SEEK_CUR);
/* Terminate the string at the end of the number */
if(*Skipped_Lines > 0)
printf("-Next_Line-\n");
printf(" The fseek value: %d. Token to be processed: ", - k + (i - 1));
printf(" %s. Next Token: ", Buffer);
Buffer[i] = '\0';
/* Exit the loop (we are done testing the token in the buffer) */
break;
}else if(Buffer[i] == '.')
j++;
/* Convert the valid float token and return */
*Target = atof(Buffer);
return SUCCESS;
}
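As a point of comparison, the "move back the characters that are invalid" idea described above can also be sketched with strtod, which reports where the number ends, combined with ftell and an absolute fseek. This is only a sketch under assumptions: Read_Float_Prefix and TOKEN_MAX are hypothetical names, the stream is assumed to be opened in binary mode, and comment handling is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define TOKEN_MAX 64 /* assumed token limit for this sketch */

/* Hypothetical helper (assumption, not from the posted code).
   Reads the next whitespace-delimited token from a binary-mode stream
   and parses its longest numeric prefix.
   On success returns 1 and leaves the file position just after the
   last character of the number. On failure returns 0 and, if a token
   was found, restores the position to the start of that token. */
int Read_Float_Prefix(FILE *in, float *value)
{
    char token[TOKEN_MAX];
    char *end;
    long start;
    int c;
    double parsed;

    assert(in != NULL && value != NULL);

    /* Remember the position of each byte before reading it, so that
       start ends up pointing at the first non-whitespace byte. */
    do {
        start = ftell(in);
        c = fgetc(in);
    } while (c != EOF && isspace(c));
    if (c == EOF || start < 0)
        return 0;

    /* Go back to the start of the token and read it in one piece. */
    if (fseek(in, start, SEEK_SET) != 0 || fscanf(in, "%63s", token) != 1)
        return 0;

    parsed = strtod(token, &end);
    if (end == token) {
        /* No leading number at all: rewind to the token start. */
        fseek(in, start, SEEK_SET);
        return 0;
    }

    /* Keep the number and put the stream right behind it. */
    *value = (float)parsed;
    return fseek(in, start + (long)(end - token), SEEK_SET) == 0;
}
```

With the input '123.123*' this should store 123.123 and leave '*' as the next character to be read. Note that strtod is more permissive than the character checks in Next_Float: it also accepts exponents, hexadecimal forms, and strings such as "inf" and "nan", so an extra check would be needed if those must be rejected.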
Here is the code for Next_Token:
Code:
int Next_Token
(
FILE *Input,
const int Number_Of_Tokens,
const char *Comment_Prefix,
char *Buffer,
const int Buffer_Size,
int *Skipped_Lines
){
/* Temporary value for a possible token character (int so that EOF can be detected) */
int Possible_Token;
/* Pointer used by strstr to point at a possible comment */
char *Comment_Start;
/* Preprocessed length of Comment_Prefix */
int Comment_Length = strlen(Comment_Prefix);
/* Index of the next free position in Buffer */
int i = 0;
/* Set Skipped_Lines to 0 */
if(Skipped_Lines != NULL)
*Skipped_Lines = 0;
for(int j = Number_Of_Tokens;j > 0;j--){
/* Skip whitespace */
do{
if((Possible_Token = fgetc(Input)) == EOF)
return FAILURE;
/* Increment Skipped_Lines if a new line character is found */
if(Skipped_Lines != NULL && (Possible_Token == 10 || Possible_Token == '\n'))
*Skipped_Lines = *Skipped_Lines + 1;
}while(Possible_Token < 33);
/* Add characters to token until end of file or a white space */
do{
/* Append the current character, leaving room for the terminating null */
if(i < Buffer_Size - 1)
Buffer[i++] = Possible_Token;
else{
/* If the buffer is not large enough, skip to next token and return failure */
SKIP_TOKEN(Input);
return FAILURE;
}
/* Test for a comment */
if(i >= Comment_Length && i < Buffer_Size && strncmp(&Buffer[i - Comment_Length], Comment_Prefix, Comment_Length) == 0){
/* Skip the line and increment Skipped_Lines if it is not null */
SKIP_TO_END_OF_LINE(Input);
/* Set index of the pointer to i */
i -= Comment_Length;
/* If no token was found before the comment, correct j */
if(i < 1 || Buffer[i - 1] == ' ')
j++;
/* Comment means the end of a token, so exit the loop */
break;
}
/* Get the next character. If it is the end of file and the token goal met, return success */
if((Possible_Token = fgetc(Input)) == EOF && j == 1)
return SUCCESS;
}while(Possible_Token > 32);
/* Move the file pointer back in case the last character was a new line */
if(Possible_Token < 32)
ungetc(Possible_Token, Input);
/* If no words were added to Buffer (meaning a comment was found) and the number of tokens to find is greater than one, add a space */
if(j != 1 && i > 0 && i < Buffer_Size && Buffer[i - 1] != ' ')
Buffer[i++] = ' ';
}
/* Terminate the string and return */
Buffer[i] = '\0';
return SUCCESS;
}
EDIT:
Here are the macros I used:
Code:
#define SKIP_LINE(in) fscanf(in, "%*[^\n]%*c")
#define SKIP_TOKEN(in) fscanf(in, "%*[^\n\t ]")
#define SKIP_TO_END_OF_LINE(in) fscanf(in, "%*[^\n]")
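One thing to watch with the scanset macros above: "%*[^\n]" needs at least one matching character, so on an empty line the directive fails, fscanf stops there, and the "%*c" in SKIP_LINE never gets to consume the newline. A minimal fgetc-based alternative could look like this, where Skip_Line is a hypothetical function name and not part of the posted code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical alternative to the SKIP_LINE macro (assumption).
   Consumes characters up to and including the next '\n'.
   Returns '\n' if a newline was consumed, or EOF if the input ended
   first. Works the same way for empty lines. */
int Skip_Line(FILE *in)
{
    int c; /* int, so EOF can be told apart from a real character */

    assert(in != NULL);
    while ((c = fgetc(in)) != EOF && c != '\n')
        ; /* discard everything before the newline */
    return c;
}
```

The return value lets the caller increment a line counter such as Skipped_Lines only when a newline was really consumed.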