> because it seems ok when I tested the codes out.
You're probably OK as long as your input files are well-formed HTML - that is, every < has a matching >.
However, you're likely to come unstuck with a bad HTML file that contains a < with no following > before the end of the file.
Here's my idea for building your buffer.
Code:
#include <stdio.h>

/* stand-in for the real check - just prints what was found between < and > */
void validate ( char *buff ) {
    printf( "%s\n", buff );
}

int main ( void ) {
    char buff[BUFSIZ];
    int i = 0;
    int ch;
    while ( (ch=fgetc(stdin)) != EOF ) {
        if ( ch == '<' ) {
            /* found a <, now find the > */
            while ( (ch=fgetc(stdin)) != EOF && ch != '>' ) {
                /* leave room for the \0; anything past that is dropped */
                if ( i < BUFSIZ - 1 ) {
                    buff[i++] = ch;
                    buff[i] = '\0';
                }
            }
            if ( i > 0 ) {
                validate( buff );
                i = 0;  /* ready for the next one */
            }
        }
    }
    return 0;
}
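If you also want to spot the bad case (a < with no > before end of file), one possible variation is to move the inner loop into a helper that tells you how it ended. This is only a sketch: read_tag() and count_tags() are names I've made up for illustration, and taking a FILE * parameter rather than using stdin directly is my assumption so the same code works on any stream.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the code above.
 * Call it after a '<' has been read from fp. It copies the tag text
 * into buff (at most size-1 characters, always \0 terminated) and
 * returns 1 if the closing '>' was found, or 0 if EOF came first. */
int read_tag ( FILE *fp, char *buff, size_t size ) {
    size_t i = 0;
    int ch;
    while ( (ch=fgetc(fp)) != EOF && ch != '>' ) {
        if ( i + 1 < size ) {
            buff[i++] = (char)ch;
        }
    }
    if ( size > 0 ) {
        buff[i] = '\0';
    }
    return ch == '>';
}

/* Hypothetical driver: returns the number of complete tags on fp and
 * sets *unterminated to 1 if the file ends in the middle of a tag. */
int count_tags ( FILE *fp, int *unterminated ) {
    char buff[BUFSIZ];
    int ch;
    int n = 0;
    *unterminated = 0;
    while ( (ch=fgetc(fp)) != EOF ) {
        if ( ch == '<' ) {
            if ( read_tag( fp, buff, sizeof buff ) ) {
                n++;                /* complete tag, safe to validate */
            } else {
                *unterminated = 1;  /* hit EOF before the > */
            }
        }
    }
    return n;
}
```

In your own main() you would call read_tag( stdin, buff, sizeof buff ) right after seeing the <, and only pass buff to validate() when it returns 1.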
> Use fread() instead.
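fread() has no benefit over fgetc() for reading text files. stdio already buffers the stream, so fgetc() isn't going to the operating system for every character.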
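As a rough illustration (the function names here are made up, and this is a sketch rather than a benchmark), here is the same job, counting < characters, done both ways. The fread() version still has to look at every character, and it needs an extra buffer and loop to do it. For the tag-matching problem it would also have to cope with a tag that starts in one chunk and ends in the next.

```c
#include <stdio.h>

/* Count '<' characters one at a time with fgetc(). */
long count_lt_fgetc ( FILE *fp ) {
    long n = 0;
    int ch;
    while ( (ch=fgetc(fp)) != EOF ) {
        if ( ch == '<' ) {
            n++;
        }
    }
    return n;
}

/* Same job with fread(): read a chunk, then walk through it. */
long count_lt_fread ( FILE *fp ) {
    char chunk[BUFSIZ];
    long n = 0;
    size_t got;
    size_t k;
    while ( (got=fread( chunk, 1, sizeof chunk, fp )) > 0 ) {
        for ( k = 0; k < got; k++ ) {
            if ( chunk[k] == '<' ) {
                n++;
            }
        }
    }
    return n;
}
```

Both functions return the same count for the same input; the fgetc() one is just less code to get wrong.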