What about a buffer-less approach that uses a state machine?
Code:
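while ( (ch = getNextChar()) ) {
    switch ( ch ) {
    case '<':
        tagStringLen = 0;   // start a fresh tag so earlier tags don't pile up
        tagString[tagStringLen++] = ch;
        state = inTag;
        break;
    case '>':
        tagString[tagStringLen++] = ch;
        tagString[tagStringLen] = '\0';
        process( tagString );   // set some state on seeing <test>, clear it on seeing </test>
        state = outTag;
        break;
    // and so on for any other special characters
    default:
        if ( state == inTag )
            tagString[tagStringLen++] = ch;   // collect the characters between '<' and '>'
        break;
    }
}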
That's a simplified view.
More generally, you would compare both 'state' and 'ch' to determine what 'newstate' should be, and perform any additional processing along the way.
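As a rough sketch of that more general form, the transition can be written as a function of the current state and the current character. The names `State`, `step`, and `count_tags` are made up for illustration, and the input comes from a string here rather than from getNextChar():

```c
#include <stddef.h>

/* The two states named in the loop above. */
typedef enum { outTag, inTag } State;

/* Decide the next state from the current state and the character.
   *tagDone is set to 1 when a '>' closes a tag, otherwise to 0. */
State step(State state, char ch, int *tagDone)
{
    *tagDone = 0;
    switch (state) {
    case outTag:
        return (ch == '<') ? inTag : outTag;
    case inTag:
        if (ch == '>') {
            *tagDone = 1;   /* this is where process() would be called */
            return outTag;
        }
        return inTag;
    }
    return state;
}

/* Hypothetical driver: count the complete tags in a NUL-terminated string. */
int count_tags(const char *s)
{
    State state = outTag;
    int count = 0;
    int tagDone;

    for (; *s != '\0'; s++) {
        state = step(state, *s, &tagDone);
        count += tagDone;
    }
    return count;
}
```

Keeping the transition in one function means every (state, ch) pair is handled in one place, so a '>' outside a tag is simply ignored instead of being treated as the end of one.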
Adding detection of, say, comments is pretty easy: it only takes a few more states and transitions.
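For instance, assuming the comments in question use the `<!-- ... -->` form (the question may mean something else), a sketch might look like the following. The state names and the `scan` function are hypothetical, not part of the code above:

```c
#include <stddef.h>

/* Hypothetical states for tags plus <!-- ... --> comments. */
typedef enum {
    sOutTag,          /* plain text */
    sTagOpen,         /* seen "<" */
    sTagBang,         /* seen "<!" */
    sTagBangDash,     /* seen "<!-" */
    sInTag,           /* inside an ordinary tag */
    sInComment,       /* inside a comment body */
    sCommentDash,     /* seen "-" inside a comment */
    sCommentDashDash  /* seen "--" inside a comment */
} ScanState;

/* Count ordinary tags and comments in a NUL-terminated string.
   Anything that looks like a tag inside a comment is ignored. */
void scan(const char *s, int *tags, int *comments)
{
    ScanState state = sOutTag;
    *tags = 0;
    *comments = 0;

    for (; *s != '\0'; s++) {
        char ch = *s;
        switch (state) {
        case sOutTag:
            if (ch == '<') state = sTagOpen;
            break;
        case sTagOpen:
        case sTagBang:
        case sTagBangDash:
        case sInTag:
            if (ch == '>') {
                (*tags)++;                 /* an ordinary tag ends here */
                state = sOutTag;
            } else if (state == sTagOpen && ch == '!') {
                state = sTagBang;
            } else if (state == sTagBang && ch == '-') {
                state = sTagBangDash;
            } else if (state == sTagBangDash && ch == '-') {
                state = sInComment;        /* "<!--" is complete */
            } else {
                state = sInTag;
            }
            break;
        case sInComment:
            if (ch == '-') state = sCommentDash;
            break;
        case sCommentDash:
            state = (ch == '-') ? sCommentDashDash : sInComment;
            break;
        case sCommentDashDash:
            if (ch == '>') {
                (*comments)++;             /* "-->" closes the comment */
                state = sOutTag;
            } else if (ch != '-') {
                state = sInComment;
            }
            break;
        }
    }
}
```

The tag handling is unchanged in spirit; the comment support is just the extra states that track how much of `<!--` or `-->` has been seen so far.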