Yes, it's possible, but not necessary, to go from infix to the tree without any intermediate step. I suggest you do this with recursive descent, because the algorithm is easier to understand. If you have a right-recursive grammar such as
E->T + E | T - E | T
T->F * T | F / T | F
F->digit | (E)
You can write a lexer that will return the next token and a match function that will match a given token, call an error routine if the next token does not match, then scan the next token. At each step, the next token determines which nonterminal you call. If you have ,for instance, the expression "4 + 3$", where '$' is the EOF character, your recursive call sequence looks like as follows:
next_token == 4
E()
T()
F()
digit
match(DIGIT)
next_token == +
F() returns
T() returns
Were're now at E. Because the next token is a "+", we know that the derivation is E->T + E; we then know that we should match('+'), then call E().
match('+')
next_token == 3
E()
T()
F()
match(DIGIT)
next_token = $
F() returns
T() returns
E() returns
You then need to map each nonterminal symbol with a respect function. The function E, corresponding to nonterminal E, would then look like this:
Code:
E* E()
{
T* left = T();
E* right = NULL;
E* node = NULL;
if (next_terminal == '+') {
match('+');
right = E();
node = new_tree(left, right);
} else if (next_terminal == '-') {
match('-');
right = E();
node = new_tree(left, right);
} else if (next_terminal == '$') {
node = left;
} else {
error();
}
return node;
}
You might need to make some changes, however. Both T and E must be derived from a base Node class in order for the nodes to mixed together in the tree. Your evaluation order will be a bit different than usual. "3 + 5 + 6" will be evaluated "3 + (5 + 6)", "3 + (5 / 6)". But if you are able to write the recursive descent code, you should then be ready to make some changes to the grammar.