-
Expression Tree
I have come quite far with writing an interpreter. What I understand
is that I need to identify the code written and put this down like an "Expression Tree" of any kind ?
I have tried to really understand the logic.
I do understand that I have to "break down" the whole statement in someway but I am not sure of what has priority over others.
I will take an example where I get stuck.
Code:
a > a2 && b > b2
&&
(((c > c2 || d > d2)
&&
(e > e2 || f > f2))
||
((g > g2 || h > h2)
&&
(i > i2 || j > j2)))
&&
k > k2 && l > l2
My first question is: When parsing this out to make an "Expression Tree", do you read this from Start to End like you read it or do you find priorities as ((..Longest distance between Equal lot of parantheses ..)) parantheses etc...
If I start out to read from start to end with the example above. The first is easy:
a > a2 &&
b > b2
Then I got problem to understand what to do. I understand when looking at it that Either of the green parts has to be true and this I can identify by finding equal lot of parantheses.
Then if Either of the green areas is true then:
k> k2 &&
l > l2
My second question is that if I only would have statements with && wihout parantheses ex:
k> k2 &&
l > l2
I could save these into a vector vec[1] and run this through my interpreter wich do works fine now.
My question here is how I will put all the statements that in this case are green. Will I use 2D, 3D vector for scenarios like this or how do the procedure work.
I am quite confused.
The mainquestion is how to build an expression tree. Do I have to read from left to right and understand what areas is together (((....))) and what have priorities over others like for math: * has priority over + etc...
Thank You!
-
The first question is: do you know what an expression tree is? If not I would look that up, if I were you. (Also called a syntax tree.) The main thing to note is that it is a tree, and so you will need a tree structure (which vector will not provide).
In terms of how to make them, there's how people do it, and how machines do it. How machines do it is called the "shunting yard algorithm".
-
Yes I know what it is and how it look like but not excatly how it is built up.
I didn´t know the name could be "Syntax Tree" also. I find a bit more on this words when googling and will look it up further.
One question that could be good to have in mind as basic understanding when reading about it.
If vectors cant be used ex:
How will the process work ?
Will for example a for-loop parse the entire code out(wich will be a string) one peace at a time and forward/ run syntax through the interpreter.
Thanks
-
I don't know exactly what your interpreter-so-far expects; but the idea behind the shunting yard algorithm does require knowing order-of-operations (basically the last operation that should happen goes at the top; the left side and right side are then handled recursively). So your loop will build the tree. After that's done, then you can worry about deciding whether the tree evaluates to true or false (which would probably involve feeding things to your interpreter).
-
Thanks tabstop, with you explanation I think I could be on the right track hopefully
It is not very easy though :)
I will read about syntax trees, shunting yard algorithm more and try to find a way to parse it out.
As you said I am feeding things to the interpreter to decide true and false and if one thing is false it could still be true until next criteria using flags etc...
Thanks for your help !
-
The easiest way of doing these things is to first formulate the grammar as (E)BNF, including operator priorities, and then translating that to the format your favourite parser generator (yacc/bison, Boost.Spirit, antlr, ...) expects.
Operator priorities can be built into the grammar by correct grouping. For example, the grammar of basic algebra looks something like this:
Code:
start := add_expression
add_expression := mul_expression
| mul_expression '+' add_expression
| mul_expression '-' add_expression
mul_expression := term
| term '*' mul_expression
| term '/' mul_expression
term := '(' add_expression ')'
| number
This grammar builds a correctly prioritized expression tree (dot before dash, left to right) from any simple algebraic expression: in "1 + 2 * 3" the parse tree is this:
Code:
add_expression
-> mul_expression
-> term
-> number ("1")
-> '+'
-> add_expression
-> mul_expression
-> term
-> number ("2")
-> '*'
-> term
-> number ("3")
This leads to this expression tree:
Code:
'+'
-> 1
-> '*'
-> 2
-> 3
Because expression trees are evaluated depth-first, the multiplication is correctly applied before the addition.