Parsing mathematical function to tree structure

Show 80 post(s) from this thread on one page
Page 1 of 2 12 Last
• 10-02-2004
Roule
Parsing mathematical function to tree structure
Background Banter:
As per salem's advice earlier today ive learnt how to convert from infix to postfix using a stack, and how to then evaluate that RPN expression.

My next question is how to do this using a tree, ive done quite a bit of reading and have figured out how to evaluate an expression already parsed into the tree, but I cant find a resource, book or otherwise documenting how to parse an infix expression into a tree structure. Generally speaking im a little tired and may have missed something but im normally thorough with searching.

Question:

1 - With an algebraic, possibly containing complex arithmeitic, infix expression passed as an input, how do you parse this input expression into a tree structure, operators as nodes, variables as child, in order to then evaluate using a recursion algorithm.

This topic is technically different to the last one I started, although they are linked by commonality of purpose. My thinking was that this is how the admin's would prefer, if I'm wrong please accept my apologies, delete this and I will append to my earlier post.
• 10-02-2004
misplaced
hey...sounds like we're working on close to the same project....i think you're a bit ahead of me though....

http://www.brpreiss.com/books/opus4/html/page262.html

• 10-02-2004
misplaced
i guess i didn't really read your post all that well at first.....

my best solution so far is..(for simplicity i'll only use number constants)
i will use the terms brother for one node child and sister for the other
Code:

```read number - store in root read sign - push down root (sign is in root now, first number is in root's child) read number - store in roots child go to current parent read sign if sign is - or +       create parent (= root)       store sign in new parent       create daughter       read number       store in daughter else       create sister (= child on right side if you were to visualize the tree)       store * or / in new child (sister)       read number       store in newly created child```

something like that...i can find were i wrote it down

the key is when you come across - or + to create a new parent (or push all the other nodes down) .......follow order of operations. the first in the oder should be at the bottom (leaves) of the tree..... parenthisis will be the lowest, then multiplication, then division, then subtraction, then addition will be on top....

[/code]
• 10-02-2004
Sang-drax
Shouldn't be too hard...

Find the operator with the lowest priority, store it as the root node with the left and right expressions as subnodes.
Repeat for both subnodes.
• 10-02-2004
misplaced
you make it sound so easy
• 10-02-2004
Sang-drax
It's not very hard.
One thing that complicates it a little is characters that can act as both unary and binary operators (-).

One way to handle parenthesis is to add priority to every operators within () when tokenizing the expression. If you assign priorities 1-10 to your operators, an operator within n levels of () would have an addtional priority of 11*n.
After this adjustment, the parenthesies are removed.
• 10-02-2004
Roule
Misplaced, do you also have to parse complex arithmetic? Whats the title of your project?

Thanks for that, that pseudo code does look right.

Sang-drax, the way I understand it I can deal with that unary/binaryop duality issue in my scanner/tokeniser.

Back to work, misplaced can share my code with you later on in the week if you like.
• 10-02-2004
okinrus
I think you are trying to use operator precidence parsing. I have never used tried to create this kind of parser but you should be able to visualize the algorithm. That is, you have an expression such as "4 * (4 + 6)" and you represent precidence as
"<<4> * <(<4> + <6>)>>" Of course, it's a little more complicated because most algorithms do this with only one token of lookahead.

If you know grammars, just write a grammar for your expressions, weed out the left recursion, and write a recursive descent parser.

But if you convert an infix expression to reverse-polish notation, then you should be able to create a tree. For instance, if you have (+ 4 (* 3 (+ 6 3))) you recursively eval to create a tree. That is, you create a tree with + as its operator, 4 as its left node and the rerturn of eval(* 3 (+ 6 3)) as its right node.
• 10-02-2004
misplaced
i don't have a name for my project yet, but because i'm bored and have 3-4 more semesters of math courses i decided to write a program to simplify equations and solve them..after i get basics i will add more functionality to it....

i've broken the parsing into 2 parts...the first part seperates the signs from the terms..
the second part will store it in a tree.. i don't know if this is the best way to do it, but so far i've written the first part of parsing algorithm which returns a vector of strings...each operator (-+/*(){}[]) is it's own string and then each term is its own string in the same order. it looks like this
| 3 | + | 5xw | + | 2 | / | ( |c |- | d | ) | (3+5xw+2/(c-d))
| <-one element of vector<string>)-> |

i've come to a stopping point here because i need to finish my tree...or start it even...
but my plan from there, after i get the tree done, is to iterator through the vector and depending on what symbol i come across, i will either move up in the tree or down (like in the psuedo code). i doubt you will want to use it, but next time i'm on the unix side of my system i will send over to windows so i can post it here and you can take a look at it.
i know you probably want the second part, but ...so do i...
• 10-02-2004
misplaced
i'm only in my second semester of programming, so this code may or may not look like a joke to you, but maybe you can look at it and get some ideas (even if they're ideas of what not to do)

Code:

```#include <string> #include <vector> using namespace std; //return true if x is  */-+(){}[]= //[]{}() precedence operators #define isOp(x) (x=='+'||x=='-'||x=='/'||x=='*'||x=='{'||x=='{'||x=='['||x==']'||x=='('||x==')'||x=='='||) //this function accepts a string to be parsed. the function will //then put each part (substring) into a vector and returns the vector. // -+\*[](){}'s will be placed in it own string inside the vector // all other characters will be placed together if not seperated by an operator // a/b+c(dx - c) = |a| / |b| + |c| ( |dx| - |c| ) |  //source is the expression in the form of a string vector<string> Xpress::parse(string source, vector<string> &list) {         string temp("");                        //start with empty buffer         //might want to insert '+' if source[0] is not '-' or '+'                 //loop through source and collect characters in buffer until a operator         //is found. if operator found flush buffer; store operator in buffer; flush;         int length = source.length();         char current;         for(int i = 0; i <= length; i++)         {                 current = source[i];                 if(isOp(current))                //if source[i]==/+-*()[]}or{                 {                         list.push_back(temp);    //insert buffer into vect                         temp = "";                //clear buffer for operator                         temp = temp + current;  //store source[i] into temp                         list.push_back(temp);    //insert operator                         temp = "";                //clear buffer                 }else{                         temp = temp + current;  //append character onto temp                 }                                }                 list.push_back(temp);  //finish the job and clear buffer 1 last time         return list; }```
oh yea, this stores exponents with the terms....exponents represented by ^ will be parsed by a term class, not by the above function.
• 10-03-2004
Roule
I'll have to come back to this thread later misplaced, at the moment not in a situation to comment. Thanks for the help though.
• 10-03-2004
misplaced
is it neccessary for it to go straight from infix to a tree? because after doing some reading it seems MUCH easier converting infix to prefix and then building the tree. I've been working on the infix to tree algorithm all night and it just ends up becoming one big mess.
• 10-03-2004
okinrus
Yes, it's possible, but not necessary, to go from infix to the tree without any intermediate step. I suggest you do this with recursive descent, because the algorithm is easier to understand. If you have a right-recursive grammar such as
E->T + E | T - E | T
T->F * T | F / T | F
F->digit | (E)
You can write a lexer that will return the next token and a match function that will match a given token, call an error routine if the next token does not match, then scan the next token. At each step, the next token determines which nonterminal you call. If you have ,for instance, the expression "4 + 3\$", where '\$' is the EOF character, your recursive call sequence looks like as follows:
next_token == 4
E()
T()
F()
digit
match(DIGIT)
next_token == +
F() returns
T() returns
Were're now at E. Because the next token is a "+", we know that the derivation is E->T + E; we then know that we should match('+'), then call E().
match('+')
next_token == 3
E()
T()
F()
match(DIGIT)
next_token = \$
F() returns
T() returns
E() returns

You then need to map each nonterminal symbol with a respect function. The function E, corresponding to nonterminal E, would then look like this:
Code:

```E* E() {         T* left = T();         E* right = NULL;         E* node = NULL;         if (next_terminal == '+') {               match('+');               right = E();                      node = new_tree(left, right);         } else if (next_terminal == '-') {               match('-');               right = E();               node = new_tree(left, right);         } else if (next_terminal == '\$') {               node = left;         } else {                 error();         }         return node; }```
You might need to make some changes, however. Both T and E must be derived from a base Node class in order for the nodes to mixed together in the tree. Your evaluation order will be a bit different than usual. "3 + 5 + 6" will be evaluated "3 + (5 + 6)", "3 + (5 / 6)". But if you are able to write the recursive descent code, you should then be ready to make some changes to the grammar.
• 10-03-2004
misplaced
hehe..i was asking Roule is it was necessary to use infix->tree...

since my last post i have changed my design a bit.......
the function above will be changed to a stack....and then i will convert to prefix notation....
then build the tree....
• 10-04-2004
Roule
Misplaced, are you doing ece4709 by any chance?
Show 80 post(s) from this thread on one page
Page 1 of 2 12 Last