Hmm, I'll try to explain the algorithm I use. It works perfectly.
First, define every operator that is going to be used and assign each operator a predecence. Remember that ( and ) also are operators and they should have the highest predecence.
You'll also need a Token class, a token can contain a number or an operator (and functions and identifiers for more advanced parsers). A token class may look like this:
Code:
struct Token()
{
enum {TypeOperator,TypeNumber,TypeExpression} Type;
double num;
char op;
queue<Token*> expression;
};
The expression member is explained later.
I'll use this string as an example: "2*(5+6)^2"
The first task is to convert a string to a std::vector<Token>. I won't explain how this is done here, but it's not that hard using istream.
When the token vector is ready, the vector will look like this:
[ 2 * ( 5 + 6 ) ^ 2 ] size = 9
It is entirally made up by Tokens of TypeNumber and TypeOperator.
Now the real parsing begins, we will scan the vector for operator, processing the operator with the highest predecence first. Parenthesis is parsed using recursion.
Pseudo-code:
Code:
queue<Token> MakeRPNExpression(vector<Token> tvector)
{
for predecence = highest to lowest
for index = 0 to size(tvector)
if tvector[index].Type == TypeOperator and
get_predecence(tvector[index].op) == predecence
{
//Process this operator
//Is it parenthesis?
if vector[index].op == ')'
{
//collect every token within the parenthesis
//remeber to consider nested parenthesis!
//call recursivly
queue<Token> newtkns = MakeRPNExpression(subtokens);
//erase the tokens in the parenthesis
//insert a token with the queue
Token inserttkn(newtkns);
tvector.insert( curr_pos, inserttkn);
}
//Another binary operator
else
{
//get operands
Token left = offset-1;
Token right = offset+1;
queue<Token> new_expr;
//push operands
new_expr.push(left);
new_expr.push(right);
//push operator last
new_expr.push(op);
//Replace the operator and operands
//with a single expression token using
//container functions
}
}
//If everything went OK, the vector
//should now contain only one token
//with the expression in it.
return tvector[0].expression;
}
Very esoteric, I know, but I'll show how it works using my example vector.
The vector now looks like this:
{2} {*} {(} {5} {+} {6} {)} {^} {2}
First, the parentheis is found, the tokens inside is converted resursivly to this: { {5} {6} {+} }, which is returned.
After replacement, the vector looks like this:
{2} {*} { {5} {6} {+} } {^} {2}
Note that the nested {} counts as a single token.
The next operator found is ^, the operands and operator are rearranged and after replacement and merging of the queues the vetor looks like this:
{2} {*}
The last operator found is *, the same procedure is followed and the final vector (only one token now) is:
{ {2} {5} {6} {+} {2} {^} {*} }
And we're done.
2*(5+6)^2 is
2 5 6 + 2 ^ * in RPN
I'll attach some real code from my parser, however I don't think it will help much. It is more generic, using a DataType class instead of number, and features binary and tertiary ooperators.
This typedef is used:
Code:
typedef std::queue<Petter::Token> Expression;
I think this is my longesy post in this forum, but also the hardest to understand, but I'm not a teacher, sorry.