With the implementation of our tree data structurecomplete, we now look at an example of how a tree can be used to solvesome real problems. In this section we will look at parse trees. Parsetrees can be used to represent real-world constructions like sentences or mathematical expressions.

Figure 1: A Parse Tree for a Simple Sentence

Figure 1 shows the hierarchical structure of a simplesentence. Representing a sentence as a tree structure allows us to workwith the individual parts of the sentence by using subtrees.

Figure 2: Parse Tree for \(((7+3)*(5-2))\)

We can also represent a mathematical expression such as\(((7 + 3) * (5 - 2))\) as a parse tree, as shown inFigure 2. We have already looked at fully parenthesizedexpressions, so what do we know about this expression? We know thatmultiplication has a higher precedence than either addition orsubtraction. Because of the parentheses, we know that before we can dothe multiplication we must evaluate the parenthesized addition andsubtraction expressions. The hierarchy of the tree helps us understandthe order of evaluation for the whole expression. Before we can evaluatethe top-level multiplication, we must evaluate the addition and thesubtraction in the subtrees. The addition, which is the left subtree,evaluates to 10. The subtraction, which is the right subtree, evaluatesto 3. Using the hierarchical structure of trees, we can simply replacean entire subtree with one node once we have evaluated the expressionsin the children. Applying this replacement procedure gives us thesimplified tree shown in Figure 3.

In the rest of this section we are going to examine parse trees in moredetail. In particular we will look at

- How to build a parse tree from a fully parenthesized mathematicalexpression.
- How to evaluate the expression stored in a parse tree.
- How to recover the original mathematical expression from a parsetree.

The first step in building a parse tree is to break up the expressionstring into a vector of tokens. There are four different kinds of tokensto consider: left parentheses, right parentheses, operators, andoperands. We know that whenever we read a left parenthesis we arestarting a new expression, and hence we should create a new tree tocorrespond to that expression. Conversely, whenever we read a rightparenthesis, we have finished an expression. We also know that operandsare going to be leaf nodes and children of their operators. Finally, weknow that every operator is going to have both a left and a right child.

Using the information from above we can define four rules as follows:

- If the current token is a
`'('`

, add a new node as the left childof the current node, and descend to the left child. - If the current token is in the vector
`['+','-','/','*']`

, set theroot value of the current node to the operator represented by thecurrent token. Add a new node as the right child of the current nodeand descend to the right child. - If the current token is a number, set the root value of the currentnode to the number and return to the parent.
- If the current token is a
`')'`

, go to the parent of the currentnode.

Before writing the C++ code, let’s look at an example of the rulesoutlined above in action. We will use the expression\((3 + (4 * 5))\). We will parse this expression into thefollowing vector of character tokens `['(', '3', '+',`

`'(', '4', '*', '5' ,')',')']`

. Initially we will start out with aparse tree that consists of an empty root node. Figure 4illustrates the structure and contents of the parse tree, as each newtoken is processed.

Figure 4: Tracing Parse Tree Construction

Using Figure 4, let’s walk through the example step bystep:

- Create an empty tree.
- Read ( as the first token. By rule 1, create a new node as the leftchild of the root. Make the current node this new child.
- Read 3 as the next token. By rule 3, set the root value of thecurrent node to 3 and go back up the tree to the parent.
- Read + as the next token. By rule 2, set the root value of thecurrent node to + and add a new node as the right child. The newright child becomes the current node.
- Read a ( as the next token. By rule 1, create a new node as the leftchild of the current node. The new left child becomes the currentnode.
- Read a 4 as the next token. By rule 3, set the value of the currentnode to 4. Make the parent of 4 the current node.
- Read * as the next token. By rule 2, set the root value of thecurrent node to * and create a new right child. The new right childbecomes the current node.
- Read 5 as the next token. By rule 3, set the root value of thecurrent node to 5. Make the parent of 5 the current node.
- Read ) as the next token. By rule 4 we make the parent of * thecurrent node.
- Read ) as the next token. By rule 4 we make the parent of + thecurrent node. At this point there is no parent for + so we are done.

From the example above, it is clear that we need to keep track of thecurrent node as well as the parent of the current node. The treeinterface provides us with a way to get children of a node, through the`getLeftChild`

and `getRightChild`

methods, but how can we keeptrack of the parent? A simple solution to keeping track of parents as wetraverse the tree is to use a stack. Whenever we want to descend to achild of the current node, we first push the current node on the stack.When we want to return to the parent of the current node, we pop theparent off the stack.

Using the rules described above, along with the `Stack`

and`BinaryTree`

operations, we are now ready to write a C++ functionto create a parse tree. The code for our parse tree builder is presentedin ActiveCode 1.

The four rules for building a parse tree are coded as the first fourclauses of the `if`

statement on lines 12, 17,23, and 26 of ActiveCode 1. In each case youcan see that the code implements the rule, as described above, with afew calls to the `BinaryTree`

or `Stack`

methods. The only errorchecking we do in this function is in the `else`

clause where a`ValueError`

exception will be raised if we get a token from the vectorthat we do not recognize.

Now that we have built a parse tree, what can we do with it? As a firstexample, we will write a function to evaluate the parse tree, returningthe numerical result. To write this function, we will make use of thehierarchical nature of the tree. Look back at Figure 2.Recall that we can replace the original tree with the simplified treeshown in Figure 3. This suggests that we can write analgorithm that evaluates a parse tree by recursively evaluating eachsubtree.

As we have done with past recursive algorithms, we will begin the designfor the recursive evaluation function by identifying the base case. Anatural base case for recursive algorithms that operate on trees is tocheck for a leaf node. In a parse tree, the leaf nodes will always beoperands. Since numerical objects like integers and floating pointsrequire no further interpretation, the `evaluate`

function can simplyreturn the value stored in the leaf node. The recursive step that movesthe function toward the base case is to call `evaluate`

on both theleft and the right children of the current node. The recursive calleffectively moves us down the tree, toward a leaf node.

To put the results of the two recursive calls together, we can simplyapply the operator stored in the parent node to the results returnedfrom evaluating both children. In the example from Figure 3we see that the two children of the root evaluate to themselves, namely10 and 3. Applying the multiplication operator gives us a final resultof 30.

The code for a recursive `evaluate`

function is shown inListing 1. First, we obtain references to the left and theright children of the current node. If both the left and right childrenevaluate to `None`

, then we know that the current node is really aleaf node. This check is on line 7. If the current node is nota leaf node, look up the operator in the current node and apply it tothe results from recursively evaluating the left and right children.

To implement the arithmetic, we use a dictionary with the keys `'+', '-', '*'`

, and`'/'`

. The values stored in the dictionary are functions from C++’soperator module. The operator module provides us with the functionalversions of many commonly used operators. When we look up an operator inthe dictionary, the corresponding function object is retrieved. Sincethe retrieved object is a function, we can call it in the usual way`function(param1,param2)`

. So the lookup `opers['+'](2,2)`

isequivalent to `operator.add(2,2)`

.

**Listing 1**

class Operator { public: int add(int x, int y){ return x + y; } int sub(int x, int y){ return x - y; } int mul(int x, int y){ return x * y; } int div(int x, int y){ return x / y; }};int to_int(string str) { stringstream convert(str); int x = 0; convert >> x; return x;}string to_string(int num) { string str; ostringstream convert; convert << num; str = convert.str(); return str;}string evaluate(BinaryTree *parseTree) { Operator Oper; BinaryTree *leftC = parseTree->getLeftChild(); BinaryTree *rightC = parseTree->getRightChild(); if (leftC && rightC) { if (parseTree->getRootVal() == "+") { return to_string(Oper.add(to_int(evaluate(leftC)), to_int(evaluate(rightC)))); } else if (parseTree->getRootVal() == "-") { return to_string(Oper.sub(to_int(evaluate(leftC)), to_int(evaluate(rightC)))); } else if (parseTree->getRootVal() == "*") { return to_string(Oper.mul(to_int(evaluate(leftC)), to_int(evaluate(rightC)))); } else { return to_string(Oper.div(to_int(evaluate(leftC)), to_int(evaluate(rightC)))); } } else { return parseTree->getRootVal(); }}int main(){ return 0;}

def evaluate(parseTree): opers = {'+':operator.add, '-':operator.sub, '*':operator.mul, '/':operator.truediv} leftC = parseTree.getLeftChild() rightC = parseTree.getRightChild() if leftC and rightC: fn = opers[parseTree.getRootVal()] return fn(evaluate(leftC),evaluate(rightC)) else: return parseTree.getRootVal()

Finally, we will trace the `evaluate`

function on the parse tree wecreated in Figure 4. When we first call `evaluate`

, wepass the root of the entire tree as the parameter `parseTree`

. Then weobtain references to the left and right children to make sure theyexist. The recursive call takes place on line 9. We beginby looking up the operator in the root of the tree, which is `'+'`

.The `'+'`

operator maps to the `operator.add`

function call, whichtakes two parameters. As usual for a C++ function call, the firstthing C++ does is to evaluate the parameters that are passed to thefunction. In this case both parameters are recursive function calls toour `evaluate`

function. Using left-to-right evaluation, the firstrecursive call goes to the left. In the first recursive call the`evaluate`

function is given the left subtree. We find that the nodehas no left or right children, so we are in a leaf node. When we are ina leaf node we just return the value stored in the leaf node as theresult of the evaluation. In this case we return the integer 3.

At this point we have one parameter evaluated for our top-level call to`operator.add`

. But we are not done yet. Continuing the left-to-rightevaluation of the parameters, we now make a recursive call to evaluatethe right child of the root. We find that the node has both a left and aright child so we look up the operator stored in this node, `'*'`

, andcall this function using the left and right children as the parameters.At this point you can see that both recursive calls will be to leafnodes, which will evaluate to the integers four and five respectively.With the two parameters evaluated, we return the result of`operator.mul(4,5)`

. At this point we have evaluated the operands forthe top level `'+'`

operator and all that is left to do is finish thecall to `operator.add(3,20)`

. The result of the evaluation of theentire expression tree for \((3 + (4 * 5))\) is 23.

Q-1: Take a moment and draw the parse tree for the expression (2*12/6+3)-17+2*0.

You do not need to write anything here.