Originally Posted by
anduril462
Just for the heck of it, my non-gap version is as follows (also untested...and unoptimized)
It is not enough to test if the immediate children of a node fulfill the BST property; the error may also be between a node and the rightmost child in the left subtree, or between a node and its leftmost child in the right subtree. std10093 had a good example of exactly that:
Originally Posted by
std10093
Code:
| 4
| 2 6
| 1 8 5 7
where 8 > 4, 8 being the rightmost child in 4's left subtree.
"Rightmost child in the left subtree" and "leftmost child in the right subtree" sound complicated and unrelated to the BST propery, but if you think about in-order traversal, they are just the names for the previous and next nodes in order.
Assuming you do not allow duplicated values, in-order traversal makes a much simpler BST order checker function. If duplicate values are allowed, naive in-order traversal would allow trees like
Code:
| 4
| / \
| 2 6
| / \ / \
| 1 4 5 7
which makes locating duplicated values extremely difficult.
You can compensate by handling the case where previous and current values are equal, by verifying that the values are consecutive depth-wise (from current node to the rightmost node in the left subtree being a single "string" of equal values). This allows any distribution strategy for duplicated values (not just "always to left child" or "always to right child"), and all cases where a simple search uncovers all duplicated values correctly.
Let's start by cleaning up the code:
Code:
int ordtree(Treeptr p)
{
int a, b;
return (p == NULL || fun(p, &a, &b));
}
int fun(Treeptr p, /* GAP A */)
{
int a, b;
if (p->left == NULL && /* GAP B */) {
*ap = p->value;
/* GAP C */
}
if (p->right == NULL) {
*bp = p->value;
return (fun(/* GAP D */) && p->value > b);
}
if (/* GAP E */) {
/* GAP F */
return (fun(p->right, &a, bp) && /* GAP G */);
}
return (/* GAP H */);
}
Gap A is determined by the call in ordtree() and the fact that the function body references int pointers ap and bp. Let's assume they're listed in that order, although you might wish to make a mental note about it, if it turns out to not make sense. Because a and b are not initialized or checked by ordtree(), there really is no hard rule for the order. The single recursive call between gaps F and G indicates they likely are in this order.
The first if clause cannot refer to *ap or *bp in gap B, because the entire left subtree of the tree could be empty; both *ap and *bp have uninitialized values in that case. Therefore, the middle if clause is run only if there is a left subtree. Since this node could be the rightmost child, *bp most likely refers to the value of the rightmost child in the left subtree, i.e. the value immediately preceding the value of the current node in in-order traversal. Since the left subtree has all nodes left to the current node, the recursive call must use &b instead of bp; this takes care of gap D.
It does not look like the code can correctly handle duplicated values. For search efficiency, you'll want the duplicated values to be in depth-wise consecutive nodes. Allowing duplicated values here would allow the difficult trees I mentioned earlier. Therefore, the value comparisons must be '<' and '>', rejecting duplicate values.
At this point the second and third if clauses are symmetrical -- second if clause handling the case where current node has no right subtree, and the third if clause handling the case where current node has no left subtree. Therefore, we can fill in gaps E, F, and G, based on the symmetry of the situation alone.
The first if clause must be for the leaf node case, and based on the other two if clause symmetries, should update both *ap and *bp. No tests or recursion are possible, so the body should return with a success. That takes care of gaps B and C.
This leaves gap H. At this point, none of the if clauses matched, so both a and b are uninitialized. No recursion has been done either, so the gap must contain two recursive calls (one to p->left, one to p->right). *ap is irrelevant to the left subtree, and *bp to the right subtree, so we can assume &a and &b are used instead. Based on this, the return clause must be a simple AND combination of the two preceding ones, which makes logical sense too.
We arrive at
Code:
int ordtree(Treeptr p)
{
int a, b;
return (p == NULL || fun(p, &a, &b));
}
int fun(Treeptr p, int *const ap, int *const bp)
{
int a, b;
if (p->left == NULL && p->right == NULL) {
*ap = p->value;
*bp = p->value;
return 1;
}
if (p->right == NULL) {
*bp = p->value;
return (fun(p->left, ap, &b) && p->value > b);
}
if (p->left == NULL) {
*ap = p->value;
return (fun(p->right, &a, bp) && p->value < a);
}
return (fun(p->left, ap, &b) && p->value > b) &&
(fun(p->right, &a, bp) && p->value < a);
}
I haven't bothered checking the above in any way, so I could be utterly wrong. The logic seems sound to me, though.