Thread: Function to determine if a tree is sorted (i.e. BST)

  1. #16
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by anduril462 View Post
    Just for the heck of it, my non-gap version is as follows (also untested...and unoptimized)
    It is not enough to test if the immediate children of a node fulfill the BST property; the error may also be between a node and the rightmost child in the left subtree, or between a node and its leftmost child in the right subtree. std10093 had a good example of exactly that:
    Quote Originally Posted by std10093 View Post
    Code:
    |        4
    |     2      6
    |   1   8  5   7
    where 8 > 4, 8 being the rightmost child in 4's left subtree.

    "Rightmost child in the left subtree" and "leftmost child in the right subtree" sound complicated and unrelated to the BST propery, but if you think about in-order traversal, they are just the names for the previous and next nodes in order.

    Assuming you do not allow duplicated values, in-order traversal makes a much simpler BST order checker function. If duplicate values are allowed, naive in-order traversal would allow trees like
    Code:
    |    4
    |   / \  
    |  2   6
    | / \ / \
    | 1 4 5 7
    which makes locating duplicated values extremely difficult.

    You can compensate by handling the case where previous and current values are equal, by verifying that the values are consecutive depth-wise (from current node to the rightmost node in the left subtree being a single "string" of equal values). This allows any distribution strategy for duplicated values (not just "always to left child" or "always to right child"), and all cases where a simple search uncovers all duplicated values correctly.



    Let's start by cleaning up the code:
    Code:
    int ordtree(Treeptr p)
    {
        int a, b;
        return (p == NULL || fun(p, &a, &b));
    }
    
    int fun(Treeptr p, /* GAP A */)
    {
        int a, b;
    
        if (p->left == NULL && /* GAP B */) {
            *ap = p->value;
            /* GAP C */
        }
    
        if (p->right == NULL) {
            *bp = p->value;
            return (fun(/* GAP D */) && p->value > b);
        }
    
        if (/* GAP E */) {
            /* GAP F */
            return (fun(p->right, &a, bp) && /* GAP G */);
        }
    
        return (/* GAP H */);
    }
    Gap A is determined by the call in ordtree() and the fact that the function body references int pointers ap and bp. Let's assume they're listed in that order, although you might wish to make a mental note about it, if it turns out to not make sense. Because a and b are not initialized or checked by ordtree(), there really is no hard rule for the order. The single recursive call between gaps F and G indicates they likely are in this order.

    The first if clause cannot refer to *ap or *bp in gap B, because the entire left subtree of the tree could be empty; both *ap and *bp have uninitialized values in that case. Therefore, the middle if clause is run only if there is a left subtree. Since this node could be the rightmost child, *bp most likely refers to the value of the rightmost child in the left subtree, i.e. the value immediately preceding the value of the current node in in-order traversal. Since the left subtree has all nodes left to the current node, the recursive call must use &b instead of bp; this takes care of gap D.

    It does not look like the code can correctly handle duplicated values. For search efficiency, you'll want the duplicated values to be in depth-wise consecutive nodes. Allowing duplicated values here would allow the difficult trees I mentioned earlier. Therefore, the value comparisons must be '<' and '>', rejecting duplicate values.

    At this point the second and third if clauses are symmetrical -- second if clause handling the case where current node has no right subtree, and the third if clause handling the case where current node has no left subtree. Therefore, we can fill in gaps E, F, and G, based on the symmetry of the situation alone.

    The first if clause must be for the leaf node case, and based on the other two if clause symmetries, should update both *ap and *bp. No tests or recursion are possible, so the body should return with a success. That takes care of gaps B and C.

    This leaves gap H. At this point, none of the if clauses matched, so both a and b are uninitialized. No recursion has been done either, so the gap must contain two recursive calls (one to p->left, one to p->right). *ap is irrelevant to the left subtree, and *bp to the right subtree, so we can assume &a and &b are used instead. Based on this, the return clause must be a simple AND combination of the two preceding ones, which makes logical sense too.

    We arrive at
    Code:
    int ordtree(Treeptr p)
    {
        int a, b;
        return (p == NULL || fun(p, &a, &b));
    }
    
    int fun(Treeptr p, int *const ap, int *const bp)
    {
        int a, b;
    
        if (p->left == NULL && p->right == NULL) {
            *ap = p->value;
            *bp = p->value;
            return 1;
        }
    
        if (p->right == NULL) {
            *bp = p->value;
            return (fun(p->left, ap, &b) && p->value > b);
        }
    
        if (p->left == NULL) {
            *ap = p->value;
            return (fun(p->right, &a, bp) && p->value < a);
        }
    
        return (fun(p->left, ap, &b)  && p->value > b) &&
               (fun(p->right, &a, bp) && p->value < a);
    }
    I haven't bothered checking the above in any way, so I could be utterly wrong. The logic seems sound to me, though.

  2. #17
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    The version you posted matches the other attempts. The most difficult part for me was GAP C, because the thread agrees that it needs two commands, an assignment and a return and somehow I felt that a gap should be filled by a command at a time, but clearly you need two commands here.

    > duplicate values
    There must be no duplicate nodes. Source.

    As a matter of fact, the cose I used for testing, doesn't allow duplicate values too.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Sorted binary tree
    By std10093 in forum C Programming
    Replies: 6
    Last Post: 08-29-2014, 10:07 AM
  2. determine if tree is strictly binary bst
    By ashir30000 in forum C++ Programming
    Replies: 3
    Last Post: 02-25-2010, 01:51 PM
  3. Inserting value to a sorted binary tree?
    By HappySmileMan in forum C Programming
    Replies: 3
    Last Post: 02-15-2009, 09:34 AM
  4. Function to Determine if Int is Prime
    By Adrian in forum C Programming
    Replies: 7
    Last Post: 11-22-2007, 08:21 AM
  5. Building a Binary Search Tree off sorted data
    By BlackCitan in forum C Programming
    Replies: 0
    Last Post: 12-01-2003, 09:52 PM