Thread: Function to determine if a tree is sorted (i.e. BST)

  1. #1
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694

    Function to determine if a tree is sorted (i.e. BST)

    I used this code to perform this kind of operation.

    Code:
    int BST(Treeptr p, int MIN, int MAX);
    
    int ordtree_old(Treeptr p) {
      /* BST = Binary Search Tree.
       * You can name your function as you wish.
       * INT_MIN is the minimum value an integer
       * can take. Imagine it as -infinity.
       * We use -oo, to actually do what we did before,
       * just by eyeballing the tree, to see if all the
       * values are OK.
       */
      return BST(p, INT_MIN, INT_MAX);
    }
    
    int BST(Treeptr p, int MIN, int MAX) {
      /* Function was called from some leaf, so p
       * is NULL. We should place that here, because
       * we do not want to process further a NULL pointer.
       * I mean what if we do p.value? That would be an
       * error.
       */
      if (p == NULL)
        return 1;
    
      if (
      /* current value is greater than the min value */
      p->value > MIN
      /* current value is less than the max value */
      && p->value < MAX
      /* We set the left child as the root, MIN does not
       * need to be updated. MAX is the minimum value between
       * current value and MAX. */
      && BST(p->left, MIN, min(p->value, MAX))
      /* We set the right child as the root, MAX does not
       * need to be updated. MIN is the maximum value between
       * current value and MIN. */
      && BST(p->right, max(p->value, MIN), MAX))
        return 1;
      else
        return 0;
    }
    However, on the test of a course, I saw that it was requested to do this with this "code" (.....GAP A.....) and so on are the parts needed to be filled):
    Code:
    int ordtree(Treeptr p)
    { int a, b;
    return (p == NULL || fun(p, &a, &b)); }
    int fun(Treeptr p, .....GAP A.....)
    { int a, b;
    if (p->left == NULL && .....GAP B.....) {
    *ap = p->value;
    .....GAP C..... }
    if (p->right == NULL) {
    *bp = p->value;
    return (fun(.....GAP D.....) && p->value > b); }
    if (.....GAP E.....) {
    .....GAP F.....
    return (fun(p->right, &a, bp) && GAP .....G.....); }
    return (.....GAP H.....); }
    I got very confused with the variables a and b inside fun(). I tried some things and here is my last attempt:

    Code:
    int fun(Treeptr p, int* ap, int* bp)
    { int a, b;
      if (p->left == NULL && .....GAP B.....) {
        *ap = p->value;
        .....GAP C.....}
      if (p->right == NULL) {
        *bp = p->value;
        return (fun(p->left, ap, &b) && p->value > b);}
      if (p->value < *ap) {
        a = *ap;
        return (fun(p->right, &a, bp) && p->value < a);}
      return 0;
    }
    However I feel it's not correct. I would like some guidance here.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Did you actually try to test either of your solutions (your original or the "GAP" version)? Have you identified in which cases this does/doesn't work? Have you run your code through a debugger, or loaded it with debug prints to step through it? Can you provide sample trees for which it doesn't work? That should be your first step.

    Whip up a quick binary tree -- not necessarily BST since you want to be able to create out-of-order data -- and run examples that you know are sorted/unsorted through your function. Start easy with single-node trees (always sorted), then 2-node trees (4 cases: left/right child smaller/larger) and then 3-node trees. Once you get 3-node trees working, you should be just about able to handle anything. The beauty of binary trees is that, the left- and right-sub-trees are generally* just smaller binary trees, and this type of problem breakdown lends well to simple recursive solutions.

    If you're still stuck after testing and debugging your code, then post back with more specifics on your problem, like sample input that doesn't work, and enough code for us to compile and test ourselves.

    * I say "generally". Some specialized binary trees, like the Red-Black tree aren't quite this simple, but Red-Black trees are in many ways more like a B-tree.

  3. #3
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    Yes, the first solutions works. I use this code, modified to host an integer and not a word.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  4. #4
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    After more testing, it seems that the first solution is wrong too. I am now working on the second one however.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  5. #5
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    I find the "GAP" version stupid. It uses unnecessary paramaters and variables that just confuse things. My recursive solution only takes a single tree node, has basically the same complexity, and is WAY more readable.

    Start with working on a less convoluted solution. Maybe stop coding for a bit and get the paper and pencil (if you haven't yet). Start writing out cases
    Code:
    // trivial cases first
    empty tree: explicitly sorted
    single-node tree: explicitly sorted
    // expanding on that...
    two-level tree: if left child exists and is < root AND if right child exists and is > root
    // now you worry about sub-trees
    multi-level tree: same as two-level tree, plus ...
    Now start massaging that into pseudo code. Set up your base case for recursion. Use lots of extra variables if you need to, flags to track whether children exist, are ordered, if subtrees are ordered, etc.
    Code:
    if root is null
        return true
    if no children  // this may get "absorbed" into other blocks of code later, but we'll flesh everything out first, then see what we can optimize
        return true
    // maybe set some flags, like left_is_ordered and right_is_ordered
    Once you have the pseudo code worked out, you can post it back here and somebody can check it, or you can jump right into implementation (and of course, post that if you need help).

    Then, once you have it all sorted out, and understand what all the cases you need to check are, then you can start worrying about filling in the "GAP" version.

  6. #6
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    Quote Originally Posted by anduril462 View Post
    I find the "GAP" version stupid. It uses unnecessary paramaters and variables that just confuse things.
    That's what I am saying!

    I think something like this should work:

    Code:
    int fun(Treeptr p, int * ap, int * bp) {
      int a, b;
      if (p->left == NULL && p->right == NULL) {
        *ap = p->value;
        return 1;
      }
      if (p->right == NULL) {
        *bp = p->value;
        return (fun(p->left, &b, ap) && p->value > b);
      }
      if (p->left == NULL) {
        *ap = p->value;
        return (fun(p->right, &a, bp) && p->value < a);
      }
      return (fun(p->right, &a, bp) && fun(p->left, &b, ap));
    }
    However, the code I provided above for the treemanagement.c seems to fail to build a BST, or maybe the printing is not accurate, so I am testing it further in paper. Thanks anduril462.

    If someone can verify that the above is correct or not, (s)he is welcomed.

    [EDIT]
    It appears that the printing was done horizontally and not vertically. So I could test the function and it seems OK.
    Last edited by std10093; 10-03-2014 at 08:08 AM.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  7. #7
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    The last attempt is wrong. It doesn't check the values in a perfect tree. For example,

    Code:
            4
         2      6
       1   8  5   7
    will be considered sorted, and it is not!
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  8. #8
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    So I played around with the "gap" version for a bit. I came up with something quite similar to what you had in post #6. I think the only difference was line 9.

    It really helped me when I started putting comments on everything, to explain it in English (I also changed fun to check_balance):
    Code:
    // checks the balance of the tree with root at p, where
    // ap is an output parameter and points to ..........
    // bp is an output parameter and points to ..........
    int check_balance(Treeptr p, int *ap, int *bp)
    {
        int a, b;
    
    
        // no left child and no right child
        if (p->left == NULL && p->right == NULL)
        {
            *ap = p->value;  // given
            *bp = p->value;  // I think this is irrelevant, but not wrong (though it's unconfirmed)
            return 1;  // a node with no children is sorted
        }
    
    
        // left child only
        if (p->right == NULL)
        {
            // recall that bp is the 3rd parameter to this func, regardless of whether &a, &b, ap or bp was passed in by the caller
            *bp = p->value;  // bp stores current node's value for caller
    
            // return true if the left subtree is sorted and the current node is greater than b
            // so b must contain the value of the left child of p, thus it must be passed into (pick ap or bp)
            return (fun(.....GAP D.....) && p->value > b);
        }
    
        ...
    }
    Do that for everything, and you'll figure out what each part is doing. Then, use the fact that if (p->right == NULL) seems to be analogous to the GAP E if block to fill in GAP D, F and G.

  9. #9
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    I don't see what you changed, sorry. I mean, line 9 from my attempt is pretty similar to yours.

    EDIT

    I could keep my last attemt and change the last return with this:

    Code:
      return (fun(p->right, &a, bp) && fun(p->left, ap, &b) && a > p->value && p->value > b);
    Now it seems correct!
    Last edited by std10093; 10-03-2014 at 02:18 PM.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  10. #10
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    I was thinking you may have passed the wrong parameters to fun on line 9. I could be wrong though, I didn't look at it as closely as I should have and haven't tested it.

    Also yes, you needed the a > and > b parts on the last return.

  11. #11
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    I tested it it seems OK. Moreover, we want the right child to be more than the parent and the left child to be less, so I feel I am on the right track.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  12. #12
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    I still think you have the 2nd and 3rd parameters backwards on line 9 -- you might have missed this in your tests if you don't have enough variety of test cases. My version is:
    Code:
    return (check_balance(p->left, ap, &b) && p->value > b);
    This is all based on my static analysis and no testing, but I feel fairly good about my understanding of the code. In particular, it's the purpose of a and b, and ap and bp, that is important here. From what I can tell, at any level in the recursion, a will store the smallest value in the right subtree (i.e. the smallest value that is larger than the current node), and b the largest value in the left subtree (largest value that is smaller than the current node). If you have a different interpretation, I'd love to hear it.

    To explain my position, we have to look at the known-good code, i.e. the parts that were given. Let's start with
    Code:
    return (fun(p->right, &a, bp) && GAP .....G.....);
    At this level of the tree, our local copy of a will be passed in by reference as we descend to our right sub-tree. That tells me that a will contain information about the right subtree. It could be the immediate right child, or it could be some other node in the tree, or "metadata" like height of said subtree.

    So let's take the analogous case when we have no right child.
    Code:
    return (fun(.....GAP D.....) && p->value > b);
    Whatever GAP D is, we know that b must relate to the value of some node in the left subtree, since p->value must be greater than it. Furthermore, since we don't see b being set anywhere explicitly, so it must be set by passing it by value as part of GAP D. So do we pass it as the 2nd param (into ap) or the 3rd param (into bp)?

    So let's take what we know about filling *ap and *bp with values
    Code:
    if (p->left == NULL && .....GAP B.....) {  // if no left child and some other condition
        *ap = p->value;  // store the current node's value in *ap, i.e. pass it back to the caller
        .....GAP C.....  // do some other stuff
    }
    if (p->right == NULL) {  // if we have no right child
        *bp = p->value;  // store the current node's value in *bp, i.e. pass it back to the caller
    }
    If there is no right subtree, the the current node is the largest of it's subtree, since everything to the left should be smaller.

    Note that, based on the comparison p->value > b, the temptation to simply say "a store the right child's value and b the left child's value" is not a valid conclusion. To do that, the caller would need some sort of information as to whether it is it's parent's left or right child, to know whether to populate ap or bp. But we can see that it doesn't have that knowledge, it only knows about it's descendants, not it's ancestors. So the only thing it can say is "I'm the biggest node I know of" (nothing smaller to it's left), "I'm the smallest node I know of" (nothing bigger to it's right), or "I'm the biggest and smallest (aka the only) node I know of" (no children on either side).

    Makes sense to me, maybe I'm crazy, maybe you have a better explanation.

  13. #13
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    No I am not saying you are crazy! I see your point. So, can you present your version of the function to take a look at it?
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  14. #14
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    I'm stressing heavily the fact that this is untested. It's identical to yours except the ordering of parameters on "line 9" and an extra assignment in the "no child" case:
    Code:
    int ordtree(Treeptr p)
    {
        int a, b;
    
    
        return (p == NULL || check_balance(p, &a, &b)); 
    }
    
    
    int check_balance(Treeptr p, int *ap, int *bp)
    {
        int a, b;
    
    
        if (p->left == NULL && p->right == NULL)
        {
            *ap = p->value;
            *bp = p->value;
            return 1;
        }
    
    
        if (p->right == NULL) 
        {
            *bp = p->value;
            return (check_balance(p->left, ap, &b) && p->value > b); 
        }
    
    
        if (p->left == NULL)
        {
            *ap = p->value;
            return (check_balance(p->right, &a, bp) && p->value < a);
        }
    
    
        return (check_balance(p->left, ap, &b) && p->value > b &&
                check_balance(p->right, &a, bp) && p->value < a);
    }
    Just for the heck of it, my non-gap version is as follows (also untested...and unoptimized):
    Code:
    bool tree_sorted(tree *root)
    {
        if (root == NULL) {
            return true;
        }
    
    
        bool is_sorted = true;
    
    
        if (root->left != NULL) {
            if (root->left->data >= root->data) {
                is_sorted = false;
            } else if (!tree_sorted(root->left)) {
                is_sorted = false;
            }
        }
        if (root->right != NULL) {
            if (root->right->data <= root->data) {
                is_sorted = false;
            } else if (!tree_sorted(root->right)) {
                is_sorted = false;
            }
        }
    
    
        return is_sorted;
    }

  15. #15
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,694
    I checked the first piece of code you have. I see only the different order in the last return and not an extra assignment....oh, I now saw that I hadn't let you know that I had that extra assignment in my version too! Damn, how did I forget that? So maybe, the order is not a problem, but rather an elegant arrangement in this case.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Sorted binary tree
    By std10093 in forum C Programming
    Replies: 6
    Last Post: 08-29-2014, 10:07 AM
  2. determine if tree is strictly binary bst
    By ashir30000 in forum C++ Programming
    Replies: 3
    Last Post: 02-25-2010, 01:51 PM
  3. Inserting value to a sorted binary tree?
    By HappySmileMan in forum C Programming
    Replies: 3
    Last Post: 02-15-2009, 09:34 AM
  4. Function to Determine if Int is Prime
    By Adrian in forum C Programming
    Replies: 7
    Last Post: 11-22-2007, 08:21 AM
  5. Building a Binary Search Tree off sorted data
    By BlackCitan in forum C Programming
    Replies: 0
    Last Post: 12-01-2003, 09:52 PM