How many structurally unique combinations of binary tree given N values

**Vespasian** · 11-04-2012

Can anyone please help me with this, I cannot understand the solution to this problem, I really have been looking at this with a nine mile stare and the combination of recursion and for loop is devastating my brain. Is there a step by step explanation that one forum user can explain to me that walks me through this dawnting code?

Suppose you are building an N node binary search tree with the values 1..N. How many structurally different
binary search trees are there that store those values? Write a recursive function that, given the number of distinct
values, computes the number of structurally unique binary search trees that store those values. For example, countTrees(4) should return 14, since there are 14 structurally unique binary search trees that store 1, 2, 3, and 4.

Code:

/*
For the key values 1...numKeys, how many structurally unique
binary search trees are possible that store those keys.
Strategy: consider that each value could be the root.
Recursively find the size of the left and right subtrees.
*/

int countTrees(int numKeys) {
if (numKeys <=1) {
   return(1);
   }
else {
   // there will be one value at the root, with whatever remains
   // on the left and right each forming their own subtrees.
   // Iterate through all the values that could be the root...
   int sum = 0;
   int left, right, root;
   for (root=1; root<=numKeys; root++) {
      left = countTrees(root - 1);
      right = countTrees(numKeys - root);
      // number of possible trees with this root == left*right
      sum += left*right;
   }
   return(sum);
   }
}

**stahta01** · 11-04-2012

Someone tries to answer it here; no idea if they succeeded or not.

How many binary tree can be fo... | CareerCup

Tim S.

**Nominal Animal** · 11-04-2012

The logic is actually quite simple.

Each binary tree structure is defined by the sizes of subtrees at each node.

The values at the nodes are irrelevant.

Consider a binary tree with six values: ◯ ◯ ◯ ◯ ◯ ◯. The overall structure of the tree depends on how many of them are on the left side of the root, and how many on the right side.

Say you pick three to the left subtree. Since one is at root, then you'll have two to put into the right subtree. Using numbers for the number of nodes in subtrees, you get

Code:

   ◯
  ╱ ╲
 3   2

That defined just the overall structure. To get a definite structure, you need to define the shape of the rest of the tree. So, you recurse into each subtree, choosing how many of the leftover nodes go to the left subtree (and therefore how many to the right) at each node.

If you pick "one to left" for the left subtree, and "zero to left" for the right subtree, you end up with

Code:

    ◯
   ╱ ╲
  ◯   ◯ 
 ╱ ╲   ╲
◯   ◯   ◯

This is one of the possible structures for a binary tree containing N=6 values.

Let's concentrate on what happens at each node.

Say you have n nodes to choose from. One of them must be the current node, so you have n-1 to put into the left and right subtrees. Because you can put anywhere from 0 to n-1 into the left subtree, you have n possible choices.

To count the number of structures possible for a binary tree of this size, you loop over i = 0 .. n-1, and find out the number of structures for the left subtree, L_i=countTrees(i) and the number of structures for the right subtree, R_i=countTrees(n-i-1) . Since each left side structure can be paired with each right side structure, the number of structures for each i is L_i·R_i .

(If you loop over i = 1 .. n, then L_i=countTrees(i-1) and R_i=countTrees(n-i).)

Let's look at actual C code at hand.

Code:

int countTrees(int numKeys)
{
    if (numKeys <= 1) {
        return 1;
    } else ..{

An empty subtree (zero nodes) has only one possible structure. A solitary node has likewise exactly one possible structure. Since a subtree with zero or one nodes has only one possible structure, the function will return one for those cases.

Code:

        int sum = 0;
        int left, right, root;

Above, sum is the total number of possible structures, left is the number of possible structures for each left subtree, right is the number of possible structures for each right subtree, and root is the loop index variable.

Code:

        for (root=1; root <= numKeys; root++) {
            left = countTrees(root - 1);
            right = countTrees(numKeys - root);

The above loop is written so that root is one plus the number of nodes in the left subtree. The point is that it will do numKeys iterations, just like I said earlier.

Code:

            sum += left*right;

Remember, the number of combinations for a tree with a specific number of nodes in the left and right subtrees, is the number of combinations in the left subtree multiplied with the number of combinations in the right subtree, because each left subtree structure can be paired with each right subtree structure.

Code:

        }

        return sum;
    }
}

After the loop, the function simply returns the sum of the number of possible structures.

If you were to write the above function mathematically, you'd get something like

Code:

  N₀ = 1
  N₁ = 1
  N_n = sum( N_i-1 N_n-i, i = 1..n)

Adjust the indexing to

Code:

⇔ N_n+1 = sum( N_i-1 N_n+1-i, i = 1..n+1)
⇔ N_n+1 = sum( N_i N_n-i, i = 0..n)

and you get the recurrence relation for Catalan numbers:

Code:

  C_n+1 = sum( C_i C_n-i, i = 0..n)

This is why the countTrees(numKeys) function actually just computes the numKeys'th Catalan number.

Because Catalan numbers also satisfy

Code:

  C_n+1 = C_n (4n + 2)/(n + 2)

you could rewrite the function as

Code:

int countTrees(const int numKeys)
{
    int  total = 1;
    int  n = 1;

    /* Return the numKeys'th Catalan number. */
    while (n < numKeys) {
        total = (4*n + 2) * total / (n + 2);
        n++;
    }

    return total;
}

This is an excellent example of proper optimization: not at the code level, but at the algorithmic level. This version is easy to implement using just about any bignum library, giving you the result as a very large integer, and works even when n is large.

As it is, this only works for numKeys <= 19 (assuming 32-bit ints); up to numKeys <= 33 if you use uint64_t instead. (The first 30 are listed here, so you could just put them into an array if you only need it for small numKeys.)

Of course, the final value can be calculated directly using binomial coefficients,

Code:

    C_n = ( (2n)! ) / ( (n+1)! n! )

i.e.

Code:

double catalan(const double n)
{
    return tgamma(2.0*n + 1.0) / tgamma(n + 2.0) / tgamma(n + 1.0);
}

but the dividend and divisor are very large numbers, and even double-precision floating-point numbers overflowing at n=86, and only n=0..26 are correct when rounded to the nearest integer, limited precision affecting the larger results.

Fun stuff.

Thread: How many structurally unique combinations of binary tree given N values

Thread Tools

Search Thread

Display

How many structurally unique combinations of binary tree given N values

Similar Threads

Adding node values of binary tree

Unique combinations of groups in C++ w/ recursion

creating combinations of array values - recursion??

can anyone teach me how to convert a general tree to binary tree and vice versa?

using structures to assign unique values to words?