![]() |
| | #1 |
| Code Goddess Join Date: Sep 2001
Posts: 9,661
| Request for comments ![]() Binary Search Trees Arrays are useful in that they allow quick access to any item in the data structure, but they fall flat for insertion and deletion of an item anywhere except at the tail end. Linked lists solve this problem by removing the restriction that every item be contiguous in memory, thus introducing the concept if a linked data structure. With a linked list you can insert and delete items anywhere in the list with relative ease, but then the problem of efficient searching arises. With arrays you could at least sort the data and use a binary search to find an item in logarithmic time, but the binary search algorithm requires random access to the data structure. A linked list does not support random access, therefore binary search is not practical. Instead of being restricted to linear search times and severely limiting the usefulness of linked data structures, a linked data structure corresponding to the binary search can be used. These data structures are called, not surprisingly, binary search trees. The idea behind a binary search tree is that the median value is the head of the data structure, the item you begin searches with. This item then links to two others, a "left" link, and a "right" link, both of which follow the same structure. The resulting representation looks like this: Code: Diagram 1:
5
3 7
2 4 6 8
Definition 1: A binary tree is a node with links to two subtrees, called the "left" subtree and "right" subtree, respectively. Definition 2: A binary search tree is a binary tree whose left subtree contains values less than itself, and whose right subtree contains values greater than itself. A tree node is similar to a double linked list node in that it contains at least a data item, and two links: Code: Source 1:
struct dlnode {
struct dlnode *prev;
struct dlnode *next;
int data;
};
struct tree_node {
struct tree_node *left;
struct tree_node *right;
int data;
};
Code: Source 2:
struct pretree_node {
struct pretree_node *link[2];
int data;
};
struct pretree {
struct pretree_node *header;
size_t size;
};
Searching As the name implies, binary search trees are designed to accelerate the searching process, so it stands to reason that we should examine the searching algorithm first. Searching a binary search tree is straightforward: Examine the value of a node, if the value being searched for is less then move to the left link, if the value being searched for is greater then move to the right link. If the value is equal to the value being searched for, the search has succeeded. Some readers will have noticed that we are missing a case; what if the value being searched for is not in the tree? The answer depends on how the tree is structured, but for the most part, empty links will be marked by null pointers. If the next link to follow is a null pointer, it is safe to assume that the value does not exist. Using the above description, an algorithm is immediate: Code: Source 3:
1. int pretree_find (
2. struct pretree *tree,
3. int search_key )
4. {
5. struct pretree_node *walk;
6. int dir;
7.
8. walk = tree->header;
9. for ( ; ; ) {
10. /* Fail if a dead link is found */
11. if ( walk == NULL )
12. return 0;
13. /* Succeed if the item is found */
14. if ( walk->data == search_key )
15. return 1;
16. /* Pick a direction and go there */
17. dir = walk->data < search_key;
18. walk = walk->link[dir];
19. }
20. }
Because the definition of a binary search tree is itself recursive (ie. Each link is its own binary search tree), we can use a recursive approach to searching: Code: Source 4:
1. static int find_subtree (
2. struct pretree_node *header,
3. int search_key )
4. {
5. if ( header == NULL )
6. return 0;
7. else if ( header->data != search_key ) {
8. int dir = header->data < search_key;
9. return find_subtree ( header->link[dir], search_key );
10. }
11.
12. return 1;
13. }
14.
15. int pretree_find (
16. struct pretree *tree,
17. int search_key )
18. {
19. return find_subtree ( tree->header, search_key );
20. }
The recursive algorithm is simple, if not immediately transparent to readers. If the node is a null pointer, we return a failure code. Otherwise, if the value is not equal to the value being searched for, we find a link to follow as with the iterative algorithm, and then return the result of a recursive call to the function. Note that if the node's value is the same as the search key, the if..else construct is terminated and a success code is returned. A recursive call is only made if the node is not null and the value is not found. Be sure to study this algorithm as it is more difficult to follow than the previous one. See Exercise 1 before using these two functions. Insertion A binary search tree is useless until you have a method for inserting items into it, but before we get into that a little bit of terminology is in order. Term 1: The parent of a node is the node which links to it. In the following example, C is the parent of B because B is C's left link (Note: When possible I will use / and \ between two nodes to emphasize the relationship): Code: Diagram 2: C / B Term 3: The root of a tree is a node that has no parent. Term 4: A leaf is a node that has no children. Insertion into a binary search tree occurs at a leaf node. This suggests that we begin by searching for a null link, then we insert the new item at that location. What follows is an example construction of a binary search tree using the values 4, 7, 1, 2, 5, 6, 3 to show how this works: Code: Diagram 3:
4:
4
7:
4
\
7
1:
4
/ \
1 7
2: 4
/ \
1 7
\
2
5:
4
/ \
1 7
\ /
2 5
6:
4
/ \
1 7
\ /
2 5
\
6
3:
4
/ \
1 7
\ /
2 5
\ \
3 6
Code: Source 5:
1. int pretree_insert (
2. struct pretree *tree,
3. struct pretree_node *new_item,
4. int search_key )
5. {
6. struct pretree_node *walk;
7. struct pretree_node **save;
8. int dir;
9.
10. /*
11. * Search for an empty link
12. * Return failure if a duplicate is found
13. */
14. save = &tree->header;
15. walk = tree->header;
16. for ( ; ; ) {
17. if ( walk == NULL )
18. break;
19. else if ( walk->data == search_key )
20. return 0;
21. dir = walk->data < search_key;
22. save = &walk->link[dir];
23. walk = walk->link[dir];
24. }
25. /*
26. * Insert the new item
27. */
28. *save = new_item;
29. tree->size++;
30.
31. return 1;
32. }
we will be ignoring them, so we return failure. When walk is null, we break out of the loop and assign the new node to the link that points there. Note that if we assigned directly to walk, the new node would not be afixed to the tree, it would simply be floating in limbo, and that isn't very useful since upon returning we would lose it. Next we add one to the size of the tree and return success. Study this code very carefully as it introduces the rather difficult to grasp technique of using a double pointer to save the link followed. As with simple searching, insertion can be easily converted to a recursive solution: Code: Source 6:
1. struct pretree_node *insert_subtree (
2. struct pretree_node *header,
3. struct pretree_node *new_item,
4. int search_key )
5. {
6. if ( header == NULL )
7. header = new_item;
8. else if ( header->data != search_key ) {
9. int dir = header->data < search_key;
10. header->link[dir] = insert_subtree ( header->link[dir], new_item, search_key );
11. }
12.
13. return header;
14. }
15.
16. void pretree_insert (
17. struct pretree *tree,
18. struct pretree_node *new_item,
19. int search_key )
20. {
21. tree->header = insert_subtree ( tree->header, new_item, search_key );
22. tree->size++;
23. }
Traversal Now that we have the code necessary to build a binary search tree, we can do things with it. Naturally, you can use the search functions defined in Source 3 and Source 4 to determine if an item is in the tree, but it would be nice to verify that the insertion functions work properly. It would also be nice to be able to perform an operation on every item in the list, such as displaying them. And of course, since we are most likely using dynamic memory to allocate each node in the tree, it would be wise to release that memory at some point. This brings us to the field of tree traversal. One would think that visiting every node in a tree would be simple and consist of one, maybe two different cases. Surprisingly, this is not the case. There are actually N! different ways to traverse a binary search tree of N nodes, but most of them are useless. Of the many ways to traverse a binary search tree, we will be looking at two categories: breadth-first, where we will use the level-order traversal as an example, and depth-first, where we will use preorder, inorder, and postorder traversals as examples. We will then go on to discuss more exotic traversal techniques. Breadth-First Traversal Breadth-first traversal is visiting each node in a binary search tree, starting from the root and moving down level by level. Traditionally, breadth-first traversal visits each node on the same level from left to right. In the following diagram, the breadth-first traversal would visit nodes in the order 5, 3, 7, 2, 6, 8. Code: Diagram 4:
5
3 7
2 6 8
Term 5: The height of a node is the number of nodes in its highest subtree. The height of node 3 in Diagram 4 is 1, while the height of node 5 is 2. Term 6: Each node with the same height is on the same level. In Diagram 4, nodes 3 and 7 both have a height of 1, so they are on the same level. A breadth-first traversal algorithm is often implemented as the level-order traversal. While there can be differences, most often level-order traversal matches the definition of breadth-first traversal. This is one of the algorithms on a binary search tree that cannot be written recursively. For those that are familiar with recursion, you know that a recursive solution can be simulated through the use of a stack. Level-order traversal requires the use of a queue, however, so recursion is not a practical option. The algorithm itself is very simple, visit the first node saved, then save its left and right links. Continue to do this until there are no more links to visit. An example using Diagram 4 follows: Code: Diagram 5:
save 5, queue = { 5 }
visit 5, queue = {}
save 3, queue = { 3 }
save 7, queue = { 7, 3 }
visit 3, queue = { 7 }
save 2, queue = { 2, 7 }
visit 7, queue = { 2 }
save 6, queue = { 6, 2 }
save 8, queue = { 8, 6, 2 }
visit 2, queue = { 8, 6 }
visit 6, queue = { 8 }
visit 8, queue = {}
Code: Source 7:
void pretree_levelorder (
struct pretree_node *header,
void (*action) ( struct pretree_node *node ) )
{
struct pretree_node *save[50];
int front = 0;
int back = 0;
if ( header == NULL )
return;
save[front++] = header;
while ( front != back ) {
header = save[back++];
action ( header );
if ( header->link[0] != NULL )
save[front++] = header->link[0];
if ( header->link[1] != NULL )
save[front++] = header->link[1];
}
}
Depth-First Traversal Depth-first traversals begin by moving as far left or right as possible, as opposed to breadth-first traversals which begin with the root. These traversals then move up one link and move left or right. This process is repeated until all nodes have been visited. The question of course, is when do the nodes get visited? Since we can only move in one of two directions, we can determine that there are only six possible ways to traverse and visit in the depth-first variation. Recognizing that you can move left, move right, or visit with every operation, the following variants are available: 1: visit, move left, move right 2: visit, move right, move left 3: move left, visit, move right 4: move right, visit, move left 5: move left, move right, visit 6: move right, move left, visit Of these six, three are common enough to give standardized names: Traversal 1 is referred to as preorder because the visitation is performed before any movement is made, Traversal 3 is referred to as inorder bcaus the visitation is performed in between the two movements and results in a traversal in the order of the values, and finally, Traversal 5 is called postorder because the visitation is performed after both movements. Each of these can be written using a short and elegant recursive scheme: Code: Source 8:
void pretree_preorder (
struct pretree_node *header,
void (*action) ( struct pretree_node *node ) )
{
if ( header == NULL )
return;
action ( header );
pretree_preorder ( header->link[0], action );
pretree_preorder ( header->link[1], action );
}
void pretree_inorder (
struct pretree_node *header,
void (*action) ( struct pretree_node *node ) )
{
if ( header == NULL )
return;
pretree_inorder ( header->link[0], action );
action ( header );
pretree_inorder ( header->link[1], action );
}
void pretree_postorder (
struct pretree_node *header,
void (*action) ( struct pretree_node *node ) )
{
if ( header == NULL )
return;
pretree_postorder ( header->link[0], action );
pretree_postorder ( header->link[1], action );
action ( header );
}
Despite these wonderfully concise solutions, it is often prudent to take the iterative approach to traversal. These traversals are more difficult because of the dual recursive calls. If you direct your attention to the definition of a level-order traversal you may notice similarities with the preorder traversal if the queue were replaced with a stack. Believe it or not, that is precisely what needs to be done to write an iterative preorder traversal: Code: Source 9:
void pretree_preorder (
struct pretree_node *header,
void (*action) ( struct pretree_node *node ) )
{
struct pretree_node *save[50];
int top = 0;
if ( header == NULL )
return;
save[top++] = header;
while ( top != 0 ) {
header = save[--top];
action ( header );
if ( header->link[1] != NULL )
save[top++] = header->link[1];
if ( header->link[0] != NULL )
save[top++] = header->link[0];
}
}
Code: Source 10:
void pretree_inorder (
struct pretree_node *header,
void (*action) ( struct pretree_node *node ) )
{
struct pretree_node *save[50];
int top = 0;
while ( header != NULL ) {
while ( header != NULL ) {
if ( header->link[1] != NULL )
save[top++] = header->link[1];
save[top++] = header;
header = header->link[0];
}
header = save[--top];
while ( top != 0 && header->link[1] == NULL ) {
action ( header );
header = save[--top];
}
action ( header );
header = ( top != 0 ) ? save[--top] : NULL;
}
}
It is possible to implement depth-first traversals without the aid of a stack or recursion. The biggest concern with either an implicit stack (through recursion) or an explicit stack is that of the extra time and space required to maintain the stack. If the tree is skewed so that the height is equivalent to the number of nodes, the potential exists for stack overflow in recursive and fixed size stack solutions, and for explicit variable size stack solutions, the potential for taking up almost the same amount of space and time as the tree itself is uncomfortably likely. To counter this, the concept of threads was developed. Threads are pointers to the successor and predecessor of a node in a predefined sequence, such as inorder traversal. Of course, this would add two pointers to each node of the tree, often an unacceptable increase in size. This problem can be solved by overloading existing unused pointers. While each node would still need a flag of some sort to determine if a link is normal or overloaded, this flag need not take up nearly as much valuable space as a full link. Using threads, tree traversal is simple. Here is some pseudocode for one possible implementation of inorder traversal using threads: Code: Source 11:
sub threaded_inorder ( tree )
node walk := tree
node prev
if walk = null then
return
endif
while walk.left != null do
walk := walk.left
loop
while walk != null do
visit ( walk )
prev := walk
walk := walk.right
if walk != null then
if prev.successor = 0 then
while walk.left != nul do
walk := walk.left
loop
endif
endif
loop
end sub
Code: Diagram 6:
5
/ \
2 7
/ \
1 3
Code: Diagram 7:
2
/ \
1 3
\
5
\
7
Code: Source 12:
void xform_inorder (
struct pretree *tree,
void (*action) ( struct pretree_node *node ) )
{
struct pretree_node *walk = tree->header;
struct pretree_node *heir;
while ( walk != NULL ) {
if ( walk->link[0] == NULL ) {
action ( walk );
walk = walk->link[1];
}
else {
heir = walk->link[0];
while ( heir->link[1] != NULL
&& heir->link[1] != walk )
{
heir = heir->link[1];
}
if ( heir->link[1] == NULL ) {
heir->link[1] = walk;
walk = walk->link[0];
}
else {
action ( walk );
heir->link[1] = NULL;
walk = walk->link[1];
}
}
}
}
At some point in a tree's life, it would be nice to be able to remove an item. Item removal is one of the primary reasons that a developer will choose not to use a binary search tree. Deletion is not a trivial task, especially when more advanced trees are used to guarantee logarithmic performance. The problem with deletion is that there are several cases of varying complexity. How difficult it is to remove a node depends heavily on where the node exists in the tree structure. There are three cases: Case 1: The node is a leaf, simply set the link followed to it to null. Case 2: The node has one child, set the link followed to it to point to the child. Case 3: The node has two children. This is the hardest case and can be performed several ways, we will discuss two of them. Deletion By Merging. This is the most common method of deletion in books and other educational text. It begins by finding the successor (or predecessor) of the node to be removed. Then the node's left (or right) link is spliced into the successor's (or predecessor's) matching link. Reset the link that was followed and the merging is complete. An example follows: Code: Diagram 8:
(Remove 5)
| |
7 9
/ \ /
4 9 --> 4
/ \ / \
2 5 2 5
Code: Source 13:
int pretree_remove (
struct pretree *tree,
struct pretree_node **old_item,
int search_key )
{
struct pretree_node *walk;
struct pretree_node **save;
int dir;
/*
* Search for a matching link
* Return failure if not found
*/
save = &tree->header;
walk = tree->header;
for ( ; ; ) {
if ( walk == NULL )
return 0;
else if ( walk->data == search_key )
break;
dir = walk->data < search_key;
save = &walk->link[dir];
walk = walk->link[dir];
}
/*
* Remove the old item
*/
if ( walk->link[1] == NULL ) {
*save = walk->link[0];
*old_item = walk;
}
else if ( walk->link[0] == NULL ) {
*save = walk->link[1];
*old_item = walk;
}
else {
struct pretree_node *heir = walk->link[1];
while ( heir->link[0] != NULL )
heir = heir->link[0];
heir->link[0] = walk->link[0];
*save = walk->link[1];
*old_item = walk;
}
tree->size--;
return 1;
}
Deletion By Copying Deletion by merging is not an optimal solution because it can cause the height of the tree to increase. If at all possible, your tree code should keep a tree at the smallest height possible to accelerate searching. In some cases deletion by merging can result in a highly unbalanced tree, so we should work toward something a little better. Fortunately, such an algorithm has already been devised. If the node has two children, we can reduce the case to one of the simpler cases by replacing the node with its immediate successor or predecessor and removing the successor (or predecessor): Code: Source 14:
int pretree_remove (
struct pretree *tree,
struct pretree_node **old_item,
int search_key )
{
struct pretree_node *walk;
struct pretree_node **save;
int dir;
/*
* Search for a matching link
* Return failure if not found
*/
save = &tree->header;
walk = tree->header;
for ( ; ; ) {
if ( walk == NULL )
return 0;
else if ( walk->data == search_key )
break;
dir = walk->data < search_key;
save = &walk->link[dir];
walk = walk->link[dir];
}
/*
* Remove the old item
*/
if ( walk->link[1] == NULL ) {
*save = walk->link[0];
*old_item = walk;
}
else if ( walk->link[0] == NULL ) {
*save = walk->link[1];
*old_item = walk;
}
else {
struct pretree_node *heir = walk->link[1];
struct pretree_node *prev = walk;
while ( heir->link[0] != NULL ) {
prev = heir;
heir = heir->link[0];
}
walk->data = heir->data;
prev->link[prev == walk] = heir->link[1];
*old_item = heir;
}
tree->size--;
return 1;
}
Deletion by copying does not increase the height of the tree, but it can still result in an unbalanced tree if used frequently with insertions. The reason is because the algorithm is asymmetric, it always removes the successor. The correct number and type of insertions and deletions can result in a left-unbalanced tree. See exercise 16 for a solution. Parent Pointers A nice feature of any data structure is that you can make it as complex as you want to simplify certain operations. Using a stack seems cumbersome for some iterative tree operations, and recursion should be avoided whenever possible and/or sensible. Recall how adding a pointer to the previous node in linked lists simplified everything so much? Perhaps the same thing would be as effective in binary trees, a pointer to the parent perhaps? It only requires a slight change to our node declaration to get started: Code: Source 15:
struct pretree_node {
struct pretree_node *link[2];
struct pretree_node *parent;
int data;
};
Code: Source 16:
int pretree_insert (
struct pretree *tree,
struct pretree_node *new_item,
int search_key )
{
struct pretree_node *walk;
int dir;
/*
* Handle empty tree case
*/
if ( tree->header == NULL ) {
tree->header = new_item;
tree->header->parent = NULL;
return 1;
}
/*
* Search for an empty link
* Return failure if a duplicate is found
*/
walk = tree->header;
for ( ; ; ) {
if ( walk->data == search_key )
return 0;
dir = walk->data < search_key;
if ( walk->link[dir] == NULL )
break;
walk = walk->link[dir];
}
/*
* Insert the new item
*/
walk->link[dir] = new_item;
new_item->parent = walk;
tree->size++;
return 1;
}
So far, parent pointers appear immensely useful and also easy to maintain. Well, unfortunately, any further maintenance such as simple rotations (You will learn about this later) and the splicing required by deletion operations requires considerable thought and care in resetting the parent pointers. As an example, here is one possible deletion function using parent pointers: Code: Source 17:
int pretree_remove (
struct pretree *tree,
struct pretree_node **old_item,
int search_key )
{
struct pretree_node *walk;
struct pretree_node **save;
int dir;
/*
* Search for a matching link
* Return failure if not found
*/
save = &tree->header;
walk = tree->header;
for ( ; ; ) {
if ( walk == NULL )
return 0;
else if ( walk->data == search_key )
break;
dir = walk->data < search_key;
save = &walk->link[dir];
walk = walk->link[dir];
}
/*
* Remove the old item
*/
if ( walk->link[1] == NULL || walk->link[0] == NULL ) {
dir = walk->link[0] == NULL;
/* Fix parent pointers */
if ( walk->link[dir] != NULL ) {
walk->link[dir]->parent = (*save)->parent;
if ( walk->link[dir]->link[dir] != NULL )
walk->link[dir]->link[dir]->parent = *save;
if ( walk->link[dir]->link[!dir] != NULL )
walk->link[dir]->link[!dir]->parent = *save;
}
*save = walk->link[dir];
*old_item = walk;
}
else {
struct pretree_node *heir = walk->link[1];
struct pretree_node *prev = walk;
/* Find walk's successor */
while ( heir->link[0] != NULL ) {
prev = heir;
heir = heir->link[0];
}
walk->data = heir->data;
prev->link[prev == walk] = heir->link[1];
/* Fix parent pointers */
if ( heir->link[1] != NULL )
heir->link[1]->parent = prev;
*old_item = heir;
}
tree->size--;
return 1;
}
As we saw in Source 17, parent pointers can be useful, but they are also difficult to maintain properly. As an example of the advantages and disadvantages of parent pointers, we will now look at the concept of root insertions into a binary search tree. Root Insertion Inserting a new node at the root of a tree at first seems like an exercise in the trivial, but it isn't as easy as one might think at first because there may be a node in one of the subtrees that simply splicing into the root would violate the first rule of a binary search tree, that every node in the left subtree is less than the item and every node in the right subtree is greater than the item. To properly insert a node at the root of a tree, we need to consider the concept of rotating nodes in a binary search tree. The concept is simple, the code is simple, but in practice it can be frustrating (Trust me, I have the scars to prove it. ;-)). There are two types of rotations, left and right. When rotating left, the right node becomes the parent of its parent and the parent becomes the left subtree of the right node: Code: Diagram 9 (left rotation where 1 is C and 7 is B):
C->right = B->left;
B->left = C;
1 7
/ \ / \
0 7 --> 1 8
/ \ / \
4 8 0 4
Code: Diagram 10 (right rotation):
C->left = B->right;
B->right = C;
7 1
/ \ / \
1 8 --> 0 7
/ \ / \
0 4 4 8
Here is an idea, why don't we use our tree implementation with parent pointers? This is a good example of the power of links to the parent, but as described earlier, the maintenance can be tricky, far more than a stack based approach: Code: Source 18:
int pretree_insert (
struct pretree *tree,
struct pretree_node *new_item,
int search_key )
{
struct pretree_node *walk;
int dir;
/*
* Handle empty tree case
*/
if ( tree->header == NULL ) {
tree->header = new_item;
tree->header->parent = NULL;
return 1;
}
/*
* Search for an empty link
* Return failure if a duplicate is found
*/
walk = tree->header;
for ( ; ; ) {
if ( walk->data == search_key )
return 0;
dir = walk->data < search_key;
if ( walk->link[dir] == NULL )
break;
walk = walk->link[dir];
}
/*
* Insert the new item
*/
walk->link[dir] = new_item;
new_item->parent = walk;
tree->size++;
/*
* Walk back up and rotate
*/
while ( new_item != tree->header ) {
if ( new_item == walk->link[0] ) {
/* Rotate right */
walk->link[0] = new_item->link[1];
new_item->link[1] = walk;
}
else {
/* Rotate left */
walk->link[1] = new_item->link[0];
new_item->link[0] = walk;
}
/* Notify the rest of the tree of changes */
if ( walk->parent != NULL ) {
dir = ( walk == walk->parent->link[1] );
walk->parent->link[dir] = new_item;
}
/* Fix parent pointers */
new_item->parent = walk->parent;
walk->parent = new_item;
if ( walk->link[0] != NULL )
walk->link[0]->parent = walk;
if ( walk->link[1] != NULL )
walk->link[1]->parent = walk;
/* Reseat header if needed */
if ( tree->header == walk )
tree->header = new_item;
/* Move up and start over */
walk = new_item->parent;
}
return 1;
}
As a counterpoint for the use of parent pointers, a stack can be used instead of parent pointers to simplify the logic, and instead of recursion for efficiency and safety reasons. Without the parent maintenance, the logic of root insertion becomes obvious: Code: Source 19:
int pretree_insert (
struct pretree *tree,
struct pretree_node *new_item,
int search_key )
{
struct pretree_node *walk;
struct pretree_node *up[50];
int dir;
int top = 0;
/*
* Handle empty tree case
*/
if ( tree->header == NULL ) {
tree->header = new_item;
return 1;
}
/*
* Search for an empty link
* Return failure if a duplicate is found
*/
walk = tree->header;
for ( ; ; ) {
if ( walk->data == search_key )
return 0;
dir = walk->data < search_key;
if ( walk->link[dir] == NULL )
break;
up[top++] = walk;
walk = walk->link[dir];
}
/*
* Insert the new item
*/
walk->link[dir] = new_item;
tree->size++;
/*
* Walk back up and rotate
*/
while ( new_item != tree->header ) {
if ( new_item == walk->link[0] ) {
/* Rotate right */
walk->link[0] = new_item->link[1];
new_item->link[1] = walk;
}
else {
/* Rotate left */
walk->link[1] = new_item->link[0];
new_item->link[0] = walk;
}
/* Notify the rest of the tree of changes */
if ( up[top - 1] != NULL ) {
dir = ( walk == up[top - 1]->link[1] );
up[top - 1]->link[dir] = new_item;
}
/* Reseat header if needed */
if ( tree->header == walk )
tree->header = new_item;
/* Move up and start over */
walk = up[--top];
}
return 1;
}
Conclusion Binary search trees are an excellent data structure for efficient insertion, deletion, and searching operations without a great deal of complexity. The average case for a binary search tree created with random values is logarithmic (O(logN)), which is quite good. Unfortunately, the worst case of binary search trees is linear (O(N)), but there are ways to modify the insertion, deletion, and search algorithms to guarantee logarithmic performance. Such techniques are far more complex and will be covered in future installments of this tutorial. Exercises Exercise 1: In our recursive insertion function (Source 6), I noted that the size of the tree would increase even if a duplicate value was found and ignored. This would cause the actual size of the tree to differ from the recorded size. Rewrite the two functions to fix this. Exercise 2: In Source 7, I used an array based queue to implement level-order traversal. Verify that this queue does what it is supposed to. If it does not, rewrite the function to fix it. Exercise 3: Using the code developed in Exercise 2, determine any cases where the function would fail unexpectedly. Rewrite the function to fix any problems found. Exercise 4: Verify that Source 9 works as expected. Exercise 5: Explain the similarities between level-order traversal and preorder traversal that they can be implemented exactly the same iteratively with only the difference between use of a stack and use of a queue. Exercise 6: Show the contents of the stack during the execution of Source 10. Exercise 7: Describe any flaws in the implementation of Source 10, make changes as required and verify that the algorithm still works. Exercise 8: Write a function that implements postorder traversal iteratively. Verify that it is correct. Exercies 9: Write an insertion function for a threaded tree. Exercise 10: Write a C/C++ function to search a threaded tree in preorder, inorder, and postorder. Exercise 11: Verify that following a node's left link all the way to the right results in finding the predecessor. Does this algorithm also find a node's successor by following the right link all the way to the left? Why? Exercise 12: Verify that the algorithm described in Diagrams 6 and 7 works as expected. Exercise 13: Analyze Source 12 and diagram its execution on Diagram 4. Can this algorithm be improved? If so, determine how and make the necessary modifications. Exercise 14: In Source 13, the first two cases handle a node with a single link. Determine if these two cases can also handle a leaf node and explain how. Exercise 15: Analyze the execution of Source 14 and verify that it is indeed an improvement over deletion by merging. Exercise 16: To solve the problem of asymmetric deletion by copying, write a function that alternates between replacing the successor and replacing the predecessor. Verify that this is an improvement, and explain why. Exercise 17: Can Source 17 be improved? If so, rewrite the function to do so. Exercise 18: Verify that simply splicing a new node at the root may violate the definition of a binary search tree. Exercise 19: Write a recursive function to insert an item at the root of a binary search tree. Exercise 20: Compare and contrast Source 19 and Source 18. Which is the better solution for a general binary search tree library?
__________________ My best code is written with the delete key. |
| Prelude is offline |
| | #2 |
| Registered User Join Date: Jul 2003
Posts: 450
| I compliment you on this informative post and hope it is stickied. I am currently implementing a binary tree and have posted several questions regarding basic syntax I'll need for my implementation. I did not read the entire post yet, because I want to discover from my own work, but will read further after I attempt traversal and then searching algorithims. You show a great deal of initiative in helping inform us more inexperienced programmers educate ourselves. This will simplify research. For instance I currently have three books open regarding the subject and this coallates the information. Again I think you and will bookmark this post. The funny thing is I will not be studying data structures or even C/C++ at the university until next year, so will have a tremendous leap compared to the rest of the students. |
| curlious is offline |
| | #3 |
| Code Goddess Join Date: Sep 2001
Posts: 9,661
| >For instance I currently have three books open regarding the subject When I was first learning binary search trees and balancing techniques, I ended up buying and burning hard-stare holes in eleven data structure books. I highly recommend getting as many different perspectives as you can. Everyone describes things differently, and usually you have to hear three or four of them to really "get" it. As for different perspectives, not you have another to add to the mix. ![]() >The funny thing is I will not be studying data structures or even C/C++ at the university until next year Who cares? Data structures are a blast! You don't have to take a class to learn fun new things.
__________________ My best code is written with the delete key. |
| Prelude is offline |
| | #4 | |
| ¡Amo fútbol! Join Date: Dec 2001
Posts: 2,121
| Quote:
NOTE TO SELF: Prelude is on LSD again. Last edited by golfinguy4; 12-31-2003 at 01:56 PM. | |
| golfinguy4 is offline |
| | #5 |
| Registered User Join Date: Mar 2002
Posts: 346
| Very nice tutorial. It seems to explain everything well and is easy to follow. - Sean
__________________ If cities were built like software is built, the first woodpecker to come along would level civilization. Black Frog Studios |
| sean345 is offline |
| | #6 |
| carry on Join Date: Feb 2003 Location: Seattle, WA
Posts: 1,971
| Didn't quite get through reading the whole thing but it was pretty good. I would prefer C++, though
__________________ "Think not but that I know these things; or think I know them not: not therefore am I short Of knowing what I ought." -John Milton, Paradise Regained (1671) "Work hard and it might happen." -XSquared |
| JaWiB is offline |
| | #7 |
| l'Anziano Join Date: Aug 2001
Posts: 2,573
| well it is already in C, so it should compile for any C++ compiler. Excellent tutorial. I already know binary search trees (and have known them for several years), so I just scanned through it to see what topics it covered. You seemed to cover quite a lot of topics, and by the looks of it you explained everything with diagrams and code very well. Good job. I also see that you covered more traversals than just the pre-order, post-order, and in-order traversals. Most people don't go beyond those 3. Good job. |
| DavidP is offline |
| | #8 |
| Code Goddess Join Date: Sep 2001
Posts: 9,661
| >Prelude is on LSD again. If I am, it's because my bosses slipped it into my food. They like when I have fun writing code. ![]() >I would prefer C++, though I prefer not to use proper C++ (ie. a good container class) for data structure tutorials. The scaffolding of the class tends to obscure the logic of the data structure. This makes things harder to follow in general, a problem that simple functions don't have. Then of course there is the advantage of catering to both C and C++ users. Anyone familiar with C++ and the desire to do so can encapsulate the functions into a container class with little difficulty. >You seemed to cover quite a lot of topics I realized earlier today that I forgot to mention the problem of using deletion by copying when the data items are very large. I'll add that when I have a chance. ![]() >Most people don't go beyond those 3. An annoying tendency to be sure. I got complaints about a previous binary search tree tutorial because it only had the three. I made sure not to make the same mistake twice after I found out how sharp my readers are.
__________________ My best code is written with the delete key. |
| Prelude is offline |
| | #9 |
| Registered User Join Date: Jul 2003
Posts: 450
| This post is not to be digested in one sitting. I have read up to the point where you implement the iterative algorithim of the Depth first inorder traversal routine. I am going to have to sit down and draw out what is happening before I fully understand it. I thought I would just be able to read the rest of the post at this point, but now realize I will have to do some work before I fill confident. A little better explanation of the logic behind the iterative approach of the inorder routine would have been nice, but this would be a redundant explanation and giving everything to the reader encourages a lazy coder! |
| curlious is offline |
| | #10 |
| Code Goddess Join Date: Sep 2001
Posts: 9,661
| >A little better explanation of the logic behind the iterative approach of the inorder routine would have been nice In my experience, no amount of explanation will truly drive the point home. You have to work with the logic yourself to understand completely. Explanation just guides you on the big picture.
__________________ My best code is written with the delete key. |
| Prelude is offline |
| | #11 |
| C++ Developer Join Date: Jun 2002 Location: UWaterloo
Posts: 2,718
| From Source 3, >> 17. dir = walk->data < search_key; >> 18. walk = walk->link[dir]; Are you sure that dir is guaranteed to be 1 or 0? I thought that true is defined as anything but 0. In Diagram 8 (Deletion), you say remove 5, but you proceed to remove the 7. Also, it'd be nice if you provided an example where both of the subtrees of the node to be deleted had a left and a right subtree, to demonstrate what happens then.
__________________ Naturally I didn't feel inspired enough to read all the links for you, since I already slaved away for long hours under a blistering sun pressing the search button after typing four whole words! - Quzah You. Fetch me my copy of the Wall Street Journal. You two, fight to the death - Stewie |
| XSquared is offline |
| | #12 | |
| Just Lurking Join Date: Oct 2002
Posts: 4,990
| >Are you sure that dir is guaranteed to be 1 or 0? Quote:
__________________ 7. It is easier to write an incorrect program than understand a correct one. 40. There are two ways to write error-free programs; only the third one works.* | |
| Dave_Sinkula is offline |
| | #13 |
| C++ Developer Join Date: Jun 2002 Location: UWaterloo
Posts: 2,718
| OK, cool. I wasn't sure about that, but now I am. Thanks. BTW, does this apply to C++ too?
__________________ Naturally I didn't feel inspired enough to read all the links for you, since I already slaved away for long hours under a blistering sun pressing the search button after typing four whole words! - Quzah You. Fetch me my copy of the Wall Street Journal. You two, fight to the death - Stewie |
| XSquared is offline |
| | #14 |
| Code Goddess Join Date: Sep 2001
Posts: 9,661
| >Are you sure that dir is guaranteed to be 1 or 0? Quite. ![]() >In Diagram 8 (Deletion), you say remove 5, but you proceed to remove the 7. I sure did, didn't I? I imagine that would cause some confusion, and probably some really strange code for those that followed the diagrams exactly. ![]() >Also, it'd be nice if you provided an example where both of the subtrees of the node to be deleted had a left and a right subtree, to demonstrate what happens then. Good idea. >BTW, does this apply to C++ too? It does indeed.
__________________ My best code is written with the delete key. |
| Prelude is offline |
| | #15 |
| 5|-|1+|-|34|) Join Date: Aug 2001
Posts: 4,429
| Maybe this would be most helpful if added to the FAQ or help section of the site? |
| ober is offline |
| Thread Tools | |
| Display Modes | |
|
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Trouble with client GET request path | NuNn | C Programming | 1 | 02-25-2009 03:34 PM |
| Can you check what is wrong with this code | Ron | C++ Programming | 4 | 08-01-2008 10:59 PM |
| my HTTP request handler code correct? | George2 | C# Programming | 0 | 04-25-2008 04:01 AM |
| denied request | Unregistered | C++ Programming | 1 | 09-20-2001 11:35 PM |