Thread: BST - Insertion Difficulties

  1. #1
    Registered User
    Join Date
    Feb 2017
    Posts
    27

    BST - Insertion Difficulties

    Hi everyone,

    I've been assigned an implementation of a BST using arrays & pointers. This makes the implementation a lot different than what I've seen from examples online. The header files have been provided to us, listed below:

    data.h
    Code:
    typedef struct {char *key; int data;} Node;
    void print_node(Node node);
    char *key_dup(char *key);
    bst.h
    Code:
    #include "data.h"
    typedef struct {Node *tree_nodes; unsigned char *is_free; int size;} BStree_struct;
    typedef BStree_struct* BStree;
    BStree bstree_ini(int size);
    void bstree_insert(BStree bst, char *key, int data);
    void bstree_traversal(BStree bst);
    void bstree_free(BStree bst);
    I've inserted a few helper functions, but here's my current bst.c:

    bst.c
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "bst.h"
    
    
    // Input: ’size’: size of an array
    // Output: a pointer of type BStree, i.e. a pointer to an allocated memory of BStree_struct type
    // Effect: dynamically allocate memory of type BStree_struct
    // allocate memory for a Node array of size+1 for member tree_nodes
    // allocate memory for an unsigned char array of size+1 for member is_free
    // set all entries of is_free to 1
    // set member size to ’size’;
    
    
    BStree bstree_ini(int size)
    {
        BStree bst;
        bst = (BStree_struct*)(malloc(sizeof(BStree_struct)));
        bst->tree_nodes = (Node*)(malloc((size+1)*sizeof(Node)));
        bst->is_free = (unsigned char*)(malloc((size + 1) * sizeof(unsigned char)));
        int i;
        for (i = 0; i<= size; i++){
            bst->is_free[i] = '1';
        }
        bst->size = size;
        return bst;
    }
    
    
    // Input: ’bst’: a binary search tree
    // ’key’: a string
    // ’data’: an integer
    // Effect: ’data’ with ’key’ is inserted into ’bst’; if ’key’ is already in ’bst’, do nothing
    
    
    void bstree_insert(BStree bst, char *key, int data)
    {
        int value;
        int index = 1;
        if (bst->is_free[index] == '1')
        {
            bst->tree_nodes[index].key = key_dup(key);
            bst->tree_nodes[index].data = data;
            bst->is_free[index] = '0';
        }
    
    
        else
        {
            value = strcmp(key, bst->tree_nodes[index].key);
            insert_recursion(bst, key, data, value, index);
        }
    }
    
    
    // Input: ’bst’: a binary search tree
    // Effect: print all the nodes in bst using in order traversal
    
    
    void bstree_traversal(BStree bst)
    {
        int index;
        index = 1;
        traversal_recursion(bst, index);
    }
    
    
    // Input: ’bst’: a binary search tree
    // Effect: all memory used by bst are freed
    
    
    void bstree_free(BStree bst)
    {
        free(bst->tree_nodes);
        free(bst->is_free);
        free(bst);
    }
    
    
    // Input: 'bst', 'key', 'data', 'value': the value from strcmp, determines if , 'index': the current index of is_free & tree_nodes
    // Effect: recursively search for an array for the first appropriate branch for the given key, using the strcmp value from bstree_insert
    
    
    void insert_recursion(BStree bst, char *key, int data, int value, int index)
    {
        if(value == 0){ // if the key already exists
            printf("Sorry, this key already exists.");
            return;
        }
    
    
        else if(value < 0){ // if the key is less than the current node
            if(bst->is_free[leftNode(index)] == '1') // if the left node is free, append the node to it
            {
                bst->tree_nodes[leftNode(index)].key = key_dup(key);
                bst->tree_nodes[leftNode(index)].data = data;
                bst->is_free[leftNode(index)] = '0';
            }
            else // else if the node is full, recall the function
            {
                value = strcmp(key, bst->tree_nodes[leftNode(index)].key);
                insert_recursion(bst, key, data, value, index + 1);
            }
        }
    
    
        else if(value > 0){ // if the key is greater than the current node
            if(bst->is_free[rightNode(index)] == '1') // if the right node is free, append the node to it
            {
                bst->tree_nodes[rightNode(index)].key = key_dup(key);
                bst->tree_nodes[rightNode(index)].data = data;
                bst->is_free[rightNode(index)] = '0';
            }
            else // else if the node is full, recall the function
            {
                value = strcmp(key, bst->tree_nodes[rightNode(index)].key);
                insert_recursion(bst, key, data, value, index + 1);
            }
        }
    }
    
    
    // Input: bst, index
    // Effect: print_node for each node inorder traversal
    
    
    void traversal_recursion(BStree bst, int index)
    {
        if(bst->is_free[index] == '0') // if index is full
        {
            traversal_recursion(bst, leftNode(index)); // traverse the left nodes
            print_node(bst->tree_nodes[index]); // print the node
            traversal_recursion(bst, rightNode(index)); // traverse the right nodes
        }
    }
    
    
    // Input: index of the node
    // Effect: returns the index of the node left of index
    int leftNode(int index){
        return (index * 2);
    }
    
    
    // Input: index of the node
    // Effect: returns the index of the node right of index
    int rightNode(int index){
        return ((index * 2) + 1);
    }
    Here's a sample main program of 9 insertions, then a traversal & free:

    Code:
    int main(void) {
    BStree bst;
    bst = bstree_ini(1000);
    4
    bstree_insert(bst, "Once", 1);
    bstree_insert(bst, "Upon", 22);
    bstree_insert(bst, "a", 3);
    bstree_insert(bst, "Time", 4);
    bstree_insert(bst, "is", 5);
    bstree_insert(bst, "filmed", 6);
    bstree_insert(bst, "in", 7);
    bstree_insert(bst, "Vancouver", 8);
    bstree_insert(bst, "!", 99);
    bstree_traversal(bst);
    bstree_free(bst);
    }
    However, here's the comparison between the expected display and what my program actually outputs (respectively):

    ! 99 | filmed 6
    Once 1 | Time 4
    Time 4 | ! 99
    Upon 22 | a 3
    Vancouver 8 | in 7
    a 3 | Once 1
    filmed 6 | Vancouver 8
    in 7 | Upon 22
    is 5 | is 5


    My program seems to be failing to insert correctly. The only left node should be !, since ASCII values have '!' being the only one less than Once. My program for some reason is inserting 'filmed' & 'Time' keys to the to the left node of '!' however, and I haven't been able to figure out why.

    Sorry if the code is excessive, and for my formatting in advance. Please let me know if there's anything I can do to make it easier to read.

    Thanks!

  2. #2
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    Firstly, your teacher has given you a stupid exercise. Although it is reasonable to store nodes in an array in this way for a "complete" binary tree (as used in a heapsort, for example), that is not the type of tree you are generating here.

    So you definitely need to check the bounds of your indices since they could easily go above bst->size. Such checks should be added at the top of both of your ..._recursion functions. In insert_recursion, going out of bounds is an error. In traversal_recursion, it indicates that there are no more nodes in that branch.

    The idea of storing the tree in an array is that you can find the left and right child without storing pointers. The array indices are assigned like this:
    Code:
    .                  1
                   2       3
                 4   5   6   7
              etc.
    (They could have started at 0, but in your program 0 is unused, which somewhat simplifies the leftNode, rightNode functions.)

    Your problems stem from the insertion routines. bstree_insert is badly written. I see no reason to treat position 1 as special and deal with it in that function. Also, passing the return value of a strcmp into insert_recursion is totally unecessary. You need to rewrite those two functions. bstree_insert should just be:
    Code:
    void bstree_insert(BStree bst, char *key, int data) {
        insert_recursion(bst, key, data, 1);
    }
    The correct code for insert_recursion is much simpler than what you currently have. It's prototype should be:
    Code:
    void insert_recursion(BStree bst, char *key, int data, int index);
    Other points:

    You're missing prototypes for some functions in bst.c: traversal_recursion, insert_recursion, leftNode, rightNode. And the functions key_dup and print_node don't exist at all in the code you've shown.

    Also, you should always check that malloc calls succeed, and the return values don't need to be cast (removing the casts is a good check to see if you are accidentally compiling as C++ instead of C which happens quite a bit to beginners using MSVC).

    It's dumb to use '1' and '0' in the is_free array, which completely destroys any natural use of it as a boolean array. You should store an actual 1 and 0 instead. Then you can say the more natural:
    Code:
    if (bst->is_free[index]) {
    }
    
    // instead of
    
    if (bst->is_free[index] == '1') {
    }

  3. #3
    Registered User
    Join Date
    Feb 2017
    Posts
    27
    @algorism

    Looking at the simplicity of basic BST structure from other examples online, I 100% agree this is completely impractical. The implementation using left & right node struct is much simpler. The use of is_free stored as a char is unfortunately forced upon us as well.

    I will fix my insertion and repost my fixed function later tonight or tomorrow. The treating of index 1 as a special case was a holdover from me trying to do the recursion in the insert program without a helper, before I realized it would be necessary to input the index into my function.

    Thanks so much for the detailed response, I will try to clean up my code ASAP.

  4. #4
    Registered User
    Join Date
    Feb 2017
    Posts
    27
    Hey everyone,

    Fixed my BST structure! The output is now as expected, and I added malloc call checks & out of bound checks. Thanks a lot algorism! Hopefully everything looks better.

    Here's my code as it stands:

    data.h
    Code:
    typedef struct {char *key; int data;} Node;
    void print_node(Node node);
    char *key_dup(char *key);
    data.c
    Code:
    #include <stdio.h>
    #include <string.h>
    #include "data.h"
    
    
    // Input: ’node’: a node
    // Effect: node.key is printed and then the node.data is printed
    void print_node(Node node)
    {
        printf("%s, %d \n", node.key, node.data);
    }
    
    
    // Input: ’key’: a string ends with ’\0’
    // Output: a pointer of type pointer to char, pointing to an allocated memory containing a copy of ’key’
    // Effect: dynamically allocate memory to hold a copy of ’key’
    char *key_dup(char *key)
    {
        char *duplicate = NULL;
        if ((duplicate = malloc(strlen(key)+1)) != NULL)
            strcpy(duplicate, key);
        return duplicate;
    }
    bst.h
    Code:
    #include "data.h"
    
    
    typedef struct {Node *tree_nodes; unsigned char *is_free; int size;} BStree_struct;
    typedef BStree_struct* BStree;
    BStree bstree_ini(int size);
    void bstree_insert(BStree bst, char *key, int data);
    void bstree_traversal(BStree bst);
    void bstree_free(BStree bst);
    void insert_recursion(BStree bst, char *key, int data, int index);
    void traversal_recursion(BStree bst, int index);
    int leftNode(int index);
    int rightNode(int index);
    bst.c
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "bst.h"
    
    
    // Input: ’size’: size of an array
    // Output: a pointer of type BStree, i.e. a pointer to an allocated memory of BStree_struct type
    // Effect: dynamically allocate memory of type BStree_struct
    // allocate memory for a Node array of size+1 for member tree_nodes
    // allocate memory for an unsigned char array of size+1 for member is_free
    // set all entries of is_free to 1
    // set member size to ’size’;
    
    
    BStree bstree_ini(int size)
    {
        BStree bst;
        bst = (BStree_struct*)(malloc(sizeof(BStree_struct)));
        if(bst == NULL){
            printf("Error with memory allocation of the tree.");
            exit(1);
        }
        bst->tree_nodes = (Node*)(malloc((size+1)*sizeof(Node)));
        if(bst->tree_nodes == NULL){
            printf("Error with memory allocation of nodes.");
            exit(1);
        }
        bst->is_free = (unsigned char*)(malloc((size + 1) * sizeof(unsigned char)));
        if(bst->is_free == NULL){
            printf("Error with memory allocation of the free array.");
            exit(1);
        }
        int i;
        for (i = 0; i<= size; i++){
            bst->is_free[i] = '1';
        }
        bst->size = size;
        return bst;
    }
    
    
    // Input: ’bst’: a binary search tree
    // ’key’: a string
    // ’data’: an integer
    // Effect: ’data’ with ’key’ is inserted into ’bst’; if ’key’ is already in ’bst’, do nothing
    
    
    void bstree_insert(BStree bst, char *key, int data)
    {
        insert_recursion(bst, key, data, 1);
    }
    
    
    // Input: ’bst’: a binary search tree
    // Effect: print all the nodes in bst using in order traversal
    
    
    void bstree_traversal(BStree bst)
    {
        int index;
        index = 1;
        traversal_recursion(bst, index);
    }
    
    
    // Input: ’bst’: a binary search tree
    // Effect: all memory used by bst are freed
    
    
    void bstree_free(BStree bst)
    {
        free(bst->tree_nodes);
        free(bst->is_free);
        free(bst);
    }
    
    
    // Input: 'bst', 'key', 'data', 'value': the value from strcmp, determines if , 'index': the current index of is_free & tree_nodes
    // Effect: recursively search for an array for the first appropriate branch for the given key, using the strcmp value from bstree_insert
    
    
    void insert_recursion(BStree bst, char *key, int data, int index)
    {
        int value;
        if(bst->is_free[1] == '1') // base case for root
        {
            bst->tree_nodes[1].key = key_dup(key);
            bst->tree_nodes[1].data = data;
            bst->is_free[1] = '0';
            printf("Root insertion: %s @ 1 \n", key);
        }
    
    
        else // else if root is full
        {
            if(bst->is_free[index] == '0') // set value equal to strcmp key with current node's key
            {
                value = strcmp(key, bst->tree_nodes[index].key);
            }
    
    
            if(value == 0) // if the key already exists
            {
                printf("Sorry, this key already exists in the tree.");
                return;
            }
    
    
            else if(value < 0)
            {
                if(bst->is_free[leftNode(index)] == '1') // if the left node is free, insert node here
                {
                    bst->tree_nodes[leftNode(index)].key = key_dup(key);
                    bst->tree_nodes[leftNode(index)].data = data;
                    bst->is_free[leftNode(index)] = '0';
                    printf("Left node insertion: %s @ %d \n", key, leftNode(index));
                }
                else // else if the node is full, recall the function
                    if (index + 1 > bst->size)
                    {
                        printf("Out of bounds error.  Please use a larger tree size.");
                        exit(1);
                    }
                    else
                        insert_recursion(bst, key, data, leftNode(index));
            }
    
    
            else if(value > 0)
            {
                if(bst->is_free[rightNode(index)] == '1') // if the right node is free, insert node here
                {
                    bst->tree_nodes[rightNode(index)].key = key_dup(key);
                    bst->tree_nodes[rightNode(index)].data = data;
                    bst->is_free[rightNode(index)] = '0';
                    printf("Right node insertion: %s @ %d \n", key, rightNode(index));
                }
                else // else if the node is full, recall the function
                    if (index + 1 > bst->size)
                    {
                        printf("Out of bounds error.  Please use a larger tree size.");
                        exit(1);
                    }
                    else
                        insert_recursion(bst, key, data, rightNode(index));
            }
        }
    }
    
    
    // Input: bst, index
    // Effect: print_node for each node inorder traversal
    
    
    void traversal_recursion(BStree bst, int index)
    {
        if(bst->is_free[index] == '0') // if index is full
        {
            traversal_recursion(bst, leftNode(index)); // traverse the left nodes
            print_node(bst->tree_nodes[index]); // print the node
            traversal_recursion(bst, rightNode(index)); // traverse the right nodes
        }
    }
    
    
    // Input: index of the node
    // Effect: returns the index of the node left of index
    int leftNode(int index){
        return (index * 2);
    }
    
    
    // Input: index of the node
    // Effect: returns the index of the node right of index
    int rightNode(int index){
        return ((index * 2) + 1);
    }
    I'm configuring my main.c right now to accept user input for nodes to add (key & data pairs).

  5. #5
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    bstree_traversal doesn't need the local variable:
    Code:
    void bstree_traversal(BStree bst)
    {
        traversal_recursion(bst, 1);
    }
    You aren't checking for failure of key_dup. It returns NULL if its malloc call fails. I would make a create_node function:
    Code:
    void die(const char *msg)
    {
        fprintf(stderr, "Error: %s\n", msg);
        exit(EXIT_FAILURE);
    }
    
    void create_node(BStree bst, char *key, int data, int index)
    {
        if ((bst->tree_nodes[index].key = key_dup(key)) == NULL)
            die("create_node");
        bst->tree_nodes[index].data = data;
        bst->is_free[index] = '0';
    }
    You are still confused about insert_recursion and making it WAY more complicated than necessary. You certainly don't need a special "base case" for the tree root! Remember that it is initially called with an index of 1. So all you need is:
    Code:
    void insert_recursion(BStree bst, char *key, int data, int index)
    {
        if (index > bst->size)  //// not index+1 (since the array was created with a size of size+1)
            die("insert_recursion");
        else if (bst->is_free[index] == '1')
            create_node(bst, key, data, index);
        else
        {
            int cmp = strcmp(key, bst->tree_nodes[index].key);
            if (cmp == 0)
                printf("Key {%s} already exists.\n", key);
            else if (cmp > 0)
                insert_recursion(bst, key, data, rightNode(index));
            else
                insert_recursion(bst, key, data, leftNode(index));
        }
    }

  6. #6
    Registered User
    Join Date
    Feb 2017
    Posts
    27
    Hey algorism,

    Thanks again for helping, I'll go through and do my best to fix those on my own.

    In the meantime, I've been working on my main function, which is supposed to read from stdin (or redirect from a file) the pairs of strings and ints used for the key & data (string followed by int, one per line).

    Having looked stuff up, I decided to try to implement a while loop with a scanf condition, but my implementation is clearly incorrect given running it crashes the program once the loop is entered.

    main.c
    Code:
    #include <stdio.h>
    #include "bst.h"
    
    
    int main(void)
    {
        BStree bst;
        int size;
        char quit;
        char *str;
        int num;
        printf("Please enter the size of the tree: ");
        scanf("%d", &size);
        bst = bstree_ini(size);
        printf("Please enter the first key (str) & data (int) you wish to enter, separated by whitespace: ");
        while ((scanf(" %s %d", str, &num)) == 2) {
            bstree_insert(bst, *str, num);
            printf("Please enter Q/q if you wish to stop entering, else continue: ");
            scanf(" %c", &quit);
            if(quit == 'Q' || quit == 'q')
                break;
            printf("Please enter the new key (str) then the data (int): ");
            scanf("%s %d", str, &num);
        }
        bstree_traversal(bst);
        bstree_free(bst);
    }
    I'm unsure if there's some logic I'm missing, or if my implementation is completely wrong. Gonna keep on trucking though, thanks again for all the help <3

  7. #7
    Registered User
    Join Date
    Feb 2017
    Posts
    27
    Fixed, here's my current main.c:

    Code:
    #include <stdio.h>
    #include "bst.h"
    
    
    
    
    int main(void)
    {
        BStree bst;
        int size, num;
        char quit, *str[256];
        printf("Please enter the size of the tree: ");
        scanf("%d", &size);
        bst = bstree_ini(size);
        printf("Please enter the first key (str) & data (int) you wish to enter, separated by whitespace: ");
        while ((scanf("%s %d", str, &num)) == 2) {
            bstree_insert(bst, str, num);
            printf("Please enter Q/q if you wish to stop entering, else continue: ");
            scanf(" %c", &quit);
            if(quit == 'Q' || quit == 'q')
                break;
            printf("Please enter the new key (str) then the data (int): ");
        }
        bstree_traversal(bst);
        bstree_free(bst);
    }
    It correctly reads in user string + int pairs, and inserts them into the nodes (which is then inserted into the tree correctly, etc.).

    My final question is, is this implementation compatible with file redirection? My current implementation is messy I feel, since I reprompt the user each time if they want to quit.
    Last edited by kapkong; 03-17-2017 at 02:46 PM.

  8. #8
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    str shouldn't be an array of pointers, just a regular string: char str[256]. Your compiler should warn you about that. Remember to always maximize your warning level and don't ignore the warnings.

    Prompting is not generally compatible with redirection. However, your teacher is an idiot judging by this assignment (and not telling you to maximize compiler warnings). But obviously you'll have to get rid of the "do you want to quit" prompt. To quit when entering from the keyboard, you enter the end-of-file key combination which is ctrl-d (or ctrl-z if you use Windows).

  9. #9
    Registered User
    Join Date
    Feb 2017
    Posts
    27
    @algorism,

    Once again, thank you so much for all the help! I didn't know about the Ctrl-D command (yet another deficiency in my prof's curriculum), which makes things a lot simpler.

    And yeah, this prof is not particularly well-liked by students due to his ridiculous assignments relative to how little material his lectures cover. I'm already dreading the future course I have with him (thankfully just one other).

    You're a lifesaver <3

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. strtok difficulties
    By joatmon in forum C Programming
    Replies: 4
    Last Post: 09-03-2013, 03:49 PM
  2. difficulties with UNIX
    By tytelizgal in forum Linux Programming
    Replies: 1
    Last Post: 10-18-2008, 04:29 AM
  3. Help! Pointer difficulties.
    By Tyrant in forum C Programming
    Replies: 16
    Last Post: 11-17-2007, 01:59 PM

Tags for this Thread