Thread: Binary Search Tree sort

  1. #16
    Registered User
    Join Date
    Dec 2017
    Posts
    24
    Quote Originally Posted by jimblumberg View Post
    What instructions? IMO, that typedef is horrible and all it really does is aid in confusion.


    What examples are you talking about, I don't see anything "below"?

    The following is probably a good reason using that typdef is confusing:

    Code:
    void merge(ARRAY A, int left, int middle, int right){
        ARRAY TEMP;
    ...
    What is the size of the TEMP array?

    Why is it that size?



    If you don't understand dynamic memory then perhaps you shouldn't be using such large numbers for your array sizes.
    *above sorry, I mistook because I'm not native english so I don't speak english the best way.
    Don't be so agressive, I HAD to use typedef because I HAVE instructions in my homework and I HAVE to use such large arrays. What don't you understand in this ?

    I didn't say I don't understand dynamic memory, I just said I'm not used to use them

  2. #17
    Registered User
    Join Date
    Dec 2017
    Posts
    24
    Quote Originally Posted by jimblumberg View Post
    What instructions? IMO, that typedef is horrible and all it really does is aid in confusion.


    What examples are you talking about, I don't see anything "below"?

    The following is probably a good reason using that typdef is confusing:

    Code:
    void merge(ARRAY A, int left, int middle, int right){
        ARRAY TEMP;
    ...
    What is the size of the TEMP array?

    Why is it that size?



    If you don't understand dynamic memory then perhaps you shouldn't be using such large numbers for your array sizes.
    By the way, there is little evidence that TEMP has the size defined by his type ARRAY : N elements !

  3. #18
    Registered User
    Join Date
    Dec 2017
    Posts
    24
    Quote Originally Posted by john.c View Post
    The easiest way to allow the allocation of a large object is to allocate the large object statically.
    Code:
    int main(void) {
        static ARRAY a;
    Dynamic allocation is better, though, since it allows you to let the user choose the size. Get rid of the ARRAY typedef and just accept the array as an int* in functions.
    Code:
    int main(int argc, char **argv) {
        if (argc != 2) {
            fprintf(stderr, "Usage: treesort ARRAYSIZE\n");
            exit(EXIT_FAILURE);
        }
    
        int size = atoi(argv[1]);
        if (size < 1) {
            fprintf(stderr, "Error: ARRAYSIZE must be > 0\n");
            exit(EXIT_FAILURE);
        }
    
        int *a = malloc(size * sizeof *a);
        if (a == NULL) {
            perror("Error: malloc");
            exit(EXIT_FAILURE);
        }
    
        // ...
    
        free(a);
        return 0;
    }
    One more time, Thank you very much for helping me john.c. You are a pearl !

  4. #19
    Registered User
    Join Date
    May 2010
    Posts
    4,632
    I HAD to use typedef because I HAVE instructions in my homework and I HAVE to use such large arrays.
    What does using that horrible typedef have to do about using such large arrays?

    By the way, there is little evidence that TEMP has the size defined by his type ARRAY : N elements !
    What? There is a lot of evidence that TEMP is defined with a size of N, which is probably about twice as large as it needs to be.

  5. #20
    Registered User
    Join Date
    Dec 2017
    Posts
    24
    Quote Originally Posted by jimblumberg View Post
    What does using that horrible typedef have to do about using such large arrays?


    What? There is a lot of evidence that TEMP is defined with a size of N, which is probably about twice as large as it needs to be.
    So, if there is no correlation between them, why should I get rid of it ?

    +It is quite normal that TEMP has the same size as ARRAY, as TEMP is used to fill ARRAY in the merge function. --> I already tried to reduce it but the algorithm won't fullfill the desired purpose.
    The merge algorithm I wrote is as clear as possible and has just what it needs where it is needed.

  6. #21
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by ElPosto View Post
    TEMP has the same size as ARRAY
    Modified example where the temp array allocation is only as large as it needs to be:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    void merge(int a[], int left, int middle, int right){
        /* only allocate what is needed */
        int * tmp = malloc((right+1-left)*sizeof(int));
        int i = left, j = middle + 1, k = 0;
        while(i <= middle && j <= right){
            if(a[i] <= a[j])
                tmp[k++] = a[i++];
            else
                tmp[k++] = a[j++];
        }
        while(i <= middle)
            tmp[k++] = a[i++];
        while(j <= right)
            tmp[k++] = a[j++];
        for(i = left; i <= right; i++)
            a[i] = tmp[i-left];        /* this handles variable size tmp[] */
        free(tmp);
    }
    
    
    void merge_sort_gen(int a[], int left, int right){
        int middle;
        if(left < right){
            middle = (left + right) / 2;
            merge_sort_gen(a, left, middle);
            merge_sort_gen(a, middle + 1, right);
            merge(a, left, middle, right);
        }
    }
    
    void merge_sort(int a[], int n){
        merge_sort_gen(a, 0, n-1);
    }
    
    #define N (1024*1024)
    
    int main(void){
        int *a = malloc(N * sizeof(int));
        int i;
        for(i = 0; i < N; i++)
            a[i] = rand();
        merge_sort(a, N);
        for(i = 1; i < N; i++){
            if(a[i-1] > a[i]){
                printf("sort failed\n");
                break;
            }
        }
        if(i == N)
            printf("sort passed\n");
        free(a);
        return 0;
    }

  7. #22
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    A somewhat optimized top down merge sort that does a one time allocation and uses two mutually recursive routines to change the direction of merge based on level of recursion instead of copying data in the merge function. This version is a bit over twice as fast as the example in my prior post. The right index in this example is the index to the end of the array == index to last element + 1.

    As for the original post, I'll add an example of using a linked list merge sort and converting to binary search tree later if it turns out to be faster.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    void SplitMergeAtoA(int a[], int b[], size_t ll, size_t ee);
    void SplitMergeAtoB(int a[], int b[], size_t ll, size_t ee);
    void Merge(int a[], int b[], size_t ll, size_t rr, size_t ee);
    
    void MergeSort(int a[], size_t n)       // entry function
    {
    int *b;
        if(n < 2)                           // if size < 2 return
            return;
        b = malloc(n*sizeof(int));
        SplitMergeAtoA(a, b, 0, n);
        free(b);
    }
    
    void SplitMergeAtoA(int a[], int b[], size_t ll, size_t ee)
    {
    size_t rr;
        if((ee - ll) == 1)                  // if size == 1 return
            return;
        rr = (ll + ee)>>1;                  // midpoint, start of right half
        SplitMergeAtoB(a, b, ll, rr);
        SplitMergeAtoB(a, b, rr, ee);
        Merge(b, a, ll, rr, ee);            // merge b to a
    }
    
    void SplitMergeAtoB(int a[], int b[], size_t ll, size_t ee)
    {
    size_t rr;
        if((ee - ll) == 1){                 // if size == 1 copy a to b
            b[ll] = a[ll];
            return;
        }
        rr = (ll + ee)>>1;                  // midpoint, start of right half
        SplitMergeAtoA(a, b, ll, rr);
        SplitMergeAtoA(a, b, rr, ee);
        Merge(a, b, ll, rr, ee);            // merge a to b
    }
    
    void Merge(int a[], int b[], size_t ll, size_t rr, size_t ee)
    {
    size_t o = ll;                          // b[]       index
    size_t l = ll;                          // a[] left  index
    size_t r = rr;                          // a[] right index
        while(1){                           // merge data
            if(a[l] <= a[r]){               // if a[l] <= a[r]
                b[o++] = a[l++];            //   copy a[l]
                if(l < rr)                  //   if not end of left run
                    continue;               //     continue (back to while)
                while(r < ee)               //   else copy rest of right run
                    b[o++] = a[r++];
                break;                      //     and return
            } else {                        // else a[l] > a[r]
                b[o++] = a[r++];            //   copy a[r]
                if(r < ee)                  //   if not end of right run
                    continue;               //     continue (back to while)
                while(l < rr)               //   else copy rest of left run
                    b[o++] = a[l++];
                break;                      //     and return
            }
        }
    }
    
    #define N (16*1024*1024)
    
    int main(void){
    int *a = malloc(N * sizeof(int));
    int i;
    clock_t ctTimeStart;            // clock values
    clock_t ctTimeStop;
        for(i = 0; i < N; i++)
            a[i] = rand();
        ctTimeStart = clock();
        MergeSort(a, N);
        ctTimeStop = clock();
        for(i = 1; i < N; i++){     // check for error 
            if(a[i-1] > a[i]){
                printf("sort failed\n");
                break;
            }
        }
        if(i == N){
            printf("sort passed\n");
            printf("# of ticks %d\n", ctTimeStop-ctTimeStart);
        }
        free(a);
        return 0;
    }
    Last edited by rcgldr; 12-31-2017 at 08:14 PM.

  8. #23
    Registered User
    Join Date
    May 2010
    Posts
    4,632
    So, if there is no correlation between them, why should I get rid of it ?
    Because that typedef is trying to hide something that shouldn't be hidden.

    Do you realize that the following are all equivalent?
    Code:
    int someFunction1(int *value);
    int someFunction2(int value[]);
    int someFunction3(int value[10]);
    Also note that the number in that last line is completely ignored by the compiler. You could call that function with an array with the size of one million and the code will still compile.

  9. #24
    Registered User
    Join Date
    Dec 2017
    Posts
    24
    Quote Originally Posted by john.c View Post
    The easiest way to allow the allocation of a large object is to allocate the large object statically.
    Code:
    int main(void) {
        static ARRAY a;
    Dynamic allocation is better, though, since it allows you to let the user choose the size. Get rid of the ARRAY typedef and just accept the array as an int* in functions.
    Code:
    int main(int argc, char **argv) {
        if (argc != 2) {
            fprintf(stderr, "Usage: treesort ARRAYSIZE\n");
            exit(EXIT_FAILURE);
        }
    
        int size = atoi(argv[1]);
        if (size < 1) {
            fprintf(stderr, "Error: ARRAYSIZE must be > 0\n");
            exit(EXIT_FAILURE);
        }
    
        int *a = malloc(size * sizeof *a);
        if (a == NULL) {
            perror("Error: malloc");
            exit(EXIT_FAILURE);
        }
    
        // ...
    
        free(a);
        return 0;
    }
    BTW, using static makes the test program run considerably faster. Insted of having a 200,000 element array sorted in 0,33 secs, it is now sorted in 0,04 with the merge sort ! Is this normal ?

  10. #25
    Registered User
    Join Date
    Dec 2017
    Posts
    24
    Quote Originally Posted by ElPosto View Post
    BTW, using static makes the test program run considerably faster. Insted of having a 200,000 element array sorted in 0,33 secs, it is now sorted in 0,04 with the merge sort ! Is this normal ?
    I've done some further tests to know if the arrays to be sorted were the sames, which could have explained why it's run so fast, but it's now sure they contain different values

  11. #26
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    The allocate for the array should be allocating based on the size of integers, not the size of a pointer to integer (this would make a difference with a 64 bit build, 64 bit pointers, 32 bit integers).

    Code:
    /* ... */
        int *a = malloc(size * sizeof(int));

  12. #27
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    Quote Originally Posted by ElPosto View Post
    I've done some further tests to know if the arrays to be sorted were the sames, which could have explained why it's run so fast, but it's now sure they contain different values
    That seems like an inordinate speedup. I'd have to see both versions of the code.
    A little inaccuracy saves tons of explanation. - H.H. Munro

  13. #28
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by ElPosto View Post
    Instead of having a 200,000 element array sorted in 0.33 secs, it is now sorted in 0.04 with the merge sort ! Is this normal ?
    With the merge sort I posted in my prior post, I get a time of 0.014 seconds for 200,000 element array on my system (Intel 3770K 3.5ghz), so the 0.04 seems reasonable, while the 0.33 seems unusually slow.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Binary Search Tree-search method help?
    By shocklightning in forum C++ Programming
    Replies: 5
    Last Post: 03-25-2012, 10:57 PM
  2. binary search tree
    By angel34 in forum C Programming
    Replies: 0
    Last Post: 12-10-2010, 03:32 PM
  3. Replies: 0
    Last Post: 11-04-2006, 11:07 AM
  4. A Binary Search Tree of... Binary Search Trees...
    By SlyMaelstrom in forum C++ Programming
    Replies: 5
    Last Post: 12-10-2005, 02:12 PM
  5. Search Engine - Binary Search Tree
    By Gecko2099 in forum C Programming
    Replies: 9
    Last Post: 04-17-2005, 02:56 PM

Tags for this Thread