Like Tree6Likes

Quicksort's recursion

This is a discussion on Quicksort's recursion within the C Programming forums, part of the General Programming Boards category; Good afternoon. I understand recursion but I don't know how the last two instructions of quick sort (the recursive ones) ...

  1. #1
    Tears of the stars thames's Avatar
    Join Date
    Oct 2012
    Location
    Rio, Brazil
    Posts
    193

    Question Quicksort's recursion

    Good afternoon. I understand recursion but I don't know how the last two instructions of quick sort (the recursive ones) work.

    Code:
     
    #include <stdio.h>
    #include <time.h> 
    #include <stdlib.h>
    
    #ifndef MAXSIZE
      #define MAXSIZE 10
    #endif  
    
    void swap(int*, int*);
    void quicksort(int*, int);
    
    void swap(int* a, int* b)
    { 
      int temp;
      
      temp = *a; 
      *a = *b; 
      *b = temp;
    } 
    
    void quicksort(int* a, int size) 
    { 
       int pivot, left, right;
       left = 0;
       right = size - 1;
       
       srand(time(NULL));   
       
       if(size < 2) 
         return; 
       pivot = a[rand() % size]; 
       while(left < right) 
       {  
         while(a[left] < pivot)
           left = left + 1;
         while(a[right] > pivot)
           right = right - 1;
         swap(&a[left], &a[right]);    
       }          
       
       /* I don't know how the recursions below work. */
       
       /* ---------------------------------------------------------- */
       quicksort(a, left);
       quicksort(&a[left + 1], size - left - 1);    
       /* ---------------------------------------------------------- */
    
    }
    
    int main(void) 
    { 
        int i;
        int a[MAXSIZE] = {23, 45, 67, 12, 89, 77, 34, 11, 9, 2};
        
        printf("The unordered array: ");
        for(i = 0; i < MAXSIZE; i++)
          printf("%d ", a[i]);  
        putchar('\n');
        
        quicksort(a, MAXSIZE);
        printf("Array after quick sort: ");
        for(i = 0; i < MAXSIZE; i++)
          printf("%d ", a[i]);
        putchar('\n');  
        
        return 0;    
    }

  2. #2
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Before the recursive calls, you have taken an array, and partitioned it into two parts - one with values above the pivot, and one with values below the pivot. In the recursive calls, each of those two sub-arrays will themselves, be partitioned into two sub-arrays, according to the new pivot value.

    The splitting will continue until the number of elements is just two. After that, the sub-array will be sorted, and the return(s) will be made from the recursive calls. The beauty of it is the way that the stack handles the returns so naturally. This looks like magic compared to Quicksort which is coded up without recursive calls.

    srand() should be made once, in main(), before the recursive function.
    Last edited by Adak; 11-23-2012 at 09:59 AM.

  3. #3
    Tears of the stars thames's Avatar
    Join Date
    Oct 2012
    Location
    Rio, Brazil
    Posts
    193
    Many thanks Adak.

  4. #4
    Registered User
    Join Date
    Oct 2011
    Posts
    868
    I'd like to point out a couple of things. First:

    Quote Originally Posted by thames View Post
    Code:
       srand(time(NULL));   
      
       pivot = a[rand() % size];
    The above is wasteful. On every invocation, you'll reinitialize the random number generator, based on the current time in seconds. Because the function is called recursively, it will be called in rapid succession, and the rand() will always return the same value. So, it does not really yield a random pivot.

    The median-of-three pivot selection is found to be just as good in practice. Basically, you take the first (a[0]), middle (a[size/2]), and last value (a[size-1]), sort the three, and use the middle one (the median of the three) as the pivot.

    Second, your loop which moves values less than pivot to the beginning of the array and values greater than the pivot to the end of the array, does not handle pivots correctly. If you supply it with an array containing repeated values, like 1, 5, 1, 1, 5, 2, 2, 4, 4, 4, it will get stuck (at least some of the time).

    You need to decide what you do with the pivot values and handle them accordingly.

    The typical approach is to first swap the pivot with the final element in the array. After you have sorted the left side of the array, you swap the pivot just after the left side, and sort the rest of the array (not including the pivot). To do that, you can either swap a random index with the final element in the array and take the pivot value from the final element, or use the median-of-three pivot selection in a way that puts the middle value at the end of the array. I warmly recommend the median-of-three pivot selection over random pivot selection.
    Last edited by Nominal Animal; 11-23-2012 at 11:04 AM.
    thames likes this.

  5. #5
    Tears of the stars thames's Avatar
    Join Date
    Oct 2012
    Location
    Rio, Brazil
    Posts
    193
    On every invocation, you'll reinitialize the random number generator
    I put srand(time(NULL)) inside main as Adak suggested. I didn't see a median-of-three pivot selection yet. I'd appreciate if you could point to a link which shows the implementation you mentioned (since there are many algorithms. Besides, KnR quicksort's implementation is different from the one I chose).

  6. #6
    Registered User
    Join Date
    Oct 2011
    Posts
    868
    Quote Originally Posted by thames View Post
    I put srand(time(NULL)) inside main as Adak suggested.
    Yes, that is the correct location, when you use a random pivot selection.

    I don't think Adak noticed the issue with repeated pivots: your code sometimes gets stuck, if there are repeated numbers in the array.

    See the Wikipedia article on Quicksort, in particular the pseudocode for the in-place Quicksort, as the paragraph following the pseudocode explains exactly how that is solved by moving the pivot to the end of the array, and swapping it back just before sorting the right side.

    Quote Originally Posted by thames View Post
    I didn't see a median-of-three pivot selection yet.
    I'm sure you'll find a reference to it in your book. If not, then just search for it.

    It really is as simple as it sounds: you pick the median (middle value) of the first, middle, and last values of the array, as your pivot.

    Like I said, you should do it in a way that puts the middle value of the three at the end of the array, and uses that value as the pivot.

    If a[0] is between a[middle] and a[right], then a[0] is the median, and you swap a[0] and a[right].
    If a[middle] is between a[0] and a[right], then a[middle] is the median, and you swap a[middle] and a[right].
    Otherwise a[right] is the median of the three, and you let it be where it is:
    Code:
        left = size / 2;
    
        /* Swap the median of a[0], a[left], a[right], with a[right]. */
        if ((a[0] <= a[left] && a[0] >= a[right]) ||
            (a[0] >= a[left] && a[0] <= a[right]))
            swap(&a[0], &a[right]); /* a[0] was the median */
        else
        if ((a[left] <= a[0] && a[left] >= a[right]) ||
            (a[left] >= a[0] && a[left] <= a[right]))
            swap(&a[left], &a[right]); /* a[left] was the median */
        /* else no swap needed, since a[right] is the median. */
    
        pivot = a[right];
        left = 0;
    In your partitioning loop, you need to accept pivots on the right side, or the while loop will get stuck, if it encounters a pivot on the left side. (Your current code will just swap the pivot between a[left] and a[right], ad infinitum, without increasing left or decreasing right. That's how it gets stuck.)

    Your second inner while clause will have to be (left < right && a[right] >= pivot) , since you cannot assume left side does not contain pivots. Unless you verify (left < right), you might fall out from the left side, if the first value on the left side matches the pivot or there are no values smaller than the pivot in the array!

    Before recursing into the right side, you need to swap the pivot back, i.e. swap(&a[left], &a[size-1]); . Your current recursive calls are correct, since that will move the pivot back to its correct position between the two sides, and the latter call sorts the rest of the array, i.e. elements a[left+1] to a[size-1], inclusive.

    I did verify your code by applying the above changes, then running a large set of random arrays, verifying the code produces the correct, expected results. (Like I said, your current code sometimes gets stuck if the array contains repeated values.) I'm sure you can modify your code accordingly; it should not be too difficult.
    thames likes this.

  7. #7
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    I don't move the pivot out of the way with Quicksort - although that is the classic way of handling it.

    Sorting program with three rounds of sorting: random, ascending, and descending integers, each round is sorted by Bubble sort, Insertion sort, Quicksort, Optimized Quicksort, and Shell sort.

    In the second round, notice how Bubble and Insertion sort, ROCK!

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    #define INSERT 64
    #define SIZE   100000
    
    void bubbleSort(void);
    int checkIt(void);
    void insertionSort(int lo, int hi); 
    void quicksortOpt(int lo, int hi, int optimal);
    void fillArray(void);
    void shellSort(void);
    void swap(int i, int j);   
    
    int A[SIZE]={0};
    unsigned long long int SWAPS=0;
    int TYPE=0; 
    
    int main(void) {
       int i,j;
       char type[3][11]={{"random"},{"ascending"},{"descending"}};
       
       clock_t start,stop;
    
       for(j=0;j<3;j++) {
          TYPE=j;
          fillArray();
          printf("\nSorting %d integers, initially in %s order. The first 10 numbers:\n\n",SIZE,type[TYPE]);
          for(i=0;i<10;i++)  printf("%6d  ",A[i]); 
          printf("\n");
       
          SWAPS=0;
          start=clock();
          bubbleSort();
          stop=clock();
          printf("        Bubble sort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
          if(checkIt());
    
          SWAPS=0;
          fillArray();
          start=clock();
          insertionSort(0,SIZE);
          stop=clock();
          printf("     Insertion sort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
          if(checkIt());
    
          SWAPS=0;
          fillArray();
          start=clock();
          quicksortOpt(0,SIZE-1, 0);
          stop=clock();
          printf("          Quicksort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
          if(checkIt());
    
          SWAPS=0;
          fillArray();
          start=clock();
          quicksortOpt(0,SIZE-1, 1);
          stop=clock();
          printf("Optimized Quicksort: %15llu swaps in %10.4f seconds\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
          if(checkIt());
    
          SWAPS=0;
          fillArray();
          start=clock(); 
          shellSort();
          stop=clock();
          printf("         Shell sort: %15llu swaps in %10.4f seconds\n\n",SWAPS,(double)(stop-start)/CLOCKS_PER_SEC);
          if(checkIt()); 
       } 
    
       for(i=0;i<10;i++)
          printf("%6d  ",A[i]);
       
       
       printf("\n");
       return 0;
    }
    //Standard and Optimized Quicksort
    void quicksortOpt(int lo, int hi, int optimal) {
       int i, j, pivot;
       if(lo == hi) return; 
    
      if(optimal && (hi - lo) < INSERT) {  //The optimizer. Value of
        insertionSort(lo, hi+1);           //INSERT will vary 
        return;
      }
    
       i=lo; 
       j=hi;
       pivot= A[(lo+hi)/2]; 
    
      /* Split the array into two parts */
       do {    
          while (A[i] < pivot) i++; 
          while (A[j] > pivot) j--;
          if (i<=j) {  
             swap(i, j);   //this line works fine.
             SWAPS++;
             i++;
             j--;
          }
       } while(i<=j);
      
       if (lo < j) quicksortOpt(lo, j,optimal);
       if (i < hi) quicksortOpt(i, hi,optimal);
    }
    void bubbleSort() {   
       int i, n, swapped;
       n = SIZE-1;
       do {
          swapped = 0;
          for(i =0; i < n; i++) {  
             if(A[i] > A[i + 1 ]) {
                swap(i, i + 1);
                SWAPS++; 
                swapped = 1;
             }
          }
          --n;
       }while(swapped);
    }
    void insertionSort(int lo, int hi) {
       int i, j, val; 
       for(i=lo+1;i<hi;i++) {  
          val = A[i];
          j = i-1;
          while(A[j]>val) {
             A[j + 1] = A[j];
             ++SWAPS;
             --j;
             if(j<0) 
                break;
          }   
          A[j+1] = val;
       }
    }
    //Shell sort
    void shellSort(void) {
      int i, j, gap, temp;
      for(gap = SIZE/2; gap > 0; gap /= 2) {
        for(i = gap; i < SIZE; i++) {
          for(j = i - gap; j >= 0 && A[j] > A[j + gap]; j -= gap) {
    	      ++SWAPS;
             temp = A[j];
    	      A[j] = A[j + gap];
            A[j + gap] = temp;
          }
        }
      }
    }
    void swap(int i, int j) {    
       int temp;
       temp = A[i];
       A[i] = A[j];
       A[j] = temp;
    }
    void fillArray(void) {
       int i,j;
       if(TYPE==0) {         //random order
          for(i = 0; i < SIZE; i++)
             A[i] = (rand() % 10000)+1;
       }else if(TYPE==1) {   //ascending order
          for(i=0;i<SIZE;i++)
              A[i]=i;
       }else if(TYPE==2) {   //descending order
          for(j=0,i=SIZE-1;i>=0;i--,j++)
             A[j]=i;
       }
    }
    int checkIt(void) {
       int i;
       for(i=0;i<SIZE-1;i++) {
          if(A[i]>A[i+1]) {
             printf("Error! Not sorted in ascending order.\n");
             return 1;
          }
       }
       return 0;
    }
    thames likes this.

  8. #8
    Registered User
    Join Date
    Oct 2011
    Posts
    868
    Quote Originally Posted by Adak View Post
    I don't move the pivot out of the way with Quicksort
    Oh no, that's not what I meant. I meant that in post #2, you did not notice that thames' code will get stuck (never complete) if the pivot occurs more than once in the array.

    Quote Originally Posted by Adak View Post
    Sorting program with three rounds of sorting: random, ascending, and descending integers, each round is sorted by Bubble sort, Insertion sort, Quicksort, Optimized Quicksort, and Shell sort.
    No selection sort?

    I prefer to use selection sort (instead of insertion sort) for small lists (< 32 elements, typically, if sorting ints), especially with quicksort.

    For quicksort, in cases when I know there is a lot of repeated values, I like to delete the pivots, then insert them back between the sorted arrays afterwards. It seems to give a speed boost (due to reduced recursion) when there are relatively many pivot values, but the added code complexity may make it a bit slower than a typical optimized quicksort.

    Hey, how about we start a thread comparing sort function implementations? Perhaps in the contest board? Combining some of our approaches we might arrive at something even better? (Yeah, I'm bored.)

  9. #9
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Quote Originally Posted by Nominal Animal View Post
    Oh no, that's not what I meant. I meant that in post #2, you did not notice that thames' code will get stuck (never complete) if the pivot occurs more than once in the array.
    I noticed the i < j, instead of i <= j, part. That always gets my version of Quicksort, into a snit - but then I thought "if he was getting stuck in an infinite loop, he'd mention it". When he's ready.

    I didn't run his program, however.

    No selection sort?

    I prefer to use selection sort (instead of insertion sort) for small lists (< 32 elements, typically, if sorting ints), especially with quicksort.
    I've always found Selection sort to be the simplest sorting algorithm, (inuitively, it gets a 10), but unfortunately, it's always been the slowest one also. I've never took a liking to it, for that reason. Aside from the low number of swaps it makes, I don't know anything to recommend it.

    On small (say less than 64) number of items, especially if they're partly sorted, you can't beat Insertion sort, AFAIK. That's why it works so well with the small sub-arrays from recursive Quicksort.

    For quicksort, in cases when I know there is a lot of repeated values, I like to delete the pivots, then insert them back between the sorted arrays afterwards. It seems to give a speed boost (due to reduced recursion) when there are relatively many pivot values, but the added code complexity may make it a bit slower than a typical optimized quicksort.
    Hey, how about we start a thread comparing sort function implementations? Perhaps in the contest board? Combining some of our approaches we might arrive at something even better? (Yeah, I'm bored.)
    My plates piled high right now. I've got a decade old chess game I want to dig into (if I can find it), a Milestone program for my folding@home team, to revise, and my latest idea is to see if I can gather up a frequency word list of British GI's messages in WWII, and possibly decode a pigeon's message that was just found on the dead pigeon's leg, in a chimney in England. (a far fetched idea, I'll grant)

    Can't help your boredom, I'm afraid. There's always Project Euler, SPOJ, and Code Chef.

    And learning Go - an interesting language, imo.
    thames likes this.

  10. #10
    Tears of the stars thames's Avatar
    Join Date
    Oct 2012
    Location
    Rio, Brazil
    Posts
    193
    I tried to follow Wikipedia's in-place version however my code isn't working:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    void swap(int*, int*);
    void quicksort(int*, int, int);
    int partition(int*, int, int, int);
    int getMedian(int* a);
    
    int partition(int* a, int left, int right, int pivotIndex)
    { 
       int pivotValue, storeIndex, i;
       
       pivotValue = a[pivotIndex];
       swap(&a[pivotIndex], &a[right]);    
       storeIndex = left;
       for(i = left; i < right - 1; i++)
       {  
         if(a[i] < pivotValue)
         {  
           swap(&a[i], &a[storeIndex]);
           storeIndex = storeIndex + 1;
         } 
       } 
      
      swap(&a[storeIndex], &a[right]);      
      return storeIndex;
    }     
    
    int getMedian(int* a) 
    { 
       int i, nElems; 
       nElems = 0;
       
       for(i = 0; a[i] != '\0'; i++) 
         nElems++;
       
       return (nElems % 2 == 0) ? (nElems + 2) / 2 :  (nElems + 1) / 2;
    }     
    
    
    void quicksort(int *a, int left, int right) 
    { 
       int pivotIndex;
       if(left < right) 
       { 
          pivotIndex = getMedian(a);
          pivotIndex = partition(a, left, right, pivotIndex);
          quicksort(a, left, pivotIndex - 1);
          quicksort(a, pivotIndex + 1, right);       
        }            
    }     
    
    void swap(int* a, int* b)
    { 
      int temp;
      
      temp = *a; 
      *a = *b; 
      *b = temp;
    } 
    
    #ifndef MAXSIZE
      #define MAXSIZE 10
    #endif
    
    int main(void) 
    { 
      int i;
      int a[MAXSIZE] = {13, 13, 2, 2, 6, 7, 9, 8, 25, 11};
      
      printf("Unordered elements: ");
      for(i = 0; i < MAXSIZE; i++)       
        printf("%d ", a[i]);
      putchar('\n');
      
      quicksort(a, a[0], a[MAXSIZE - 1]);
      printf("Array after quicksort: ");
      for(i = 0; i < MAXSIZE; i++)       
        printf("%d ", a[i]);
      putchar('\n');
      
      return 0;    
    }

  11. #11
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Whenever I think of Quicksort, I picture a hundred kids out on the school yard, waiting to be sorted by height.

    Two kids in red sweaters are standing by at each end of the long straight line. A teacher puts a gold colored hat on a kid somewhere near the middle of the line, and the two kids in red sweaters, start walking down the line toward each other, comparing each kid's height, to the height of the pivot kid with the golden cap.

    When each of the red sweater kids have found someone, they yell "swap", and the two kids they're next to, go running to the other's position in the line. That continues until the two red sweater kids, meet.

    Now the entire line left of the meeting spot of the kids, takes a big step forward, and the red sweater kids run to get back into their places at the end of the new "line" of kids. Meanwhile, a new pivot student is chosen, and given the golden cap to wear. Now the partitioning continues anew, with this line.

    Finally, the new line consists of just one or two students in the line, and they are partitioned by height, as well. Now this short line takes one big step backward, and rejoins the next to last line. Right back into the right place. (the beauty of the stack). This step back is repeated several times, until the entire student line has been reassembled.

    But now it's all sorted.

    First, everybody uses Quicksort for in-place sorting. I've never heard it even mentioned as an external/auxiliary memory sorting algorithm. So that first algorithm in Wiki is "on the right ladder, but learning against the wrong house".

    Second, you don't want to count all the values in the array, before you choose a pivot - you'll be slowing it down a lot on big arrays, because every sub-array has to have a pivot selected, not just the first array. And Quicksort is used for big arrays - you can use Shell or Insertion sort for smaller sorting jobs.

    Third, the pseudo code they go into, is useful to help illustrate their discussion, not to make a coherent practical program. There is no reason to make Quicksort into 3 functions. If you're going to make it so onerous to use it, then very few people would bother with it. That's the problem with FLASH sort and Comb sort - they work fine, but they're way to long to be practical - and can't beat Quicksort, anyway.

    That's why I like that version I posted so much - it's clear - I can see those kids out on the school yard, being sorted, and it's fast. There are faster versions, (not many), but those that are faster, begin to lose their clarity, imo.

    Look over the version I previously posted, and see if you can't picture the kids out on the school yard, waiting to be sorted by height.

    I believe this is from K&R. It's shorter and easier to understand, as well.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    /* from: The Practice of Programming (pp: 32-34)
     *   by: Brian W. Kernighan and Rob Pike
     */
    /* swap: interchange v[i] and v[j] */
    void swap( int v[], int i, int j )
    {
      int tmp;
      tmp = v[i];
      v[i] = v[j];
      v[j] = tmp;
    }
    /* quicksort: sort v[0]..v[n-1] into increasing
     * order
     */
    void quicksort( int v[], int n )
    {
      int i = 0, last = 0;
      if ( n <= 1 )              /* nothing to do */
        return;
      swap( v, 0, rand() % n );  /* move pivot elem
                                    to v[0] */
      for ( i = 1; i < n; i++ )  /* partition */
        if ( v[i] < v[0] )
          swap( v, ++last, i );
      swap( v, 0, last );        /* restore pivot */
      quicksort( v, last );      /* sort smaller
                                    values */
      quicksort( v+last+1, n-last-1 );  /* sort larger
                                           values */
    }
    void print_array( const int array[], int elems )
    {
      int i;
      printf("{ ");
      for ( i = 0; i < elems; i++ )
        printf( "%d ", array[i] );
      printf("}\n");
    }
    #define NUM 9
    int main( void )
    {
      int arr[NUM] = { 6, 12, 4, 18, 3,
                      27, 16, 15, 19 };
      /* commented out to aid in predictability
       * srand( (unsigned int) time(NULL) ); */
      print_array(arr, NUM);
      quicksort(arr, NUM);
      print_array(arr, NUM);
      return EXIT_SUCCESS;
    }

  12. #12
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,308
    Quote Originally Posted by Nominal Animal View Post
    No selection sort?

    I prefer to use selection sort (instead of insertion sort) for small lists (< 32 elements, typically, if sorting ints), especially with quicksort.
    I tend to stay away from selection sort (for small N) for two reasons.
    1. It does a fixed number of comparisons and this makes it end up typically slower than say insertion sort where the number of comparisons depends on the ordering.
    2. It is not a stable sort, so it isn't suitable for producing the initial runs of e.g. a stable merge sort.

    For quicksort, in cases when I know there is a lot of repeated values, I like to delete the pivots, then insert them back between the sorted arrays afterwards. It seems to give a speed boost (due to reduced recursion) when there are relatively many pivot values, but the added code complexity may make it a bit slower than a typical optimized quicksort.
    As soon as you delete values you lose the ability to sort items which are equivalent for the purposes of comparison, but not equal in terms of every other measure.
    One can move all items equivalent to the pivot to one end, which is called "three way quicksort".
    The thing I tend to do is to check which portion is smaller, then recurse on that one and just iterate on the other. It is also the only way to implement quicksort that guarantees that you avoid stack overflow, and that includes implementations that make use of an explicit stack.

    Hey, how about we start a thread comparing sort function implementations? Perhaps in the contest board? Combining some of our approaches we might arrive at something even better? (Yeah, I'm bored.)
    That would work for me. You could also check out the sorting pages of my website if you haven't already.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  13. #13
    Registered User
    Join Date
    Oct 2011
    Posts
    868
    Quote Originally Posted by iMalc View Post
    I tend to stay away from selection sort (for small N)
    Good points (and by Adak, as well). I think I'll switch to insertion sort, too.

    Like I said earlier, the data I sort tends to be large enough for radix sorting, so I've never paid much attention to sorting small arrays.

    For lists -- my structures tend to be chained --, I've used Mergesort. I've played with combining the first pass with detecting sorted sequences (merged into one sorted sequence if more than one consecutive sequence is detected), with a three-way merge at the end. It's very simple to implement, and theoretically the time spent in the extra tests should be masked completely because that first pass is limited by memory bandwidth. I've never bothered to compare it, or even write it up.

    Quote Originally Posted by iMalc View Post
    The thing I tend to do is to check which portion is smaller, then recurse on that one and just iterate on the other.
    That sounds like a good approach: smaller list size should mean fewer iterations.

    Quote Originally Posted by iMalc View Post
    That would work for me. You could also check out the sorting pages of my website if you haven't already.
    I have, and just did again, but you use C++ and I prefer C.

    Fortunately, the way you avoided the standard C++ library features means there are very few C++-specific details (temporary arrays; I didn't really notice anything else). You also use a Windows environment, and I'm POSIX/Linux, so we'd have to first write some kind of testing harness that'd compile and run on both OSes.

    The numeric types I'm most interested in are the standard ones: int32_t, uint32_t, int64_t, uint64_t, float, and double (finite normalize values usig the IEEE-754 Binary32 and Binary64 formats, respectively).

    A Xorshift generator is fast, portable, and easy to seed (from say command-line parameter), and very suitable for generating random input data. We'd probably also need a way to load specific test cases from test files (standard input).

    For lists, it would make sense defining a node that contains only the next pointer and a value (of one of the above types), with a variable amount of padding per node.

    In Linux, I've used the dynamic link facilities (dlopen() etc.) to ensure I can play with the compiler optimizations for the tested functions without affecting the timing facilities (and I've used the TSC since my processors all have constant_tsc). If we compile the sort functions in separate units, and take care to not allow link time optimization, we should achieve the same portably. After all, timing-wise, we'll want to compare the functions, not the machines or OSes; only the relative differences of the sort functions run on a single machine are meaningful. (And, of course, also the differences in relative differences on different CPUs.)

    I'd rather have this be more like open collaboration between any interested members than a question or a competition (I don't need the stress of any sort of competition!), but there does not seem to be a suitable forum here.

    Moderators? Where to start such a discussion?

  14. #14
    Tears of the stars thames's Avatar
    Join Date
    Oct 2012
    Location
    Rio, Brazil
    Posts
    193

    Question

    I was studying the function checkIt and I understood it's evaluating if the array was already sorted. However what are the situations in which the array wouldn't be sorted?

    edit:

    is selection sort too slow for small arrays? and bubble sort ?
    Last edited by thames; 11-26-2012 at 02:50 PM.

  15. #15
    Registered User
    Join Date
    Oct 2011
    Posts
    868
    Quote Originally Posted by thames View Post
    However what are the situations in which the array wouldn't be sorted?
    Programming errors?

    Verifying the functions work is a simple form of unit testing. When I fiddle with sort functions, I write a test harness: a main() that generates test cases, sorts them, usually also timing the execution, and verifies that the data is in sorted order.

    I usually start with a test sorter, something like
    Code:
    static int compare_foo(const void *const ptr1, const void *const ptr2)
    {
        const foo val1 = *(const foo *)ptr1;
        const foo val2 = *(const foo *)ptr2;
    
        if (val1 < val2)
            return -1;
        else
        if (val1 > val2)
            return +1;
        else
            return  0;
    }
    
    void sort_foo(foo *const data, const size_t size)
    {
        qsort(data, size, sizeof (foo), compare_foo);
    }
    which I use to make sure the test harness works correctly, foo being the numeric data type I'm interested in at that point. (Just omit the qsort call to simulate a failing sort.)

    That way you don't waste time looking for bugs all over your code, but can concentrate on doing each one thing well, and then immediately test.

    Quote Originally Posted by thames View Post
    is selection sort too slow for small arrays? and bubble sort ?
    No, they are not.

    It is true that insertion sort is more efficient than selection sort, and selection sort is faster than bubble sort. The links are to Wikipedia articles, with pseudocode example implementations.

    Bubble sort has the very nice feature that it only swaps adjacent entries. It also does no extra work if the array is already sorted.

    Selection sort is easy to program correctly, and uses at most as many swaps as there are elements in the array. It does do a large number of comparisons, however.

    Insertion sort does fewer comparisons, and instead of swaps, moves parts of the array to be able to insert the value in the correct position.

    In other words, the sort algorithms have different features. I personally think that alone makes them all worth learning, because it shows how you can use different approaches to crack a nut. That there is not just one vendor-provided nutcracker you need to use for everything, but have an entire toolbox to choose from:

    Pick your tools wisely, with consideration; not with haste and conviction.

    For the optimized quicksort case, where a faster-for-smaller-arrays sort function is called when the part to sort is small, insertion sort does beat selection and bubble sort. Because quicksort is designed for large arrays, a single quicksort() invocation will involve a lot of those calls. So, picking one that is efficient means a more efficient quicksort() overall.

    Examples:

    If you have a small array of relatively large structures, both bubble sort and insertion sort would mean you'd need to swap or move the data in those structures a lot. The amount of memory copied might be the bottleneck here. And you might have other restrictions why you don't want to modify the data structure or organization. With selection sort, you know you need very few (at most the number of elements in the array). So, you could use selection sort for that case.

    If you have a small array of data you know is almost surely sorted, you could choose between insertion sort and bubble sort. Bubble sort uses swaps, whereas insertion sort uses insertion/memory moves. Bubble sort is slower, but since the array is small, the execution times you're comparing are "very short" and "even shorter": usually lost in the noise.

    Of course, a good programmer will eventually want their solution to be robust, and avoid making unconfirmed assumptions. So, instead of assuming there are just a few elements of data, you'll eventually use a sort function that checks, and uses the appropriate algorithm for data. Even then, the most important thing, in my opinion, is to document the logic, the intentions you have as a programmer, in the source code, in descriptive comments: not what the code does, but why.

    That way, when the purpose of the code changes, you or the future developer can check the code, see the assumptions made and their limitations, and adjust the code accordingly. Even things like cache architecture developments may change the balance of which algorithm turns out most efficient in practice, so knowing and documenting those reasons, is very useful in long term.
    thames likes this.

Page 1 of 3 123 LastLast
Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Quicksort help!
    By zeronero in forum C Programming
    Replies: 16
    Last Post: 02-18-2012, 01:06 AM
  2. Help with quicksort
    By c++prog in forum C++ Programming
    Replies: 12
    Last Post: 11-24-2009, 11:01 PM
  3. quicksort
    By WelshGrandSlam in forum C++ Programming
    Replies: 8
    Last Post: 01-23-2008, 08:15 PM
  4. Quicksort
    By |deep| in forum C++ Programming
    Replies: 1
    Last Post: 07-21-2002, 08:51 AM
  5. quicksort
    By Unregistered in forum C++ Programming
    Replies: 2
    Last Post: 01-12-2002, 02:07 AM

Tags for this Thread


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21