Thread: Shell sort

  1. #1
    Caution: Wet Floor
    Join Date
    May 2006
    Posts
    55

    Shell sort

    Here's a complete implementation taken (verbatim) from Stroustrup (The C++ Programming Language, 3rd ed., p. 158; comments added).

    `cmp' is a comparison function (defined elsewhere).

    How would you streamline this implementation to include a call to
    qsort()?

    Code:
    void ssort(void* base, size_t n, size_t sz, CFT cmp) {
    for(int gap = n/2; 0<gap; gap /= 2)  // gap sequence is n/2**n
           for(int i = gap; i<n; i++)  // start at the second ``row''
               for(int j=i - gap; 0 <= j; j -= gap) {
    
                         char* b = static_cast<char*>(base);
                         char* pj = b +j*sz;
                         char* pjg = b+(j+gap)*sz;
    
                         if(cmp(pjg,pj))<0 {
                                    for(int k=0; k<sz; k++) {
                                                  char temp=pj[k];
                                                  pj[k] = pjg[k];
                                                  pjg[k] = temp;
                                     }
                         }
               }
     }
    See the wikipedia article for details.

    I was thinking about pulling one column at a time from `b' and then sorting that column with a
    call like qsort(column, ncol, sz, cmp), where `ncol' is the number of elements in that column:

    Code:
    12 18 52 9 8 5 17 8 1 5 30 16 46 7 8
    -->
    
    12 6 18 52 9
    8 13 5 17 8
    1 12 5 30 16
    46 7 8
    How would you select, say, only the green elements from the array, sort the green elements, and move on to the next column?

  2. #2
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    I understand exactly what you're asking. You want to try using a different method of doing the h-sorts of each pass.
    Myself and others have actually already tried many variants of this. You can replace the H-insert sorts with H-bubble, or H-brick sorts, and the resulting algorithm is then H-Bubble-Sort, or H-Brick-Sort. You can find these on my web-page (link in sig).
    Yes you can even try replacing the H-sorts with fancier algorithms like quicksort.

    However, I'll let you in on a little tip:
    H-Insertion Sort (as per ShellSort) turns out to be the faster than absolutely anything else, for this purpose.
    The reason is that Insertion Sort is basically the fastest alorgithm for nearly sorted data. The very nature of shell sort, means that you are always working with nearly sorted subarrays.
    If you replace the H-Sort with Quicksort, you'll actually make it very significantly slower! In fact that's with the best case for quicksort too. If you factor in the potential O(n*n) case for quicksort, then it get's beyond ridiculously slower.

    If you actually want to make shellsort faster instead of slower, then you need to stop using the original gap sequence. That gives pretty Big-Oh notation performance. Use the Incerpi-Sedgewick gap sequence (1, 3, 7, 21, 48 ...) for approximate O(n*logn) running time.
    Last edited by iMalc; 12-17-2007 at 11:57 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  3. #3
    Caution: Wet Floor
    Join Date
    May 2006
    Posts
    55
    Thanks!

    I played around with two different gap sequences (Sedgewick, Pratt) , but didn't bother to generate enough randoms to see an appreciable difference in each.

    Code:
    int[] sedge = {1391376, 463792, 198768, 86961, 33936, 13776, 4592,
                        1968, 861, 336, 112, 48, 21, 7, 3, 1};
    Are there other sequences that converge faster to O(n log n)?

  4. #4
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    There are a huge number of different gap sequences. I came across a web-page that had quite a few once. Others include Steven Pigeon (1, 2, 19, 103, 311), Sedgewick-Knuth (1, 4, 13, 40, 121), and a few other Sedgewick ones (1, 8, 23, 77, 281). Actually, here it is: http://www.research.att.com/~njas/se...o.html#sorting

    I'm pretty sure that the Incerpi-Sedgewick one is currently regarded as the best. It is the one I've used in the implementation on my web-page. I may put others up in future.

    You should notice a huge speed difference between that one and Shell's original one where you halve each time. Try timing sorts of even as few as 10000 random numbers.
    Last edited by iMalc; 12-21-2007 at 01:11 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Straight Insertion Sort function problem
    By StaticKyle in forum C++ Programming
    Replies: 6
    Last Post: 05-12-2008, 04:03 AM
  2. Shell Sort with Vectors Problem
    By TheSpoiledMilk in forum C++ Programming
    Replies: 4
    Last Post: 11-22-2005, 03:05 PM
  3. shell sort
    By axon in forum C++ Programming
    Replies: 1
    Last Post: 04-27-2004, 07:18 PM
  4. Shell, Heap and Quick Sort
    By GaPe in forum C Programming
    Replies: 1
    Last Post: 08-19-2003, 01:04 PM
  5. Shell Sort vs Heap Sort vs Quick Sort
    By mackol in forum C Programming
    Replies: 6
    Last Post: 11-22-2002, 08:05 PM