# Thread: Shell sort

1. ## Shell sort

Here's a complete implementation taken (verbatim) from Stroustrup (The C++ Programming Language, 3rd ed., p. 158; comments added).

`cmp' is a comparison function (defined elsewhere).

How would you streamline this implementation to include a call to
qsort()?

Code:
```void ssort(void* base, size_t n, size_t sz, CFT cmp) {
for(int gap = n/2; 0<gap; gap /= 2)  // gap sequence is n/2**n
for(int i = gap; i<n; i++)  // start at the second ``row''
for(int j=i - gap; 0 <= j; j -= gap) {

char* b = static_cast<char*>(base);
char* pj = b +j*sz;
char* pjg = b+(j+gap)*sz;

if(cmp(pjg,pj))<0 {
for(int k=0; k<sz; k++) {
char temp=pj[k];
pj[k] = pjg[k];
pjg[k] = temp;
}
}
}
}```
See the wikipedia article for details.

I was thinking about pulling one column at a time from `b' and then sorting that column with a
call like qsort(column, ncol, sz, cmp), where `ncol' is the number of elements in that column:

Code:
```12 18 52 9 8 5 17 8 1 5 30 16 46 7 8
-->

12 6 18 52 9
8 13 5 17 8
1 12 5 30 16
46 7 8```
How would you select, say, only the green elements from the array, sort the green elements, and move on to the next column?

2. I understand exactly what you're asking. You want to try using a different method of doing the h-sorts of each pass.
Myself and others have actually already tried many variants of this. You can replace the H-insert sorts with H-bubble, or H-brick sorts, and the resulting algorithm is then H-Bubble-Sort, or H-Brick-Sort. You can find these on my web-page (link in sig).
Yes you can even try replacing the H-sorts with fancier algorithms like quicksort.

However, I'll let you in on a little tip:
H-Insertion Sort (as per ShellSort) turns out to be the faster than absolutely anything else, for this purpose.
The reason is that Insertion Sort is basically the fastest alorgithm for nearly sorted data. The very nature of shell sort, means that you are always working with nearly sorted subarrays.
If you replace the H-Sort with Quicksort, you'll actually make it very significantly slower! In fact that's with the best case for quicksort too. If you factor in the potential O(n*n) case for quicksort, then it get's beyond ridiculously slower.

If you actually want to make shellsort faster instead of slower, then you need to stop using the original gap sequence. That gives pretty Big-Oh notation performance. Use the Incerpi-Sedgewick gap sequence (1, 3, 7, 21, 48 ...) for approximate O(n*logn) running time.

3. Thanks!

I played around with two different gap sequences (Sedgewick, Pratt) , but didn't bother to generate enough randoms to see an appreciable difference in each.

Code:
```int[] sedge = {1391376, 463792, 198768, 86961, 33936, 13776, 4592,
1968, 861, 336, 112, 48, 21, 7, 3, 1};```
Are there other sequences that converge faster to O(n log n)?

4. There are a huge number of different gap sequences. I came across a web-page that had quite a few once. Others include Steven Pigeon (1, 2, 19, 103, 311), Sedgewick-Knuth (1, 4, 13, 40, 121), and a few other Sedgewick ones (1, 8, 23, 77, 281). Actually, here it is: http://www.research.att.com/~njas/se...o.html#sorting

I'm pretty sure that the Incerpi-Sedgewick one is currently regarded as the best. It is the one I've used in the implementation on my web-page. I may put others up in future.

You should notice a huge speed difference between that one and Shell's original one where you halve each time. Try timing sorts of even as few as 10000 random numbers.

Popular pages Recent additions