# Thread: insertion sort vs shell sort

1. insertion sort vs shell sort

I know the difference between insertion sort and shell sort. Insertion sort compares each item with the elements before it in the list, whereas shell sort uses a gap to compare items far apart from each other, and then progressively shrinks the gap until all elements are sorted. Obviously, shell sort is more effective when the array items are shuffled far from their original order. However, when we have an array which is almost sorted, it seems the insertion sort algorithm will be more efficient. So if I want to use insertion sort or shell sort depending on the situation, do I need to implement my own insertion sort and shell sort like this:
Code:
```
void shellsort(int v[], int n)
{
    int gap, i, j, temp;

    for (gap = n / 2; gap > 0; gap /= 2)
        for (i = gap; i < n; i++)
            for (j = i - gap; j >= 0 && v[j] > v[j + gap]; j -= gap) {
                temp = v[j];
                v[j] = v[j + gap];
                v[j + gap] = temp;
            }
}

static void insertion_sort(int *a, const size_t n)
{
    size_t i, j;
    int value;

    for (i = 1; i < n; i++) {
        value = a[i];
        for (j = i; j > 0 && value < a[j - 1]; j--) {
            a[j] = a[j - 1];
        }
        a[j] = value;
    }
}
```
Or does C already have these functions, independent from each other, in stdio.h or stdlib.h that I can use at will? 2. Originally Posted by johnmerlino I know the difference between insertion sort and shell sort. ...Or does C already have these functions independent from each other in stdio.h or stdlib.h that I can use at will?
You are probably looking for qsort(), the generic sort supplied by the standard library (declared in stdlib.h; see the qsort(3) Linux man page).

It is not specified what algorithm it uses, so in theory it could use insertion sort, but I doubt it. It probably uses a randomized quicksort for large enough arrays. 3. However, when we have an array which is almost sorted, it seems the insertion sort algorithm will be more efficient.
O_o

If the array is small, Bubble Sort or Selection Sort may be faster.

If you want to optimize a sort for specific scenarios, you'll need more information than "almost sorted".

Soma 4. Originally Posted by johnmerlino an array which is almost sorted
If the array is large enough, then a "natural" bottom up merge sort would be fast, but it uses a second array for temporary storage, and could require a final copy array step if the result has to be in the original array (as opposed to the second array). 5. If the array is large enough, then a "natural" bottom up merge sort would be fast, but it uses a second array for temporary storage, and could require a final copy array step if the result has to be in the original array.
O_o

Another good example of "almost sorted" being insufficient information.

Soma 6. Originally Posted by rcgldr If the array is large enough, then a "natural" bottom up merge sort would be fast, but it uses a second array for temporary storage, and could require a final copy array step if the result has to be in the original array (as opposed to the second array). Originally Posted by phantomotap Another good example of "almost sorted" being insufficient information.
If a stable sort (preserving the order of "equal" elements) is not required, then a natural bottom up merge sort can rely on compares to determine group boundaries, which has the side effect of merging groups that just happen to end up in order either in the initial state or during the sort, taking advantage of any significant amount of preordering in an array. 7. Originally Posted by rcgldr If the array is large enough, then a "natural" bottom up merge sort would be fast, but it uses a second array for temporary storage, and could require a final copy array step if the result has to be in the original array (as opposed to the second array).
Actually, johnmerlino is right. Insertion sort is the fastest algorithm for an almost sorted array (depending on your exact definition of "almost").

It doesn't even depend on how big or small the array is either. If the array is actually sorted, then for a good implementation the amount of work it does is equal to the amount of work required to determine if the array is sorted. Not just in terms of big-oh notation either, but in terms of actual CPU instructions executed.

Some other useful tips:
1. Shell Sort, as per how Donald Shell wrote it (with gap sizes that halve each time), actually uses a very poor choice of gaps. The Incerpi-Sedgewick gap-table variation is far superior.

2. You can simply write your insertion sort such that instead of using ++, --, -1, +1, or 1, you use += k, -= k, -k, +k, and k, where k is another parameter. Then if you call it with k==1 you get a regular insertion sort, or if you call it from within a Shell Sort skeleton function, you get a Shell Sort. I.e. the same code does both.

3. Shell Sort is, as you know, doing insertions, i.e. rather than inserting via repeated swapping, just do it via insertion as you would with Insertion Sort.

Since I'm feeling generous, here's a version of some of the C++ code I have, converted to C, to illustrate the idea:
Code:
```
void ShellSortAux(int a[], int n, int k) {
    for (int i = k; i < n; ++i) {
        int j = i - k;
        if (a[i] < a[j]) {
            int tempItem = a[i];
            do {
                a[j + k] = a[j];
                j -= k;
            } while ((j >= 0) && (tempItem < a[j]));
            a[j + k] = tempItem;
        }
    }
}

//Donald Shell 1959
void OriginalShellSort(int a[], int n) {
    for (int m = n/2; m > 0; m /= 2)
        ShellSortAux(a, n, m);
}

//Incerpi-Sedgewick 1985
void ShellSortIncerpiSedgewick(int a[], int n) {
    const int incs[] = {
        1, 3, 7, 21, 48, 112,
        336, 861, 1968, 4592, 13776,
        33936, 86961, 198768, 463792, 1391376,
        3402672, 8382192, 21479367, 49095696, 114556624,
        343669872, 852913488, 2085837936};
    for (int l = sizeof(incs)/sizeof(incs[0]); l > 0;)
        ShellSortAux(a, n, incs[--l]);
}
```
You'll find this a lot quicker than what you have, and as a result, you may find you don't care so much about detecting the almost-sorted case any more. 8. It doesn't even depend on how big or small the array is either.
O_o

Actually, the big/small condition is important, but not actually in terms of elements so much as size in bytes. If the size of the array fits entirely in cache and the "collation error" is sufficiently small, Bubble Sort may be faster than Insertion Sort, Shell Sort, and similar. (For example, I use Bubble Sort for error recovery in packed transactions when configured under the assumption of a secure network. The odds of a "collation error" greater than one mark it as practically impossible, and the transaction fits in cache by definition.) If those conditions aren't met, reach for a different algorithm.

To be clear, I have indeed performed several profiling passes over a multitude of loads, and I have also--nearly to the point of exhaustion--tested the situation by thoroughly debugging the underlying code.

O_o

Which brings to mind: If you are honestly considering implementing multiple sorts for the same situation, do yourself a favor and don't guess. If you are compelled to guess, such as from not knowing the situation, use hybrids that combine/decay across different situations as a natural consequence of the algorithms.

Soma 9. Originally Posted by phantomotap O_o

Actually, the big/small condition is important, but not actually in terms of elements so much as size in bytes. If the size of the array fits entirely in cache and the "collation error" is sufficiently small, Bubble Sort may be faster than Insertion Sort, Shell Sort, and similar. (For example, I use Bubble Sort for error recovery in packed transactions when configured under the assumption of a secure network. The odds of a "collation error" greater than one mark it as practically impossible, and the transaction fits in cache by definition.) If those conditions aren't met, reach for a different algorithm.
That only holds true for the general case of more-or-less randomly ordered items. It is well known that Insertion Sort is the fastest for almost sorted data. There is plenty of published information out there that confirms this, and indeed is likely how the OP came to know this.
Certainly a very valid point you likely have in mind is that you most often don't know the initial ordering of the data, and as such it isn't often useful in practice to try and take advantage of this. Checking the initial ordering first significantly worsens the average case. When you don't know the ordering, Bubble Sort certainly may be better in the average case. That point may well be the most important thing that needs to be conveyed in this thread, but it does not make my statements false.

To be perfectly clear, no matter the size of the array, when the array is exactly sorted, an optimised Bubble Sort, with either a done flag, or a float-limit position, does the exact same amount of work as an Insertion Sort containing the "do I even need to move this item at all" check (as I have included above).
Both do N-1 item comparisons between the exact same pairs of items (so identical caching behaviour), and both do no item copies/swaps/moves. That is the theory side of things. Upon actual execution, any time I've tried them in the past, there was negligible difference in such a case, if any, for very large arrays.
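For concreteness, here is a minimal sketch of the kind of optimised Bubble Sort being described (a done flag combined with a last-swap limit; the function name is mine, not standard). On an already-sorted array it performs exactly n-1 comparisons and no swaps, matching the guarded Insertion Sort:
Code:
```c
#include <stdbool.h>
#include <stddef.h>

/* Bubble Sort with a "done" flag and a shrinking limit:
   if a pass makes no swaps, the array is sorted and we stop;
   the limit shrinks to the position of the last swap, since
   everything past it is already in its final place. */
static void bubble_sort(int a[], size_t n)
{
    if (n < 2)
        return;
    size_t limit = n - 1;
    bool swapped = true;
    while (swapped && limit > 0) {
        swapped = false;
        size_t last = 0;           /* position of the last swap this pass */
        for (size_t j = 0; j < limit; j++) {
            if (a[j + 1] < a[j]) {
                int t = a[j];
                a[j] = a[j + 1];
                a[j + 1] = t;
                swapped = true;
                last = j;
            }
        }
        limit = last;              /* skip the already-sorted tail */
    }
}
```
On sorted input the single pass sets no swap flag, so the loop exits after n-1 comparisons, which is the behaviour compared against above.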

The difference in execution times comes from what order the array happened to be in to begin with. Having say, just the first few items misplaced at the other end of the array means just a few iterations of the inner loop are required and these loops perform range-moves, not repeated item swaps as in Bubble Sort. That does make a difference when it comes to large numbers of items.
On the other hand, if the arrays started out with every adjacent pair of items switched, then Bubble Sort would no doubt win.
As far as actual execution time is concerned, it really does come down to the initial ordering, and how you define "almost" sorted. 10. Certainly a very valid point you likely have in mind, is that you most often don't know the initial ordering of the data, and as such it isn't often useful in practice to try and take advantage of this.
O_o

Does my point make your statements false? No. Your statements are conditioned upon the situation exactly as the relative performance of algorithms is conditioned upon the situation. Whether you like it or not, the big/small condition is important with such "finely grained tuning" because you may indeed have one of the rare situations where Bubble Sort may have been faster but actually still turns out slower because the number of swaps--which is related to the number of elements between source and destination--is still too high. You've even noted the fact that the cost of repeated swaps grows with the distance of misplaced items, which may only grow as large as the number of elements.

With every comparison of every sorting algorithm, you may choose to look at any one aspect of the situation. With such "finely grained tuning" as I believe to be implied, the one aspect is simply never enough because some other aspect may easily trump your expectations. Look at all the conditional language you've used. You are literally making my point over and over with such language. A choice example: "As far as actual execution time is concerned, it really does come down to the initial ordering, and how you define "almost" sorted.". Yes. The actual execution time is dependent on those aspects of the situation, and despite all of your references to those two situations, those are only two of many aspects of a given situation calling to collate some sort of data.

My point is that such "finely grained tuning" requires far more knowledge about the situation than simply "almost sorted" such as what it means for the data to be "sorted", the number of elements, the cost of comparisons, the cost of copies, the available memory, and a lot of other information. Even if you do know "the initial ordering of the data" sufficiently to make decisions relative to that criteria, that information alone may not be sufficient to gain any real, stable performance.

Your interpretation of my point really isn't far removed from my actual point. I am indeed speaking to the practicality of "finely grained tuning" based on situational expectations. I'm only going a lot further by saying that the practicality of "finely grained tuning" is largely negated by the abundance of different criteria rather than the conditions of any specific criteria.

Soma 11. Originally Posted by iMalc when the array is exactly sorted ... Both do N-1 item comparisons
The same is true of a natural bottom up merge sort. With a merge sort that uses compare to determine the end of sorted runs, if a significant amount of data is approximately in the correct location within runs as opposed to actual ascending (or descending) sequences, then the number of merge passes will be reduced as runs or groups will get automatically concatenated. 12. Originally Posted by phantomotap O_o

Does my point make your statements false? No. Your statements are conditioned upon the situation exactly as the relative performance of algorithms is conditioned upon the situation. Whether you like it or not, the big/small condition is important with such "finely grained tuning" because you may indeed have one of the rare situations where Bubble Sort may have been faster but actually still turns out slower because the number of swaps--which is related to the number of elements between source and destination--is still too high. You've even noted the fact that the cost of repeated swaps grows with the distance of misplaced items, which may only grow as large as the number of elements.

With every comparison of every sorting algorithm, you may choose to look at any one aspect of the situation. With such "finely grained tuning" as I believe to be implied, the one aspect is simply never enough because some other aspect may easily trump your expectations. Look at all the conditional language you've used. You are literally making my point over and over with such language. A choice example: "As far as actual execution time is concerned, it really does come down to the initial ordering, and how you define "almost" sorted.". Yes. The actual execution time is dependent on those aspects of the situation, and despite all of your references to those two situations, those are only two of many aspects of a given situation calling to collate some sort of data.

My point is that such "finely grained tuning" requires far more knowledge about the situation than simply "almost sorted" such as what it means for the data to be "sorted", the number of elements, the cost of comparisons, the cost of copies, the available memory, and a lot of other information. Even if you do know "the initial ordering of the data" sufficiently to make decisions relative to that criteria, that information alone may not be sufficient to gain any real, stable performance.

Your interpretation of my point really isn't far removed from my actual point. I am indeed speaking to the practicality of "finely grained tuning" based on situational expectations. I'm only going a lot further by saying that the practicality of "finely grained tuning" is largely negated by the abundance of different criteria rather than the conditions of any specific criteria.

Soma
Okay, I think I can follow that close enough to check that we are more or less in agreement. 13. Originally Posted by rcgldr The same is true of a natural bottom up merge sort. With a merge sort that uses compare to determine the end of sorted runs, if a significant amount of data is approximately in the correct location within runs as opposed to actual ascending (or descending) sequences, then the number of merge passes will be reduced as runs or groups will get automatically concatenated.
Oh, a natural merge sort, where the lengths of the runs to merge are determined by the lengths of the runs that are naturally present in the data.
Yes sorry, you are in fact correct there. It can certainly be implemented that way.

Natural Merge Sort improves the best case of Merge Sort, but it is definitely at the expense of the average case, and especially the worst case.

Probably the best comparison-based sorting algorithm that adapts to sorted, reverse-sorted, or any other initial order, is Splay Tree Sort. Worst case is O(n log n), as guaranteed by the amortized bounds of splay trees themselves, and best case is O(n). Surprisingly, even orderings with any obvious sortedness to them also produce near O(n) running times.
If the OP is primarily after something that is adaptive in this manner, then it would no doubt be the best option in terms of number of comparisons. The downside is really just in memory usage, where even if a memory pool is used, tree nodes would take up quite a bit more space than a mere int. 14. Originally Posted by iMalc Natural Merge Sort improves the best case of Merge Sort, but it is definitely at the expense of the average case, and especially the worst case.
Depends on how it's implemented. If stability (preserving the original order) is not required, then using compares to determine the end of runs ends up treating multiple runs as a single run (effectively concatenating the runs) if the last record of one run is in sequence with the first record of the next run. For a stable natural merge sort, one implementation is to use an array of counts to keep track of run sizes, to avoid treating separate runs as a single run. In either case, after the initial pass, the minimum run size will be 2, reversing descending sequences in place (as opposed to a copy to the alternate buffer). For random data, the initial pass will do swaps on about 1/2 the data (the other 1/2 will be in order). For the stable natural merge with array of counts, the initial size of that array can have 1/2 the number of elements in the array to be sorted, so there is an overhead, but with each merge pass that count array is halved in size. 15. Originally Posted by rcgldr Depends on how it's implemented. If stability (preserving the original order) is not required, then using compares to determine the end of runs ends up treating multiple runs as a single run (effectively concatenating the runs) if the last record of one run is in sequence with the first record of the next run. For a stable natural merge sort, one implementation is to use an array of counts to keep track of run sizes, to avoid treating separate runs as a single run. In either case, after the initial pass, the minimum run size will be 2, reversing descending sequences in place (as opposed to a copy to the alternate buffer). For random data, the initial pass will do swaps on about 1/2 the data (the other 1/2 will be in order). For the stable natural merge with array of counts, the initial size of that array can have 1/2 the number of elements in the array to be sorted, so there is an overhead, but with each merge pass that count array is halved in size.
Yeah, but the main decrease in efficiency comes from merges having significantly unequal numbers of items to merge.
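To make the run-detection idea discussed above concrete, here is a hedged sketch of a natural bottom-up merge sort that uses compares to find run boundaries (all names are mine; it copies back through a second buffer each pass, so it is a simplification of the variants rcgldr describes, not his exact implementation):
Code:
```c
#include <stdlib.h>
#include <string.h>

/* Length of the ascending run starting at a[lo]; runs that happen to be
   in sequence get concatenated automatically by the comparison. */
static size_t run_len(const int a[], size_t lo, size_t n)
{
    size_t i = lo + 1;
    while (i < n && a[i - 1] <= a[i])
        i++;
    return i - lo;
}

/* Stable merge of a[lo..mid) and a[mid..hi) into out[lo..hi). */
static void merge_runs(const int a[], int out[], size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        out[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid) out[k++] = a[i++];
    while (j < hi)  out[k++] = a[j++];
}

/* Natural bottom-up merge sort: each pass merges adjacent natural runs.
   On sorted input the first run scan covers the whole array, so the cost
   is just n-1 comparisons and no merges.  Returns 0, or -1 on malloc failure. */
static int natural_merge_sort(int a[], size_t n)
{
    if (n < 2)
        return 0;
    int *buf = malloc(n * sizeof *buf);
    if (!buf)
        return -1;

    while (run_len(a, 0, n) != n) {        /* not yet a single run */
        size_t lo = 0;
        while (lo < n) {
            size_t r1 = run_len(a, lo, n);
            size_t mid = lo + r1;
            if (mid == n) {                /* odd run left over: copy through */
                memcpy(buf + lo, a + lo, r1 * sizeof *a);
                break;
            }
            size_t r2 = run_len(a, mid, n);
            merge_runs(a, buf, lo, mid, mid + r2);
            lo = mid + r2;
        }
        memcpy(a, buf, n * sizeof *a);     /* the final copy-back step noted above */
    }
    free(buf);
    return 0;
}
```
Each pass at least halves the number of runs, and the run scan is what lets any significant preordering shorten the sort.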