Agreed on the shift rather than swap. I take it one step further and don't do either if the first comparison indicates it wont need to move at all.

Here's the one off my website. This is using one of the best, if not **the** best, known gap sequences:
Code:

template <class T>
void ShellSort(T a[], int n) {
//Incerpj-Sedgewick 1985
const int incs[] = {
1, 3, 7, 21, 48, 112,
336, 861, 1968, 4592, 13776,
33936, 86961, 198768, 463792, 1391376,
3402672, 8382192, 21479367, 49095696, 114556624,
343669872, 52913488, 2085837936};
for (int l = sizeof(incs)/sizeof(incs[0]); l > 0;) {
const int m = incs[--l];
for (int i = m; i < n; ++i) {
int j = i - m;
if (a[i] < a[j]) { //check if the item needs inserting first (eliminates redundant copying)
const T tempItem = a[i];
do {
a[j+m] = a[j];
j-=m;
} while ((j >= 0) && (tempItem < a[j]));
a[j+m] = tempItem;
}
}
}
}

Well it's C++, but I'm sure that doesn't really change anything here.

The extra 'if' makes it faster for when the type being sorted isn't as small or trivial as an int.

I find that it's much easier to understand if you compare it to an insertion sort implementation.

Here's mine which is obviously implemented in the same style:
Code:

template <class T>
void InsertionSort(T a[], int n) {
for (int i = 1; i < n; ++i) {
//check if the item needs inserting first (eliminates redundant copying)
if (a[i] < a[i - 1]) {
int j = i - 1;
const T tempItem = a[i];
//move items along while looking for the correct place
do {
a[j + 1] = a[j];
--j;
} while ((j >= 0) && (tempItem < a[j]));
//copy item into position
a[j + 1] = tempItem;
}
}
}

If you can understand the later then you should be able to understand the former.