# Thread: A tale of two sorts: classifying an inefficient sorting algorithm

1. ## A tale of two sorts: classifying an inefficient sorting algorithm

This thread has branched off from wufflz's Bubblesorting Array elements in order to avoid going too far off track for that discussion. It makes reference to Owen Astrachan's Bubble Sort: An Archaeological Algorithmic Analysis as shared by Hodor.

I'm mostly interested in the rather academic question of whether a group of rather inefficient sorts -- that really shouldn't be so heartily promoted by instructors but appears to be -- should be categorised as a kind of bubble sort, a kind of insertion sort, both, or something else. These sorts are basically O(n**2) sorts that do adjacent swapping of out-of-order items akin to typical bubble sorts, but instead of terminating as soon as no more swaps are done in the inner loop, they just keep looping for about n outer loop iterations in total, similiar to how insertion sorts would consider each item in the list, building a sorted sublist.

Originally Posted by Hodor
I have the book open in front of me now (I needed to re-read it based on your feedback) and Knuth does not state that terminating early is an optimization of bubble sort but an intrinsic part of it (pages 106-107). The only algorithm he gives (Fig 15 and Program B, p. 107) is based on the text discussion and has no swaps occurring as the terminating condition. In the section "refinements of the bubble sort" (pp. 109-110) Knuth does not present terminating early as a refinement because he's already said that the algorithm ends when no exchanges occur (discussion on pages 106-107 and Fig. 15). So it's not a refinement or an optimization according to my interpretation of Knuth.
Hmm... so my next question would be: how on earth did Astrachan arrive at that from Knuth's description? Could it be from the 1st edition, and he sloppily didn't check before citing the 2nd edition?

I suspect that Knuth might not have altered his description, because Astrachan does provide a lead:
Despite these earlier publications the algorithm officially enters the ACM algorithm repository as Algorithm 175 [25] in 1963 where it is named Shuttle Sort. Soon thereafter [14] the published algorithm is found to be ``not free from errors'', a gentle way of saying the published code is wrong. There a modified version of the code (that stops early when no swaps are made) is given and the author says that in this form the code was ``studied in this form on the ORDVAC computer, Aberdeen Proving Ground, in 1955.''
Remember that you mentioned Knuth's "straight insertion sort"? This mention of ACM algorithm 175 jogged my memory. Here's a note from DADS entry for insertion sort:
Note: Sorting can be done in place by moving the next item into place by repeatedly swapping it with the preceding item until it is in place - a linear search and move combined. This implementation is given in C. J. Shaw and T. N. Trimble, Algorithm 175 Shuttle Sort, CACM, 6(6):312-313, June 1963.
I've looked up Algorithm 175, and it's quite obvious that there are errors, given a list of numbers N from index 1 to m:
Code:
```begin integer i, j;
for i := 1 step 1 until m - 1 do
begin
for j := i step -1 until 1 do
begin
if N[j] < N[j+i] then go to Test;
Exchange:
Temporary := N[j];
N[j] := N[j+i];
N[j+i] := Temporary;
end of j loop;
Test:
end of i loop
end shuttle sort```
Here's Juelich's comment on this algorithm:
The algorithm as published is not free from errors. The statement
Code:
`for j := i step -1 until 1 do`
should be replaced by either:
Code:
`for j := m - 1 step -1 until i do`
or
Code:
`for j := 1 step 1 until m - i do`
In the former case the process can be visualized as placing the ith smallest element in place on the ith pass; in the latter the ith largest element is put in place on the ith pass.
The label "Test" should precede the delimiter "end of j loop" rather than the "end of i loop".
If we choose option 2, we end up with:
Code:
```begin integer i, j;
for i := 1 step 1 until m - 1 do
begin
for j := 1 step 1 until m - i do
begin
if N[j] < N[j+i] then go to Test;
Exchange:
Temporary := N[j];
N[j] := N[j+i];
N[j+i] := Temporary;
Test:
end of j loop;
end of i loop
end shuttle sort```
This is equivalent to Astrachan's section 2.1 algorithm, except that the outer loop counts up instead of down, but ends up the same because the inner loop achieves the same iterative sequence by computing m - i instead of directly using the outer loop index. Oh, and Shaw & Trimble use a goto to the end of the loop body whereas Astrachan just inverts the loop condition, but their goto is no more harmful than a continue.

Juelich proceeds to provide remarks on a variant of this algorithm, as Astrachan mentioned:
The algorithm can be slightly accelerated by rewriting the body of the procedure:
Code:
```begin integer i, j, j_max;
i := m - 1;
loop:
j_max := 1;
for j := 1 step 1 until i do
begin
compare:
if N[j] > N[j + 1] then
begin Exchange:
Temporary := N[j];
N[j] := N[j + 1];jufh
N[j + 1] := Temporary;
j_max := j
end Exchange;
end of j loop;
i := j_max;
if i > 1 then go to loop;
end shuttle sort```
This is a bubble sort, optimised such that the inner loop loops until the most recent swap location, at the cost of an extra assignment on each swap (which isn't really extra because we would set a swap flag otherwise), with the inner loop repeated until no more swaps occur.

So, what is ACM Algorithm 175? Besides naming it "shuttle sort", Shaw & Trimble introduced it thus: "this procedure sorts the list of numbers N[1] through N[m] into numeric order, by exchanging out-of-order number pairs."

To me, the essence of bubble sort involves precisely this: "exchanging out-of-order number pairs" where "pairs" refer to adjacent items, whereas insertion sort involves "repeatedly taking the next item and inserting it into the final data structure in its proper order with respect to items already inserted" (DADS definition).

Combined with the fact that Juelich names a bubble sort as a variant of Algorithm 175, this sounds like Shaw, Trimble, and Juelich would agree that Algorithm 175 is a bubble sort.

Yet, this reference was found in the DADS entry for insertion sort, and apparently is equivalent to Knuth's "straight insertion sort" (although it certainly predates it).

I'll need to have a further think before drawing any conclusions, but for now this is where my research has taken me.