Thread: A tale of two sorts: classifying an inefficient sorting algorithm

  1. #1
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413

    A tale of two sorts: classifying an inefficient sorting algorithm

    This thread has branched off from wufflz's Bubblesorting Array elements in order to avoid going too far off track for that discussion. It makes reference to Owen Astrachan's Bubble Sort: An Archaeological Algorithmic Analysis as shared by Hodor.

    I'm mostly interested in the rather academic question of whether a group of rather inefficient sorts -- that really shouldn't be so heartily promoted by instructors but appears to be -- should be categorised as a kind of bubble sort, a kind of insertion sort, both, or something else. These sorts are basically O(n**2) sorts that do adjacent swapping of out-of-order items akin to typical bubble sorts, but instead of terminating as soon as no more swaps are done in the inner loop, they just keep looping for about n outer loop iterations in total, similiar to how insertion sorts would consider each item in the list, building a sorted sublist.

    Quote Originally Posted by Hodor
    I have the book open in front of me now (I needed to re-read it based on your feedback) and Knuth does not state that terminating early is an optimization of bubble sort but an intrinsic part of it (pages 106-107). The only algorithm he gives (Fig 15 and Program B, p. 107) is based on the text discussion and has no swaps occurring as the terminating condition. In the section "refinements of the bubble sort" (pp. 109-110) Knuth does not present terminating early as a refinement because he's already said that the algorithm ends when no exchanges occur (discussion on pages 106-107 and Fig. 15). So it's not a refinement or an optimization according to my interpretation of Knuth.
    Hmm... so my next question would be: how on earth did Astrachan arrive at that from Knuth's description? Could it be from the 1st edition, and he sloppily didn't check before citing the 2nd edition?

    I suspect that Knuth might not have altered his description, because Astrachan does provide a lead:
    Despite these earlier publications the algorithm officially enters the ACM algorithm repository as Algorithm 175 [25] in 1963 where it is named Shuttle Sort. Soon thereafter [14] the published algorithm is found to be ``not free from errors'', a gentle way of saying the published code is wrong. There a modified version of the code (that stops early when no swaps are made) is given and the author says that in this form the code was ``studied in this form on the ORDVAC computer, Aberdeen Proving Ground, in 1955.''
    Remember that you mentioned Knuth's "straight insertion sort"? This mention of ACM algorithm 175 jogged my memory. Here's a note from DADS entry for insertion sort:
    Note: Sorting can be done in place by moving the next item into place by repeatedly swapping it with the preceding item until it is in place - a linear search and move combined. This implementation is given in C. J. Shaw and T. N. Trimble, Algorithm 175 Shuttle Sort, CACM, 6(6):312-313, June 1963.
    I've looked up Algorithm 175, and it's quite obvious that there are errors, given a list of numbers N from index 1 to m:
    Code:
    begin integer i, j;
        for i := 1 step 1 until m - 1 do
        begin
            for j := i step -1 until 1 do
            begin
                if N[j] < N[j+i] then go to Test;
                Exchange:
                Temporary := N[j];
                N[j] := N[j+i];
                N[j+i] := Temporary;
            end of j loop;
        Test:
        end of i loop
    end shuttle sort
    Here's Juelich's comment on this algorithm:
    The algorithm as published is not free from errors. The statement
    Code:
    for j := i step -1 until 1 do
    should be replaced by either:
    Code:
    for j := m - 1 step -1 until i do
    or
    Code:
    for j := 1 step 1 until m - i do
    In the former case the process can be visualized as placing the ith smallest element in place on the ith pass; in the latter the ith largest element is put in place on the ith pass.
    The label "Test" should precede the delimiter "end of j loop" rather than the "end of i loop".
    If we choose option 2, we end up with:
    Code:
    begin integer i, j;
        for i := 1 step 1 until m - 1 do
        begin
            for j := 1 step 1 until m - i do
            begin
                if N[j] < N[j+i] then go to Test;
            Exchange:
                Temporary := N[j];
                N[j] := N[j+i];
                N[j+i] := Temporary;
            Test:
            end of j loop;
        end of i loop
    end shuttle sort
    This is equivalent to Astrachan's section 2.1 algorithm, except that the outer loop counts up instead of down, but ends up the same because the inner loop achieves the same iterative sequence by computing m - i instead of directly using the outer loop index. Oh, and Shaw & Trimble use a goto to the end of the loop body whereas Astrachan just inverts the loop condition, but their goto is no more harmful than a continue.

    Juelich proceeds to provide remarks on a variant of this algorithm, as Astrachan mentioned:
    The algorithm can be slightly accelerated by rewriting the body of the procedure:
    Code:
    begin integer i, j, j_max;
    i := m - 1;
    loop:
        j_max := 1;
        for j := 1 step 1 until i do
        begin
        compare:
            if N[j] > N[j + 1] then
            begin Exchange:
                Temporary := N[j];
                N[j] := N[j + 1];jufh
                N[j + 1] := Temporary;
                j_max := j
            end Exchange;
        end of j loop;
        i := j_max;
        if i > 1 then go to loop;
    end shuttle sort
    This is a bubble sort, optimised such that the inner loop loops until the most recent swap location, at the cost of an extra assignment on each swap (which isn't really extra because we would set a swap flag otherwise), with the inner loop repeated until no more swaps occur.

    So, what is ACM Algorithm 175? Besides naming it "shuttle sort", Shaw & Trimble introduced it thus: "this procedure sorts the list of numbers N[1] through N[m] into numeric order, by exchanging out-of-order number pairs."

    To me, the essence of bubble sort involves precisely this: "exchanging out-of-order number pairs" where "pairs" refer to adjacent items, whereas insertion sort involves "repeatedly taking the next item and inserting it into the final data structure in its proper order with respect to items already inserted" (DADS definition).

    Combined with the fact that Juelich names a bubble sort as a variant of Algorithm 175, this sounds like Shaw, Trimble, and Juelich would agree that Algorithm 175 is a bubble sort.

    Yet, this reference was found in the DADS entry for insertion sort, and apparently is equivalent to Knuth's "straight insertion sort" (although it certainly predates it).

    I'll need to have a further think before drawing any conclusions, but for now this is where my research has taken me.
    Last edited by laserlight; 12-11-2019 at 04:59 AM.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Detecting and classifying information with a txt file.
    By GhettoBurger in forum C Programming
    Replies: 4
    Last Post: 07-11-2018, 07:57 AM
  2. Replies: 0
    Last Post: 10-09-2017, 02:45 PM
  3. Classifying characters (was struct:)
    By asteroidv2 in forum C++ Programming
    Replies: 18
    Last Post: 07-16-2006, 04:20 AM
  4. sorting algorithm
    By RazielX in forum C Programming
    Replies: 4
    Last Post: 05-04-2004, 06:20 PM
  5. sorting of sorts....hehe
    By newbie2C++ in forum C++ Programming
    Replies: 2
    Last Post: 11-14-2001, 01:02 AM

Tags for this Thread