Originally Posted by

**rcgldr**
I never made any claims about O(1).

Oh, I just thought that was the point of contention here.

Is this similar to the implementation you're thinking about for linked lists? (This is the C board, not the C++ one, so please don't use C++ here.)

Code:

typedef struct list_st list_t;
struct list_st {
list_t *next;
double key;
};
/* How many parallel lists to maintain.
*/
#define RANKS 30
/* Merge two lists, return root of new list.
*/
static list_t *merge(list_t *left, list_t *right)
{
list_t *root = (list_t *)0, *curr;
if (!left)
return right;
if (!right)
return left;
if (left->key <= right->key) {
curr = root = left;
left = left->next;
} else {
curr = root = right;
right = right->next;
}
while (left && right)
if (left->key <= right->key) {
curr->next = left;
curr = left;
left = left->next;
} else {
curr->next = right;
curr = right;
right = right->next;
}
if (left)
curr->next = left;
else
if (right)
curr->next = right;
else
curr->next = (list_t *)0;
return root;
}
/* Nonrecursive fixed-memory merge sort.
* Returns the root of the sorted list.
*/
list_t *sort(list_t *list)
{
list_t *rank[RANKS] = { (list_t *)0 }; /* All zeros per C rules. */
list_t *curr;
int i;
if (!list)
return (list_t *)0;
while (list) {
curr = list;
list = list->next;
curr->next = (list_t *)0;
if (rank[0]) {
curr = merge(curr, rank[0]);
for (i = 1; i < (int)RANKS - 1 && rank[i]; i++) {
curr = merge(curr, rank[i]);
rank[i-1] = (list_t *)0;
}
rank[i-1] = (list_t *)0;
rank[i] = curr;
} else
rank[0] = curr;
}
/* Find longest existing list. */
i = RANKS - 1;
while (i > 0 && !rank[i])
i--;
/* Merge down. */
for (; i > 0; i--)
rank[i-1] = merge(rank[i-1], rank[i]);
/* Done. */
return rank[0];
}

Having a smaller *RANKS* will not limit the input list length, only a slow down when the input list length exceeds 2^{RANKS} elements.

Since each node must have a pointer at least, it makes no sense to use *RANKS* greater than 30 for 32-bit architectures, or 61 for 64-bit architectures.

Technically, the *rank* array contains sublists of doubling lengths. The length of the original list is not needed nor useful. Recursion is replaced by the usage of the *rank* array.

The list is chopped up into one-element-long sublists, and merged into exponentially longer lists as needed.

I can see the performance benefit compared to a typical top-down approach: this avoids the traversal of the original list to find the node at which to split, because *every* node is split from the original list. The work done to merge the sublists is the same in both cases.

One can also split the original list on input into fixed-length sorted sublists, and stick those into *rank*[0] instead. That seems to help with shorter lists.

Longer lists seem to be memory access limited (poor cache locality); sorting a temporary array (containing the key and the pointer to the actual node) and rechaining the nodes using the temporary array should yield much better cache locality, and therefore faster run times -- but I have not verified this.