# I'm dreaming of making insertion sort O(nlogn)

This is a discussion on I'm dreaming of making insertion sort O(nlogn) within the C++ Programming forums, part of the General Programming Boards category.

1. Originally Posted by RichSelian
The std::list<> also slow down the speed because we need singly linked list but it provides doubly linked list. I use a vector to emulate the list and get about 20% time improvement.
That doesn't make sense. If you needed a singly linked list, a doubly linked list provides that by definition without a performance cost. It just means processing in one direction.

2. Okay, my more thorough analysis, since I found the time:

As I suspected, this is using the same concept as OrderSort, but implemented less efficiently in a few ways. The main difference is that rather than dividing up the search for the insertion point into two approximately sqrt(n) search phases, it divides the search for the insertion point into log(n) and n/log(n) search phases.
The log(n) time is spent in the binary search in the first while loop, but crucially the remaining n/log(n) time is spent in the last while loop. So the last while loop becomes the bottleneck here.

I deduced the algorithm to be O(n^2/log(n)) worst case by examination.
Then I confirmed this empirically by examining the comparison counts for the worst case that I found and mentioned earlier. After including the comparisons inside the min_element and max_element calls, the results lined up almost perfectly with my predicted number of comparisons for various n. Thus I can say without doubt that it is O(n^2/log(n)).

OrderSort is O(n^1.5) worst case, though with some luck in its granular search phase it typically performs at around O(n^1.41). In your case, having good luck in the second while loop reduces the sort time even more substantially and will sometimes bring it down towards O(n*log(n)). Your algorithm can therefore often win over OrderSort in the randomly ordered case, though it loses badly in, say, the sorted case. Quite simply, OrderSort is more consistent.

The authors of OrderSort recognised that to strike the best balance between the coarse and the granular searches, the optimal number of searches was sqrt(n) for each. If one wants to divide n by some number k, and one wishes to minimise n/k + k, then the optimal solution is for k to equal sqrt(n). Thus they pay for a fixed sqrt(n) steps initially, and then gamble on the remaining sqrt(n) steps. You have chosen a k of log(n). This means that you pay for exactly log(n) search steps initially, and gamble on the remaining n/log(n) steps.
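In symbols, that trade-off is a one-line minimisation (my sketch, using f for the total step count):

```latex
f(k) = \frac{n}{k} + k, \qquad
f'(k) = 1 - \frac{n}{k^{2}} = 0 \;\Longrightarrow\; k = \sqrt{n}, \qquad
f(\sqrt{n}) = 2\sqrt{n}.
```

With k = log(n) instead, f(log n) = n/log(n) + log(n), which is dominated by the n/log(n) term.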

Your algorithm also bears some similarity to LibrarySort. This algorithm and many more are shown on my website.

3. I got different results. I had [...] that it performed a stable sort.
How are you confirming a stable sort?

I did modify the code to [...] area just suck.
I didn't make any changes to his code. The only change I needed to make was in coding a wrapper to transform a simple array into a `std::vector<???>' before calling his function.

What data type are you sorting?
My benchmark suite uses an `octet' type. The sample below uses a simple `pair'.

I try to sort 100,000 random [...] wrong answers nor exceptions.
I'm using only nine elements to force a bad sort and raise an exception.

I didn't check the stability because the test script is difficult to write.
Nope. It is easy. You're just lazy.

Or maybe there are bugs hiding in the code, but at least the theory is correct.
From what I've seen, the algorithm is sound; it just doesn't represent a stable algorithm with O(n * log(n)) worst-case time complexity.

By the way, could you provide your test script for me?
I can provide a test. My benchmark suite is simply too complex to post.

I'll provide you some source that forces a bad sort and raises an exception, but I'm not going to provide a testbed for you. That's your job. I'm only posting the source I'm posting because you have some misconceptions about your source.

Code:
```
!got:
    (179,0) (179,1) (179,2) (9172,3) (9172,4) (9172,5) (4334,6) (4334,7) (4334,8)
    (179,2) (179,1) (179,0) (4334,7) (4334,8) (4334,6) (9172,4) (9172,5) (9172,3)
~got:
!expected:
    (179,0) (179,1) (179,2) (9172,3) (9172,4) (9172,5) (4334,6) (4334,7) (4334,8)
    (179,0) (179,1) (179,2) (4334,6) (4334,7) (4334,8) (9172,3) (9172,4) (9172,5)
~expected:
!trying:
    (15897,0) (15897,1) (15897,2) (580,3) (580,4) (580,5) (1360,6) (1360,7) (1360,8)
    (epic, fail)
~trying:
```
These are the results my farm logged. (The results were always the same.)

I've included your posted source here; I assure you that it is unmodified.

Soma

Code:
```
// #############################################################################
// # FileName:  A-Bookmark.cpp
// # Author:    Zhang Li, Class CS0901
//              Huazhong University of Science & Technology
// #############################################################################
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <list>
#include <algorithm>

using namespace std;

// # sorting algorithm begin
// #############################################################################
template<typename DataType> class BookmarkSorter
{
public:
    static void sort(vector<DataType>& elements)
    {
        //
        // elementList is the linked list
        // bookmarks is an array whose elements are pointers to nodes of the linked list
        //  the data structure costs O(n) space complexity
        //
        // we will first perform a binary search on bookmarks, find the position for insertion (costs O(nlogn))
        //  the total time cost of this part is O(nlogn)
        //
        // then we will insert the data into the linked list (costs O(1))
        //  the total time cost of this part is (n * O(1)) == O(n)
        //
        // when we have inserted a certain amount of data into the linked list,
        //  the bookmark can no longer represent the linked list,
        //  and the time complexity of inserting data will increase to O(n)
        //
        // so it is time to "update" the bookmark. It is very easy because we just simply save the pointers to
        //  each node of the linked list.
        //  this happens when the size of the linked list is doubled (when we have inserted 1, 2, 4, 8... elements)
        //  the total time cost of this part is (1 + 2 + 4 + ... + n) == O(n)
        //
        //  the total time complexity of this algorithm is O(nlogn)
        //

        list<DataType> elementList;
        vector<typename list<DataType>::iterator> bookmarks;
        typename list<DataType>::iterator scanIterator, insIterator;

        if(elements.size() > 1)
        {
            // we insert the minimum and maximum data first to reduce the problem in binary searching
            swap(*min_element(elements.begin(), elements.end()), elements.at(0));
            swap(*max_element(elements.begin(), elements.end()), elements.at(1));

            elementList.push_back(elements.at(0));
            elementList.push_back(elements.at(1));

            // insert other data
            for(int scanPos = 2; scanPos < elements.size(); scanPos++)
            {
                // check whether the bookmark needs to update
                if(scanPos >= bookmarks.size() * 2)
                {
                    bookmarks.clear();
                    for(scanIterator = elementList.begin(); scanIterator != elementList.end(); scanIterator++)
                        bookmarks.push_back(scanIterator);
                }

                // perform a binary search to find the insert position
                int balanceNum = 0, middle, left = 0, right = bookmarks.size() - 1;
                while(left <= right)
                {
                    middle = (left + right) / 2;
                    elements.at(scanPos) < *bookmarks.at(middle) ? right = middle - 1 : left = middle + 1;
                }
                insIterator = bookmarks.at(right);

                // take a linear search to find the real insert position in the list, costs only O(1)
                while(elements.at(scanPos) > *insIterator)
                {
                    insIterator++;
                    // I added the following two lines to auto-balance the bookmark
                    // so the worst time complexity is expected to be O(nlogn), but I cannot prove it
                    if(bookmarks.at(right) != elementList.begin() && !(balanceNum++) & 1 == 1)
                        bookmarks.at(right)++;
                }

                // now insert data
                elementList.insert(insIterator, elements.at(scanPos));
            }

            elements = vector<DataType>(elementList.begin(), elementList.end());
        }
        return;
    }
};
// #############################################################################
// # sorting algorithm end

#include <iostream>
#include <iterator>

class tester
{
public:
    tester():
        x_m(0),
        y_m(0)
    {
    }
    tester
    (
        int x_f,
        int y_f
    ):
        x_m(x_f),
        y_m(y_f)
    {
    }
    tester
    (
        const tester & rhs
    ):
        x_m(rhs.x_m),
        y_m(rhs.y_m)
    {
    }
    ~tester()
    {
    }
    tester & operator =
    (
        const tester & rhs
    )
    {
        x_m = rhs.x_m;
        y_m = rhs.y_m;
        return(*this);
    }
public:
    int x_m;
    int y_m;
};

bool operator >
(
    const tester & lhs,
    const tester & rhs
)
{
    return(lhs.x_m > rhs.x_m);
}

bool operator <
(
    const tester & lhs,
    const tester & rhs
)
{
    return(lhs.x_m < rhs.x_m);
}

std::ostream & operator <<
(
    std::ostream & lhs,
    const tester & rhs
)
{
    lhs << '(' << rhs.x_m << ',' << rhs.y_m << ')';
    return(lhs);
}

void test1
(
    const std::vector<tester> & data
)
{
    using namespace std;
    cout << "!got:\n";
    {
        vector<tester> acopy(data);
        cout << '\t';
        copy(acopy.begin(), acopy.end(), ostream_iterator<tester>(cout, " "));
        cout << '\n';
        BookmarkSorter<tester>::sort(acopy);
        cout << '\t';
        copy(acopy.begin(), acopy.end(), ostream_iterator<tester>(cout, " "));
        cout << '\n';
    }
    cout << "~got:\n";
    cout << "!expected:\n";
    {
        vector<tester> acopy(data);
        cout << '\t';
        copy(acopy.begin(), acopy.end(), ostream_iterator<tester>(cout, " "));
        cout << '\n';
        stable_sort(acopy.begin(), acopy.end());
        cout << '\t';
        copy(acopy.begin(), acopy.end(), ostream_iterator<tester>(cout, " "));
        cout << '\n';
    }
    cout << "~expected:\n";
}

void test2
(
    const std::vector<tester> & data
)
{
    using namespace std;
    cout << "!trying:\n";
    vector<tester> acopy(data);
    cout << '\t';
    copy(acopy.begin(), acopy.end(), ostream_iterator<tester>(cout, " "));
    cout << '\n';
    try
    {
        BookmarkSorter<tester>::sort(acopy);
        cout << '\t';
        copy(acopy.begin(), acopy.end(), ostream_iterator<tester>(cout, " "));
    }
    catch(...)
    {
        cout << "\t(epic, fail)";
    }
    cout << '\n';
    cout << "~trying:\n";
}

int main()
{
    using namespace std;
    const unsigned int size1(9);
    const unsigned int size2(9);
    tester data1[size1] = {tester(179, 0), tester(179, 1), tester(179, 2), tester(9172, 3), tester(9172, 4), tester(9172, 5), tester(4334, 6), tester(4334, 7), tester(4334, 8)};
    tester data2[size2] = {tester(15897, 0), tester(15897, 1), tester(15897, 2), tester(580, 3), tester(580, 4), tester(580, 5), tester(1360, 6), tester(1360, 7), tester(1360, 8)};
    test1(vector<tester>(data1, data1 + size1));
    test2(vector<tester>(data2, data2 + size2));
    return(0);
}
```

4. it divides the search for the insertion point into log(n) and n/log(n) search phases.
No, it doesn't. The insertion point is divided into 1, 2, 4, 8 ... parts.

In your case, having good luck in the second while loop means reducing the sort time even more substantially and will sometimes bring it down towards (n*log(n)).
It's not luck. It is expected to be O(1) for random data.

I'll go to your website to have a look at order sort...

5. No, it doesn't. The insertion point is divided into 1, 2, 4, 8 ... parts.
Just like what library sort does. But I don't want to talk about library sort. It is a monster to implement.

6. Well, I don't see that my algorithm has anything to do with order sort. There isn't any "sqrt" in my algorithm. It is really O(nlogn) for random data.
By the way, order sort performs poorly in my real-time test (I don't know why).

Code:
```
Sorting 50000 random double data
================ Test Result ================
Size Filename          Time      Space
2509 A-Quick.cpp       0.088s     727k
3029 A-Bookmark.cpp    0.100s    1597k
2450 A-Merge.cpp       0.112s     803k
2464 B-Pool.cpp        0.196s     912k
2619 B-Strand.cpp      0.428s     731k
1603 S-Insertion.cpp   2.100s     724k
2909 OrderSort.cpp     2.728s     784k
```

7. Code:
```
// perform a binary search to find the insert position
int balanceNum = 0, middle, left = 0, right = bookmarks.size() - 1;
while(left <= right)
{
    middle = (left + right) / 2;
    elements.at(scanPos) < *bookmarks.at(middle) ? right = middle - 1 : left = middle + 1;
}
insIterator = bookmarks.at(right);
```
phantomotap is right. This binary search can end up setting right to -1.

Congratulations on using at(), though.

8. Originally Posted by anon
Congratulations on using at(), though.
Yes, it's often good to use .at(), though I'd consider it a little overused here.

Phantomotap, I love your "(epic, fail)" output! Thanks for showing that my tests must be grossly inadequate. I suspected as much and will rectify this asap. I tested stability the same way you did: with data containing many duplicates and an extra field that is initially in sorted order.

it divides the search for the insertion point into log(n) and n/log(n) search phases.
No, it doesn't. The insertion point is divided into 1, 2, 4, 8 ... parts.
I'm not talking about just the binary search. The binary search clearly takes exactly log(n) steps, but what you fail to realise is that the next while loop can take far longer than that, up to n/log(n) steps in fact. Try putting a counter in each while loop, and then output the counts afterwards. Start with an array that is sorted, except that the halves are switched. E.g. it would look like this:
Code:
```
     _
   _| |
 _| | |
| | | |    _
| | | |  _| |
| | | |_| | |
|4|5|6|1|2|3|
```
In your case, having good luck in the second while loop means reducing the sort time even more substantially and will sometimes bring it down towards (n*log(n)).
It's not luck. It is expected to be O(1) for random data.
Then your expectations are simply wrong and don't agree with reality. When sorting 512 items, near the end it sometimes looped over 60 times through the while loop for a single iteration of the outer loop. n/log(n) is 73, and it reached about that at the worst. That is not constant time!

9. Thanks to phantomotap. I didn't notice that duplicates of the maximum or minimum data would make the algorithm throw std::out_of_range. I have fixed this bug and I believe I have made it stable (it has passed your test). This is the fixed algorithm.
Code:
```
// #############################################################################
// # FileName:  A-Bookmark.cpp
// # Author:    Zhang Li, CS0901, HUST
// #############################################################################
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <list>
#include <algorithm>

using namespace std;

// # sorting algorithm begin
// #############################################################################
template<typename DataType> class BookmarkSorter
{
public:
    static void sort(vector<DataType>& elements)
    {
        list<DataType> elementList;
        vector<typename list<DataType>::iterator> bookmarks;
        typename list<DataType>::iterator i;
        int t = 0;
        for(size_t scanPos = 0; scanPos < elements.size(); scanPos++)
        {
            if(scanPos >= bookmarks.size() * 2)
                for(bookmarks.clear(), i = elementList.begin(); i != elementList.end(); i++)
                    bookmarks.push_back(i);

            if(elements.at(scanPos) < elementList.front())
            {
                elementList.push_front(elements.at(scanPos));
                if(bookmarks.size() > 0)
                    bookmarks.front() = elementList.begin();
            }
            else if(elements.at(scanPos) >= elementList.back())
            {
                elementList.push_back(elements.at(scanPos));
            }
            else
            {
                size_t middle, left = 1, right = bookmarks.size() - 1;
                while(left <= right)
                {
                    middle = (left + right) / 2;
                    elements.at(scanPos) >= *bookmarks.at(middle) ? left = middle + 1 : right = middle - 1;
                }
                for(i = bookmarks.at(right); *i <= elements.at(scanPos); i++)
                    t++;
                elementList.insert(i, elements.at(scanPos));
            }
        }
        elements = vector<DataType>(elementList.begin(), elementList.end());
        fprintf(stderr, "EE %d ", t);
        return;
    }
};

// #############################################################################
// # sorting algorithm end

int main(int argc, char** argv)
{
    int n;
    double data;
    vector<double> elements;

    // read data from input stream
    scanf("%d", &n) != 1 && (exit(-1), 0);
    for(int i = 0; i < n; i++)
    {
        scanf("%lf", &data) != 1 && (exit(-1), 0);
        elements.push_back(data);
    }

    // sort elements
    BookmarkSorter<double>::sort(elements);

    // write sorted data to output stream
    for(int i = 0; i < n; i++)
        printf("%lf\n", elements.at(i));

    return 0;
}
// #############################################################################
// # EOF
```

10. Oh, sorry. The variable "t" is used only for testing, but I forgot to remove it.

to iMalc:
Then your expectations are simply wrong and don't agree with reality.
For each element, the expected number of iterations of the while loop is less than 1.5. It is easy to prove, and reality shows it is right.
The last line of my code prints the total number of iterations of the while loop.
I tested the code with 10,000 elements, 10 times; this is the result:
EE 14350
EE 14525
EE 14213
EE 14422
EE 14391
EE 14525
EE 14289
EE 14188
EE 14159
EE 14400
All tests show the while loop costs less than 1.5n iterations in total.

11. Originally Posted by RichSelian
Oh, sorry. The variable "t" is used only for testing, but I forgot to remove it.

to iMalc:

For each element, the expected number of iterations of the while loop is less than 1.5. It is easy to prove, and reality shows it is right.
The last line of my code prints the total number of iterations of the while loop.
I tested the code with 10,000 elements, 10 times; this is the result:
EE 14350
EE 14525
EE 14213
EE 14422
EE 14391
EE 14525
EE 14289
EE 14188
EE 14159
EE 14400
All tests show the while loop costs less than 1.5n iterations in total.
You simply aren't listening.
What are those numbers, and what does EE mean? Where are your flippin' units, man? Without them, the above is all meaningless. In fact, if these are time values rather than operation counts, then they are meaningless and unrepeatable anyway.
I was talking about the number of iterations of the second while loop, for data ordered such that the two halves are swapped. I now refer you to the following table, generated by putting a counter in each while loop:
Code:
```
Number    First inner loop  Second inner loop
of items: iterations:       iterations:
8         12                9
16        37                40
32        102               154
64        263               576
128       648               2190
256       1545              8492
512       3594              33386
1024      8203              132328
```
This is the worst case I have found.
Do this yourself: start with, say, 16 numbers (9,10,11,12,13,14,15,16,1,2,3,4,5,6,7,8) and confirm the above table values before you even think about disagreeing. I'm not making this up; these numbers are coming from your two inner loops. You need to approach this scientifically, being skeptical of your own results.

Yes, the number of iterations for the second loop is significantly lower for random data; for example, with 128 items the second loop iteration count can be 214, or just a little over 1.6 * n.
You are arguing about the best case (and perhaps convincing yourself that it is the worst case), but Big-Oh notation concerns itself with the worst case, and the worst known case thus far is shown in the table. Clearly the second while loop can take about n/log(n) operations.

What's more, I'm not even sure that I've found the absolute worst case; there may be a case even worse than this!

12. Sorry, I misunderstood your meaning. I thought you were talking about the average case.

The worst case is O(n^2). It is true. I have tried many ways to solve it but failed. I deleted all the "optimization" code in my latest code.

13. Originally Posted by RichSelian
Sorry, I misunderstood your meaning. I thought you were talking about the average case.

The worst case is O(n^2). It is true. I have tried many ways to solve it but failed. I deleted all the "optimization" code in my latest code.
You really have no idea what big-O notation even is, do you? "Worst case O(...)"... "Average case O(...)".

See Big O notation - Wikipedia, the free encyclopedia
O(...) IS the worst case scenario per definition.

EDIT: Ahh, ........ me. I was wrong here. Thanks laserlight :P. I should read my own links a bit better anyway.
"the worst case or average case running time or memory usage of an algorithm is often expressed as a function of the length of its input using big O notation."
I was confused with the limiting of functions. Crap.
Bring on the flames ;-). I deserve them for this :P.

14. I've also run into frustration when implementing a new sorting algorithm: it works great until you try a particular case, and try as you might, there just isn't a practical way of dealing with that case within the context of the algorithm. However, that does not mean that it is not useful. Most of the sorting algorithms out there have their uses, because minimum, maximum, or average time taken isn't the only concern when sorting. Heap sort, for example, is not as fast on average as quicksort, but its running time is almost always exactly the same, and it requires constant space.
Your algorithm is suited to when you want to take a randomly ordered vector and put the contents sorted into a list, without modifying the original vector.
