# Which sort algorithm would U use to sort an array of million integers between 0 & 99

• 05-30-2012
Ben Frankliner
Which sort algorithm would U use to sort an array of million integers between 0 & 99
I'm leaning towards Timsort since it kicks ash in Python

What are your thoughts? Pros & Cons?
• 05-30-2012
laserlight
The constraints make it sound like counting sort is the way to go.
• 05-30-2012
Neo1
Definitely counting sort, runs in O(n + m), where n is a million and m is 99 in your case. The con is that it also uses up O(n + m) space, which is quite alot when n = 1.000.000, but it will run in linear time.

If you use Quicksort you can get away with O(log n) space usage, but then you would have O(n log n) running time on average.

There is no perfect algorithm for the job. Which is most constrained in your case? Space or speed?
• 05-30-2012
brewbuck
Quote:

Originally Posted by Neo1
Definitely counting sort, runs in O(n + m), where n is a million and m is 99 in your case. The con is that it also uses up O(n + m) space, which is quite alot when n = 1.000.000, but it will run in linear time.

If you use Quicksort you can get away with O(log n) space usage, but then you would have O(n log n) running time on average.

There is no perfect algorithm for the job. Which is most constrained in your case? Space or speed?

I'm not following the O(n + m) space requirement. Surely it is O(m), because you just need m counters. Or are you including the size of the output in the space requirement (which IMHO it should not be)
• 05-30-2012
Neo1
Quote:

Originally Posted by brewbuck
I'm not following the O(n + m) space requirement. Surely it is O(m), because you just need m counters. Or are you including the size of the output in the space requirement (which IMHO it should not be)

I'm including the size of the output because counting sort cannot be performed in-place, so the OP would need an extra array of size 1.000.000 for the output, whereas with various other sorting algorithms you wouldn't need an output array at all.

Wikipedia seems to have a different opinion than me though :-) It seems there is a version of counting sort that can be performed in-place, however it won't be stable then. In this case the space requirements would be just O(m). I didn't know of this special version of counting sort when i first replied.
• 05-30-2012
phantomotap
O_o

The space requirement for the generic "Counting Sort" is 'O(n + m)'. Period. The output array is memory that is required by the algorithm. Pretending that memory doesn't count because "it is an output" is wrong. Pure and simple. If I tell you that an algorithm requires 'O(m)' extra space when it requires 'O(n + m)' I'm lying. You have to account for that space because choosing between two algorithms may come down to space requirements and some algorithms do only use 'O(m)' space.

A dozen or so special purpose variations of "Counting Sort" exist. They each have different "trade-offs". You can't properly choose between these variations if you start excluding the details.

Soma
• 05-30-2012
stahta01
I would do it using a 100 element array of integers.

Note: The solution is NOT really a sorting algorithm; but, the output is based on the input.

It is more like an data compression and sort algorithm combined.

Tim S.
• 05-30-2012
manasij7479
Quote:

Originally Posted by stahta01
I would do it using a 100 element array of integers.

Note: The solution is NOT really a sorting algorithm; but, the output is based on the input.

It is more like an data compression and sort algorithm combined.

Tim S.

+1
It'd work great here.
I learnt this technique when thinking about an exercise in "Programming Pearls".
• 05-30-2012
phantomotap
Quote:

It is more like an data compression and sort algorithm combined.
If you are curious, this is an instance of a "distribution sort".

It is conceptually the "first half" of "Counting Sort" optimized for integers values as keys.

Soma
• 05-30-2012
brewbuck
Choosing between an in-place and an out-of-place sort does not require an ivory tower analysis.
• 05-31-2012
iMalc
Quote:

Originally Posted by stahta01
I would do it using a 100 element array of integers.

Note: The solution is NOT really a sorting algorithm; but, the output is based on the input.

It is more like an data compression and sort algorithm combined.

Tim S.

Precisely.
And just to be clear, the array of 100 counts allows you to reconstruct the sorted data over top of the original data, should you not need to retain the original data.

If you really really needed maximum speed, then a counting sort with parameters such as those that are proposed here would lead itself to parallelism fairly well. Say four threads each processing a quarter of the input, and four arrays of 100 items that can be added together prior to data reconstruction, would kick ass.

Of course having such perfect parameters such as the stated problem practically never happens in practice.