Quote:
Originally Posted by
thames
However what are the situations in which the array wouldn't be sorted?
Programming errors?
Verifying the functions work is a simple form of unit testing. When I fiddle with sort functions, I write a test harness: a main() that generates test cases, sorts them, usually also timing the execution, and verifies that the data is in sorted order.
I usually start with a test sorter, something like
Code:
static int compare_foo(const void *const ptr1, const void *const ptr2)
{
    const foo val1 = *(const foo *)ptr1;
    const foo val2 = *(const foo *)ptr2;

    if (val1 < val2)
        return -1;
    else if (val1 > val2)
        return +1;
    else
        return 0;
}

void sort_foo(foo *const data, const size_t size)
{
    qsort(data, size, sizeof (foo), compare_foo);
}
which I use to make sure the test harness works correctly, foo being the numeric data type I'm interested in at that point. (Just omit the qsort call to simulate a failing sort.)
That way you don't waste time looking for bugs all over your code, but can concentrate on doing each one thing well, and then immediately test.
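To make that concrete, here is a minimal sketch of such a harness. The names is_sorted and run_case are illustrative, and I've picked double as the foo type just for the example; a real harness would also time the sort and vary the input patterns.

```c
#include <stdio.h>
#include <stdlib.h>

typedef double foo;  /* the numeric type under test; pick whichever you need */

static int compare_foo(const void *const ptr1, const void *const ptr2)
{
    const foo val1 = *(const foo *)ptr1;
    const foo val2 = *(const foo *)ptr2;
    return (val1 < val2) ? -1 : (val1 > val2) ? +1 : 0;
}

static void sort_foo(foo *const data, const size_t size)
{
    qsort(data, size, sizeof (foo), compare_foo);
}

/* Returns 1 if the array is in nondecreasing order, 0 otherwise. */
static int is_sorted(const foo *const data, const size_t size)
{
    for (size_t i = 1; i < size; i++)
        if (data[i - 1] > data[i])
            return 0;
    return 1;
}

/* One test case: generate random data, sort it, verify the result. */
static int run_case(const size_t size)
{
    foo *const data = malloc(size * sizeof (foo));
    if (!data)
        return 0;
    for (size_t i = 0; i < size; i++)
        data[i] = (foo)rand() / (foo)RAND_MAX;
    sort_foo(data, size);
    const int ok = is_sorted(data, size);
    free(data);
    return ok;
}
```

If you comment out the qsort call inside sort_foo, run_case should start returning 0 for almost every nontrivial input, which confirms the harness actually catches failures.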
Quote:
Originally Posted by
thames
is selection sort too slow for small arrays? and bubble sort ?
No, they are not.
It is true that insertion sort is more efficient than selection sort, and selection sort is faster than bubble sort. The links are to Wikipedia articles, with pseudocode example implementations.
Bubble sort has the very nice feature that it only swaps adjacent entries. With the common early-exit check (stop after a pass that makes no swaps), it also does essentially no extra work if the array is already sorted: a single verification pass, and it's done.
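A sketch of bubble sort with that early-exit behavior (using int data for brevity; tracking the position of the last swap also lets each pass shrink faster than the classic version):

```c
#include <stddef.h>

/* Bubble sort over ints. Everything past the last swap of a pass is
   already in place, so the next pass stops there; if a pass makes no
   swaps at all, the array is sorted and we stop immediately. */
static void bubble_sort(int *const data, const size_t size)
{
    size_t n = size;
    while (n > 1) {
        size_t last_swap = 0;
        for (size_t i = 1; i < n; i++) {
            if (data[i - 1] > data[i]) {
                const int tmp = data[i - 1];
                data[i - 1]   = data[i];
                data[i]       = tmp;
                last_swap     = i;
            }
        }
        n = last_swap;  /* 0 if no swaps: loop ends */
    }
}
```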
Selection sort is easy to program correctly, and uses at most as many swaps as there are elements in the array. It does do a large number of comparisons, however.
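For comparison, a selection sort sketch (again on ints): the swap count is bounded by the element count, but the nested loop always does the full set of comparisons.

```c
#include <stddef.h>

/* Selection sort over ints: find the minimum of the unsorted tail and
   swap it into place. At most size - 1 swaps, but always roughly
   size * size / 2 comparisons. */
static void selection_sort(int *const data, const size_t size)
{
    for (size_t i = 0; i + 1 < size; i++) {
        size_t min = i;
        for (size_t j = i + 1; j < size; j++)
            if (data[j] < data[min])
                min = j;
        if (min != i) {
            const int tmp = data[i];
            data[i]       = data[min];
            data[min]     = tmp;
        }
    }
}
```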
Insertion sort does fewer comparisons, and instead of swaps, moves parts of the array to be able to insert the value in the correct position.
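An insertion sort sketch showing that difference: no swaps at all, just a shift of the sorted prefix to open a slot for the saved value.

```c
#include <stddef.h>

/* Insertion sort over ints: shift larger entries of the sorted prefix
   one slot to the right (moves, not swaps), then drop the saved value
   into the opened position. */
static void insertion_sort(int *const data, const size_t size)
{
    for (size_t i = 1; i < size; i++) {
        const int value = data[i];
        size_t j = i;
        while (j > 0 && data[j - 1] > value) {
            data[j] = data[j - 1];  /* move, not swap */
            j--;
        }
        data[j] = value;
    }
}
```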
In other words, the sort algorithms have different features. I personally think that alone makes them all worth learning, because it shows how you can use different approaches to crack a nut. There is not just one vendor-provided nutcracker you need to use for everything; you have an entire toolbox to choose from:
Pick your tools wisely, with consideration; not with haste and conviction.
For the optimized quicksort case, where a faster-for-smaller-arrays sort function is called when the part to sort is small, insertion sort does beat selection and bubble sort. Because quicksort is designed for large arrays, a single quicksort() invocation will involve a lot of those calls. So, picking one that is efficient means a more efficient quicksort() overall.
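The structure of such a hybrid might look like this sketch (int data, a simple last-element Lomuto partition, and a cutoff of 16 that is purely illustrative; a real implementation would pick the pivot more carefully and tune the cutoff by measuring):

```c
#include <stddef.h>

#define CUTOFF 16  /* hypothetical threshold; tune by benchmarking */

/* Small parts are handed to insertion sort. */
static void small_sort(int *const data, const size_t size)
{
    for (size_t i = 1; i < size; i++) {
        const int value = data[i];
        size_t j = i;
        while (j > 0 && data[j - 1] > value) {
            data[j] = data[j - 1];
            j--;
        }
        data[j] = value;
    }
}

/* Hybrid quicksort: partition large parts, defer small ones. */
static void hybrid_quicksort(int *const data, const size_t size)
{
    if (size <= CUTOFF) {
        small_sort(data, size);
        return;
    }

    /* Lomuto partition with the last element as pivot. */
    const int pivot = data[size - 1];
    size_t i = 0;
    for (size_t j = 0; j + 1 < size; j++) {
        if (data[j] < pivot) {
            const int tmp = data[i];
            data[i]       = data[j];
            data[j]       = tmp;
            i++;
        }
    }
    const int tmp  = data[i];
    data[i]        = data[size - 1];
    data[size - 1] = tmp;

    hybrid_quicksort(data, i);
    hybrid_quicksort(data + i + 1, size - i - 1);
}
```

Because every part at or below the cutoff goes through small_sort, a single call on a large array triggers many of those small sorts, which is exactly why the choice of the small-array algorithm matters.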
Examples:
If you have a small array of relatively large structures, both bubble sort and insertion sort would mean swapping or moving the data in those structures a lot. The amount of memory copied might be the bottleneck here. And there may be other reasons you don't want to modify the data structure or organization. With selection sort, you know you need very few swaps (at most the number of elements in the array). So, you could use selection sort for that case.
If you have a small array of data you know is almost surely sorted, you could choose between insertion sort and bubble sort. Bubble sort uses swaps, whereas insertion sort uses insertion/memory moves. Bubble sort is slower, but since the array is small, the execution times you're comparing are "very short" and "even shorter": usually lost in the noise.
Of course, a good programmer will eventually want their solution to be robust, and avoid making unconfirmed assumptions. So, instead of assuming there are just a few elements of data, you'll eventually use a sort function that checks, and uses the appropriate algorithm for the data at hand. Even then, the most important thing, in my opinion, is to document the logic, the intentions you have as a programmer, in the source code, in descriptive comments: not what the code does, but why.
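Such a checking sort function could be as simple as this sketch (the names, the threshold of 16, and the specific choices are all illustrative, not a recommendation):

```c
#include <stddef.h>
#include <stdlib.h>

static int compare_int(const void *const a, const void *const b)
{
    const int x = *(const int *)a;
    const int y = *(const int *)b;
    return (x < y) ? -1 : (x > y) ? +1 : 0;
}

/* Dispatch based on the actual data rather than on assumptions:
   already-sorted input costs one verification pass, small input
   gets insertion sort, anything else goes to the library qsort. */
static void smart_sort(int *const data, const size_t size)
{
    size_t i;

    /* Already sorted? Then there is nothing to do. */
    for (i = 1; i < size && data[i - 1] <= data[i]; i++)
        ;
    if (i >= size)
        return;

    if (size <= 16) {
        /* Small: insertion sort. */
        for (size_t k = 1; k < size; k++) {
            const int value = data[k];
            size_t j = k;
            while (j > 0 && data[j - 1] > value) {
                data[j] = data[j - 1];
                j--;
            }
            data[j] = value;
        }
    } else {
        /* Large: leave it to the generic library sort. */
        qsort(data, size, sizeof (int), compare_int);
    }
}
```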
That way, when the purpose of the code changes, you or a future developer can check the code, see the assumptions made and their limitations, and adjust the code accordingly. Even things like cache architecture developments may change which algorithm turns out most efficient in practice, so knowing and documenting those reasons is very useful in the long term.