![]() |
| | #1 |
| Registered User Join Date: Dec 2008
Posts: 2
| I'm comparing Heapsort, Mergesort, and Quicksort, which operate on an array of randomly generated 16-bit shorts between -32768 to 32767 using srand() from <ctime> with seed time(0) (I chose short instead of int so that I could test the algorithms on large arrays while consuming half the memory that would be consumed by an array of ints). I've compiled these with g++. For array sizes up to 10 million, the times that I get are pretty much what one would expect, i.e. quicksort outperforms mergesort, which outperforms heapsort. But for more than 20 million elements, quicksort performs worse than mergesort, and for more than 50 million elements, quicksort performs worse than heapsort. This is contrary to my understanding that on average, quicksort is the best performing algorithm of the three. And since these numbers are randomly generated, we expect that quicksort's worst case is not encountered. Here is my Quicksort: Code:
int partition(short * nums, int lbound, int rbound)
{
int pivotIndex = rbound - 1;
while (lbound != pivotIndex)
{
if (nums[lbound] > nums[rbound])
{
short tmp = nums[lbound];
nums[lbound] = nums[pivotIndex];
nums[pivotIndex] = tmp;
pivotIndex--;
}
else
lbound++;
}
if (nums[pivotIndex] < nums[rbound])
pivotIndex++;
short tmp = nums[pivotIndex];
nums[pivotIndex] = nums[rbound];
nums[rbound] = tmp;
return pivotIndex;
}
void quickSort(short * nums, int lbound, int rbound)
{
int size = rbound - lbound + 1;
if (size <= 1)
return;
int pivotIndex = partition(nums, lbound, rbound);
quickSort(nums, lbound, pivotIndex - 1);
quickSort(nums, pivotIndex + 1, rbound);
}
void quickSort(short * nums, int size)
{
quickSort(nums, 0, size - 1);
}
For reference, here is what I have for Mergesort: Code:
void mergeSort(short * nums, int size)
{
if (size == 0 || size == 1)
return;
short * lnums;
short * rnums;
if (size <= STACKLIMIT_MERGE)
{
lnums = static_cast<short *>(alloca((size/2) * sizeof(short)));
rnums = static_cast<short *>(alloca((size - size/2) * sizeof(short)));
}
else
{
lnums = new short[size/2];
rnums = new short[size - size/2];
}
for (int i = 0; i < size/2; i++)
{
lnums[i] = nums[i];
rnums[i] = nums[size/2 + i];
}
if (size % 2 != 0)
rnums[size - size/2 - 1] = nums[size - 1];
mergeSort(lnums, size/2);
mergeSort(rnums, size - size/2);
int lindex = 0, rindex = 0;
for (int i = 0; i < size; i++)
{
if ((rindex != size - size/2) && (lindex == size/2 || rnums[rindex] < lnums[lindex]))
{
nums[i] = rnums[rindex];
rindex++;
}
else
{
nums[i] = lnums[lindex];
lindex++;
}
}
if (size > STACKLIMIT_MERGE)
{
delete [] lnums;
delete [] rnums;
}
}
Finally, my heapsort is as follows: Code:
void heapify(short * nums, int size) //max heap
{
int last = (int)(ceil((log((double)(size + 1)) / log((double)2))));
last = (int)(pow((float)2, last - 1) - 2);
for (int i = last; i >= 0; i--)
{
int j = i;
int k = ((2 * j + 2) < size) ?
((nums[2 * j + 1] > nums[2 * j + 2]) ? (2 * j + 1) : (2 * j + 2)) :
(((2 * j + 1) < size) ? (2 * j + 1) : -1);
if (k == -1)
continue;
while (nums[k] > nums[j])
{
short swp = nums[k];
nums[k] = nums[j];
nums[j] = swp;
j = k;
k = ((2 * k + 2) < size) ?
((nums[2 * k + 1] > nums[2 * k + 2]) ? (2 * k + 1) : (2 * k + 2)) :
(((2 * k + 1) < size) ? (2 * k + 1) : -1);
if (k == -1)
break;
}
}
}
void heapSort(short * nums, int size)
{
heapify(nums, size);
int index = size - 1;
while (index >= 0)
{
short root = nums[0];
nums[0] = nums[index--];
nums[index + 1] = root; //Critical step
if (index <= 0)
continue;
int j = 0;
int k = (index == 1) || (nums[1] > nums[2]) ? 1 : 2;
while (nums[k] > nums[j]) //rebuild the heap
{
short swp = nums[k];
nums[k] = nums[j];
nums[j] = swp;
j = k;
k = ((2 * k + 2) <= index) ?
((nums[2 * k + 1] > nums[2 * k + 2]) ? (2 * k + 1) : (2 * k + 2)) :
(((2 * k + 1) <= index) ? (2 * k + 1) : -1);
if (k == -1)
break;
}
}
}
Here are some numbers for reference: 1 million elements - H: 0.641s, M: 0.625s, Q: 0.360s 10 million elements - H: 10.610s, M: 7.172s, Q: 5.922s 20 million elements - H: 24.188s, M: 14.765s, Q: 16.875s 50 million elements - H: 72.812s, M: 38.250s, Q: 78.313s I'm running on Windows XP SP2, 2GB RAM, Intel x86 Centrino Duo (2 x 1.60 GHz), 120GB disk space (20GB free). I realize much of this info might be irrelevant. Any insight into my issue would be much appreciated. |
| hbejkosa is offline | |
| | #2 | |
| Registered User Join Date: Dec 2008
Posts: 71
| Quote:
Anyhow, I think you oversimplified the quicksort implementation a bit too much. Generally once the partition gets below some threshold, the typical thing to do is resort to introsort or any other O(n^2) algorithm. Using a median of 3 to select a pivot will also help quite a bit. I also think your heapsort implementation could use some work, specially when you impose heap properties on the array. As far as I know, most algorithms sift down the upper and lower parts separatedly (IE: You don't need to compute the logarithms) Here is a nice article regarding the 3 methods you are analyzing with source code, which will most likely give more insight than I can. Also note that most quicksort implementations use heapsort when the recursion depth gets too high (introsort), which faster than any of these 3 methods. | |
| Ronix is offline | |
| | #3 | |
| Algorithm Dissector Join Date: Dec 2005 Location: New Zealand
Posts: 2,475
| Quote:
The problem is that the partition step posted, is broken. When you find that your lbound item is greater than your rbound item you're swapping the lower bound one with the pivot. That's not a good method to use as the pivot value must not change during the partitioning process. You want to seek from the lbound end upwards until you find an item that is greater than the pivot, then search for an item from the rbound end downwards until you find one that it less than the pivot, and then swap the items, moving inwards one more, and continue until those indexes cross over. You need to actually copy the pivot item to somewhere and keep it there during the entire pivot process, such as a seperate variable or to the first position in the array etc. At the end the pivot value is put back in place. I have some rather well optimised implementations on my site (see below), but I don't have a particularly good description of how Quicksort partitioning works. Try here for that. Another good idea is that with your MergeSort, you may notice that the largest amount of extra memory allocated during the merge process is equal to the size of the original array. You improve things a lot by simply allocating one extra array of the same size as the original, up front, and just use this as your one and only temp buffer the whole way through. Many allocations and deallocations equals slow, no matter how you do it. Your HeapSort looks a bit nasty. HeapSort should definitely never use floating point math. Also, your calculations for k are extremely hard to read. The inner while loops are the same in both functions. That's a lot of duplicated code that should be made into a function. If you move the outer loop in heapify back into the main function, then you can simply call heapify again - much shorter!
__________________ My homepage Advice: Take only as directed - If symptoms persist, please see your debugger | |
| iMalc is offline | |
| | #4 |
| Registered User Join Date: Dec 2008
Posts: 2
| Thanks man! The partition algorithm in your link solved my scalability problems. Yes, my apologies, my heapsort code is poorly commented. The idea behind the k calculations is that (in a heap context) I want to find the index of the larger child node of the j index if it exists (hence the 2j + 1 and 2j + 2). The idea behind the floating point math is to find the index that is the end of the last full level in the heap, so that we don't have to waste time with the last level. I imagine that this is still not optimal. The reason I'm writing these, by the way, is for a school project, so all I really need is steady implementations across the board. I hope you don't mind if I consult your code. |
| hbejkosa is offline | |
![]() |
| Tags |
| algorithm, problem, quicksort, sort, sorting |
| Thread Tools | |
| Display Modes | |
|
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| pointer problems | BradC78 | C Programming | 9 | 02-19-2009 12:14 AM |
| No clue how to make a code to solve problems! | ctnzn | C Programming | 8 | 10-16-2008 02:59 AM |
| Problems with Quicksort and Pointers | algatt | C Programming | 3 | 10-10-2007 04:24 AM |
| Problems implementing quicksort function | rvanbeusichem | C Programming | 3 | 11-24-2004 09:00 PM |
| Quicksort | Nutshell | C Programming | 1 | 01-15-2002 09:42 AM |