# Quicksort help!

Show 80 post(s) from this thread on one page
Page 1 of 2 12 Last
• 02-17-2012
zeronero
Quicksort help!
Hi all,

I have written a quicksort algo that works fine on a array of ints. My problem however is that I will be sorting a few millions structs based on its int member.

My task has to split the few millions structs into sections of 750. So I will have 750 structs then list[1] will be another 750 structs and then list[2] will have another 750 all the way to a few million. So my question is how can I sort the array when it is structured in this seperated way, or can I just create the structs in one large array say struct MyList mine[4000000] and then input the sorted data into the seperated sections.

I hope this is clear, thanks.

A bit more explanation, I would have a structure like:

list[10]
section[6]
struct obj->int value to sort by

So in this case there would be 10 entries of 6 structs. So 60 obj structs all together that need to be sorted by their int value. However in real scale there would be maybe 3-4 million obj structs, so I dont think I can do struct obj list[3000000] and sort that, I would have to sort it in its seperated form.
• 02-17-2012
quzah
Ok, how do you actually want it sorted? Use this example:

1 2 3 4 5 6 7 8

You have eight numbers, in random order. You have 2 buckets (not 750). Now how do they end up?

1 2 3 4
5 6 7 1

1 3 5 7
2 4 6 8

Or what?

How do you decide which ones go in what bucket? Do they all arrive to you at the same time? Do you have all the data at once? Does it trickle in, or what? You need to think about all of that, before you actually worry about writing the sorting code.

Quzah.
• 02-17-2012
zeronero
Quote:

Originally Posted by quzah
Ok, how do you actually want it sorted? Use this example:

1 2 3 4 5 6 7 8

You have eight numbers, in random order. You have 2 buckets (not 750). Now how do they end up?

1 2 3 4
5 6 7 1

1 3 5 7
2 4 6 8

Or what?

How do you decide which ones go in what bucket? Do they all arrive to you at the same time? Do you have all the data at once? Does it trickle in, or what? You need to think about all of that, before you actually worry about writing the sorting code.

Quzah.

Hi, I will read the numbers from a file and load them all into the structure, they will be in total random order so I want to sort them sequentially so it would become:

1234 - bucket 1
5678 - bucket 2
etc...

At the beginning it will be random so:

5634 - bucket 1
2187 - bucket 2
• 02-17-2012
nonoob
The question is whether the multiple separate lists are alright being sorted each on their own independently, or whether you want all 3000000 structs in order. You haven't stated that clearly.
• 02-17-2012
zeronero
Quote:

Originally Posted by nonoob
The question is whether the multiple separate lists are alright being sorted each on their own independently, or whether you want all 3000000 structs in order. You haven't stated that clearly.

Hi Yes I would like all 3000000 structs in order.
• 02-17-2012
zeronero
Like at the moment I am trying to get it to work with an array of 3 buckets of 4 objects. I am getting lost in it thought it doesn't seem to be picking the right elements to swap at all!
• 02-17-2012
quzah
Code:

```#define BUCKETS 750 #define BINS 4000 #define INVALIDBINORBUCKET (BUCKETS*BINS) #define WHATBUCKET(x) ((x)/BINS) #define WHATBIN(x) ((x)%BINS) ... type put( type**data, type object, type where ) {     if( where > 0 && where < INVALIDBINORBUCKET )     {         type bin = WHATBIN( where );         type bucket = WHATBUCKET( where );         data[ bucket ][ bin ] = object;         return 0;     }     return INVALIDBINORBUCKET; }```
This assumes you're making a dynamic 2D array of objects. You could have just as easily went 1D.

You may want to place things where you think they should be as you read them in. Take some value, decide where that value should roughly fall, and stick it in that bin/bucket. Then later run through and sort the whole thing (I guess).

Quzah.
• 02-17-2012
zeronero
Quote:

Originally Posted by quzah
Code:

```#define BUCKETS 750 #define BINS 4000 #define INVALIDBINORBUCKET (BUCKETS*BINS) #define WHATBUCKET(x) ((x)/BINS) #define WHATBIN(x) ((x)%BINS) ... type put( type**data, type object, type where ) {     if( where > 0 && where < INVALIDBINORBUCKET )     {         type bin = WHATBIN( where );         type bucket = WHATBUCKET( where );         data[ bucket ][ bin ] = object;         return 0;     }     return INVALIDBINORBUCKET; }```
This assumes you're making a dynamic 2D array of objects. You could have just as easily went 1D.

You may want to place things where you think they should be as you read them in. Take some value, decide where that value should roughly fall, and stick it in that bin/bucket. Then later run through and sort the whole thing (I guess).

Quzah.

Hi Quzah,

Thanks for the response. Your code does not do any sorting though, I am already at the stage where I have all the data in the buckets but it is in random order. So I know need to sort it, normally in quicksort I just keep track of the left and right markers but the fact there are buckets and bins makes this a lot harder which is the bit I cannot figure out?
• 02-17-2012
oogabooga
Why are you splitting your input into buckets?

Does the bucketing have to be done before the sort (if so, why)?

Do you have to use quicksort?

What is the range of the int you're sorting on?

• 02-17-2012
nonoob
If you want to sort each bucket separately then go ahead. If you want the entire collection of 3000000 entries sorted you'll need to merge the lists. Seems simple.
• 02-17-2012
iMalc
Okay so you have an array of linked-lists (all of which happen to be the same length) and you want to sort is such that of all the n elements, the k lowest are in the first list and the next k lowest are in the next list etc. Clearly items do not necessarily end up in the bucket they started in.

What this all says to me is that you should do this via the following three steps:
Step 1: Join all of the linked-lists together
Step 2: Use a linked-list implementation of a sorting algorithm
Step 3: Cut the list up into lists of the original size and store each head back into the original array.

So, I hate to break it to you, but your array-based implementation of quicksort is not going to be of much use to you here.

It looks to me like the only reason for having the array of lists at all as opposed to keeping it as one list afterwards, is that you can perform lookups faster on this structure by doing a binary search based on the head items in the array. This gives you O(log(n/k) + k) lookup time, where k is the list length and n is the total number of items. I would just pick a better data structure than a list to begin with.
• 02-17-2012
zeronero
Quote:

Originally Posted by iMalc
Okay so you have an array of linked-lists (all of which happen to be the same length) and you want to sort is such that of all the n elements, the k lowest are in the first list and the next k lowest are in the next list etc. Clearly items do not necessarily end up in the bucket they started in.

What this all says to me is that you should do this via the following three steps:
Step 1: Join all of the linked-lists together
Step 2: Use a linked-list implementation of a sorting algorithm
Step 3: Cut the list up into lists of the original size and store each head back into the original array.

So, I hate to break it to you, but your array-based implementation of quicksort is not going to be of much use to you here.

It looks to me like the only reason for having the array of lists at all as opposed to keeping it as one list afterwards, is that you can perform lookups faster on this structure by doing a binary search based on the head items in the array. This gives you O(log(n/k) + k) lookup time, where k is the list length and n is the total number of items. I would just pick a better data structure than a list to begin with.

Hi you are right, I am doing it this way so I can search the structure quickly once it is all ordered.
• 02-17-2012
zeronero
Quote:

Originally Posted by oogabooga
Why are you splitting your input into buckets?

Does the bucketing have to be done before the sort (if so, why)?

Do you have to use quicksort?

What is the range of the int you're sorting on?

No it doesn't have to be done before the bucketing and infact seem easier to do before! but how can I do that if i have millions of structs as struct obj arrayname[3000000] just seg faults when I run it?
• 02-17-2012
oogabooga
You didn't say how big your struct is.

If your problem is that you're declaring the array locally and it won't fit on the stack, you could try declaring it globally.
• 02-17-2012
iMalc
Quote:

Originally Posted by zeronero
No it doesn't have to be done before the bucketing and infact seem easier to do before! but how can I do that if i have millions of structs as struct obj arrayname[3000000] just seg faults when I run it?

If you are doing what you had stated earlier, then you have no need for such a large array. Why do you now want a large array instead of a smaller array of linked-lists?

Or rather, once you make your mind up about how you plan to store the data, please ask a specific question on what part you are having trouble with. It's harder to help a moving target.
Show 80 post(s) from this thread on one page
Page 1 of 2 12 Last