I have written a quicksort algo that works fine on a array of ints. My problem however is that I will be sorting a few millions structs based on its int member.
My task has to split the few millions structs into sections of 750. So I will have 750 structs then list will be another 750 structs and then list will have another 750 all the way to a few million. So my question is how can I sort the array when it is structured in this seperated way, or can I just create the structs in one large array say struct MyList mine and then input the sorted data into the seperated sections.
I hope this is clear, thanks.
A bit more explanation, I would have a structure like:
struct obj->int value to sort by
So in this case there would be 10 entries of 6 structs. So 60 obj structs all together that need to be sorted by their int value. However in real scale there would be maybe 3-4 million obj structs, so I dont think I can do struct obj list and sort that, I would have to sort it in its seperated form.
Ok, how do you actually want it sorted? Use this example:
1 2 3 4 5 6 7 8
You have eight numbers, in random order. You have 2 buckets (not 750). Now how do they end up?
1 2 3 4
5 6 7 1
1 3 5 7
2 4 6 8
How do you decide which ones go in what bucket? Do they all arrive to you at the same time? Do you have all the data at once? Does it trickle in, or what? You need to think about all of that, before you actually worry about writing the sorting code.
Hi, I will read the numbers from a file and load them all into the structure, they will be in total random order so I want to sort them sequentially so it would become:
Originally Posted by quzah
1234 - bucket 1
5678 - bucket 2
At the beginning it will be random so:
5634 - bucket 1
2187 - bucket 2
The question is whether the multiple separate lists are alright being sorted each on their own independently, or whether you want all 3000000 structs in order. You haven't stated that clearly.
Hi Yes I would like all 3000000 structs in order.
Originally Posted by nonoob
Like at the moment I am trying to get it to work with an array of 3 buckets of 4 objects. I am getting lost in it thought it doesn't seem to be picking the right elements to swap at all!
This assumes you're making a dynamic 2D array of objects. You could have just as easily went 1D.
#define BUCKETS 750
#define BINS 4000
#define INVALIDBINORBUCKET (BUCKETS*BINS)
#define WHATBUCKET(x) ((x)/BINS)
#define WHATBIN(x) ((x)%BINS)
type put( type**data, type object, type where )
if( where > 0 && where < INVALIDBINORBUCKET )
type bin = WHATBIN( where );
type bucket = WHATBUCKET( where );
data[ bucket ][ bin ] = object;
You may want to place things where you think they should be as you read them in. Take some value, decide where that value should roughly fall, and stick it in that bin/bucket. Then later run through and sort the whole thing (I guess).
Originally Posted by quzah
Thanks for the response. Your code does not do any sorting though, I am already at the stage where I have all the data in the buckets but it is in random order. So I know need to sort it, normally in quicksort I just keep track of the left and right markers but the fact there are buckets and bins makes this a lot harder which is the bit I cannot figure out?
Why are you splitting your input into buckets?
Does the bucketing have to be done before the sort (if so, why)?
Do you have to use quicksort?
What is the range of the int you're sorting on?
How big is your struct?
If you want to sort each bucket separately then go ahead. If you want the entire collection of 3000000 entries sorted you'll need to merge the lists. Seems simple.
Okay so you have an array of linked-lists (all of which happen to be the same length) and you want to sort is such that of all the n elements, the k lowest are in the first list and the next k lowest are in the next list etc. Clearly items do not necessarily end up in the bucket they started in.
What this all says to me is that you should do this via the following three steps:
Step 1: Join all of the linked-lists together
Step 2: Use a linked-list implementation of a sorting algorithm
Step 3: Cut the list up into lists of the original size and store each head back into the original array.
So, I hate to break it to you, but your array-based implementation of quicksort is not going to be of much use to you here.
It looks to me like the only reason for having the array of lists at all as opposed to keeping it as one list afterwards, is that you can perform lookups faster on this structure by doing a binary search based on the head items in the array. This gives you O(log(n/k) + k) lookup time, where k is the list length and n is the total number of items. I would just pick a better data structure than a list to begin with.
Hi you are right, I am doing it this way so I can search the structure quickly once it is all ordered.
Originally Posted by iMalc
No it doesn't have to be done before the bucketing and infact seem easier to do before! but how can I do that if i have millions of structs as struct obj arrayname just seg faults when I run it?
Originally Posted by oogabooga
You didn't say how big your struct is.
If your problem is that you're declaring the array locally and it won't fit on the stack, you could try declaring it globally.
If you are doing what you had stated earlier, then you have no need for such a large array. Why do you now want a large array instead of a smaller array of linked-lists?
Originally Posted by zeronero
Or rather, once you make your mind up about how you plan to store the data, please ask a specific question on what part you are having trouble with. It's harder to help a moving target.