1. ## Sorting list performance

Hi all,

I have a list that contains about 200000 elements and I want to keep them in sorted order, so I did some testing with insertion sort and merge sort, but it seems that insertion sort is faster!! Is that normal? How can I improve the time and CPU performance? Thanks for your help.

My node struct (for simplicity, I omitted the prev pointer):

Code:
```c
struct node {
    int number;
    struct node *next;
};
```
Insertion sort:

Code:
```c
struct node *head = NULL;   /* global head of the list */

void insert_node(int value) {
    struct node *new_node = NULL;
    struct node *cur_node = NULL;
    struct node *last_node = NULL;
    int found; /* 1 means we found a place to insert the new node, 0 means not */

    new_node = (struct node *)malloc(sizeof(struct node)); /* was sizeof(struct node *): too small! */
    if (new_node == NULL) {
        printf("memory problem\n");
        return;
    }
    new_node->number = value;

    /* If the list is empty */
    if (head == NULL) {
        new_node->next = NULL;
        head = new_node;
    }
    /* New smallest element: insert before the head */
    else if (new_node->number < head->number) {
        new_node->next = head;
        head = new_node;
    }
    else {
        found = 0;
        last_node = head;
        cur_node = head->next;
        while ((cur_node != NULL) && (found == 0)) {
            if (new_node->number < cur_node->number) {
                found = 1;
            }
            else {
                last_node = cur_node;
                cur_node = cur_node->next;
            }
        }
        /* We found the right place to insert our node */
        if (found == 1) {
            last_node->next = new_node;
            new_node->next = cur_node;
        }
        /* Insert at the tail of the list */
        else {
            last_node->next = new_node;
            new_node->next = NULL;
        }
    }
}
```
Merge sort:

Code:
```c
/* add a node to the linked list */
struct node *addnode(int number, struct node *next) {
    struct node *tnode;

    tnode = (struct node *)malloc(sizeof(*tnode));

    if (tnode != NULL) {
        tnode->number = number;
        tnode->next = next;
    }

    return tnode;
}

/* perform merge sort on the linked list */
struct node *merge_sort(struct node *head) {
    /* split the list into head_one and head_two ... */

    /* recursively sort both halves ... */

    /* merge the lists ... */
}
```

2. A proper merge sort does not go "all the way down". It's usual to use a sort with less overhead to sort the lists when they become "small" (small is a relative term, but maybe 10 or maybe even 100 ... other people here know more about that kind of thing).

Also, it would be nice if you could post some runnable code (complete with timing code).
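To make the cutoff idea concrete, here is a rough sketch of a hybrid merge sort on the node struct from the first post. The `CUTOFF` value, the helper names, and the slow/fast-pointer split are all my own choices for illustration, not code from this thread:

```c
#include <stdlib.h>

#define CUTOFF 16  /* sublists at or below this length use insertion sort */

struct node {
    int number;
    struct node *next;
};

/* insertion sort on a short list; returns the new head */
static struct node *insertion_sort(struct node *head)
{
    struct node *sorted = NULL;
    while (head != NULL) {
        struct node *cur = head;
        head = head->next;
        if (sorted == NULL || cur->number < sorted->number) {
            cur->next = sorted;          /* new smallest: push to front */
            sorted = cur;
        } else {
            struct node *p = sorted;     /* walk to the insertion point */
            while (p->next != NULL && p->next->number <= cur->number)
                p = p->next;
            cur->next = p->next;
            p->next = cur;
        }
    }
    return sorted;
}

static size_t list_length(struct node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* merge two already-sorted lists iteratively */
static struct node *merge(struct node *a, struct node *b)
{
    struct node dummy, *tail = &dummy;
    while (a != NULL && b != NULL) {
        if (a->number <= b->number) { tail->next = a; a = a->next; }
        else                        { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;
    return dummy.next;
}

/* merge sort that switches to insertion sort for short sublists */
struct node *hybrid_merge_sort(struct node *head)
{
    struct node *slow, *fast, *second;

    if (head == NULL || head->next == NULL)
        return head;
    if (list_length(head) <= CUTOFF)
        return insertion_sort(head);

    /* split the list in half with slow/fast pointers */
    slow = head;
    fast = head->next;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    second = slow->next;
    slow->next = NULL;

    return merge(hybrid_merge_sort(head), hybrid_merge_sort(second));
}
```

Whether 16 is the right cutoff is machine-dependent; it's worth timing a few values.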

3. Well, both have a complexity of O(n log n).
And (afaik) you can't do much better than that for sorting without knowing what is going to be in the list.
But I've read that merge sort is supposed to be the better choice for sorting lists.
Maybe some other factors are coming into play here? (Don't know what..)

> how can i improve time and CPU performances?
Threading the merge sort can be a good thing to do. (And since you have a list, you may not have to worry about some of the pitfalls you would hit with arrays.)
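As an illustration of the threading idea (not a drop-in for the code above): a minimal POSIX-threads sketch that sorts the two halves concurrently down to a small depth limit. The `depth` cutoff and all the names are my own assumptions; whether it helps at all depends on list size versus thread startup cost, so measure it:

```c
#include <pthread.h>
#include <stdlib.h>

struct node {
    int number;
    struct node *next;
};

struct sort_job {
    struct node *head;  /* in: unsorted half; out: sorted half */
    int depth;          /* remaining levels that may spawn a thread */
};

/* merge two already-sorted lists iteratively */
static struct node *merge(struct node *a, struct node *b)
{
    struct node dummy, *tail = &dummy;
    while (a != NULL && b != NULL) {
        if (a->number <= b->number) { tail->next = a; a = a->next; }
        else                        { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;
    return dummy.next;
}

static void *sort_worker(void *arg);

struct node *threaded_merge_sort(struct node *head, int depth)
{
    struct node *slow, *fast, *second;

    if (head == NULL || head->next == NULL)
        return head;

    /* split the list in half with slow/fast pointers */
    slow = head;
    fast = head->next;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }
    second = slow->next;
    slow->next = NULL;

    if (depth > 0) {
        /* sort the first half in a new thread, the second in this one */
        pthread_t tid;
        struct sort_job job = { head, depth - 1 };
        if (pthread_create(&tid, NULL, sort_worker, &job) == 0) {
            second = threaded_merge_sort(second, depth - 1);
            pthread_join(tid, NULL);           /* job.head is safe to read now */
            return merge(job.head, second);
        }
        /* thread creation failed: fall through to sequential */
    }
    return merge(threaded_merge_sort(head, 0),
                 threaded_merge_sort(second, 0));
}

static void *sort_worker(void *arg)
{
    struct sort_job *job = arg;
    job->head = threaded_merge_sort(job->head, job->depth);
    return NULL;
}
```

With `depth = 2` this spawns at most three extra threads; going deeper than the number of cores usually just adds overhead.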

4. Originally Posted by oogabooga
A proper merge sort does not go "all the way down". It's usual to use a sort with less overhead to sort the lists when they become "small" (small is a relative term, but maybe 10 or maybe even 100 ... other people here know more about that kind of thing).

Also, it would be nice if you could post some runnable code (complete with timing code).
Well, you can test with this (for the merge sort):

Code:
```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

struct node *head = NULL;

int main(void) {
    struct node *current;
    struct node *next;
    int i;
    struct timeval tbegin, tend;
    double texec = 0.;

    // Start timer
    gettimeofday(&tbegin, NULL);

    /* insert some numbers into the list */
    for (i = 100000; i > 0; i--)
        head = addnode(i, head);

    /* sort the list */
    head = merge_sort(head);

    /* print the list */
    printf(" before  after\n");
    i = 0;
    for (current = head; current != NULL; current = current->next)
        printf("%4d\t%4d\n", i++, current->number);

    /* free the list */
    for (current = head; current != NULL; current = next) {
        next = current->next;
        free(current);
    }

    // End timer
    gettimeofday(&tend, NULL);

    // Compute execution time in seconds
    texec = ((double)(1000 * (tend.tv_sec - tbegin.tv_sec)
            + ((tend.tv_usec - tbegin.tv_usec) / 1000))) / 1000.;

    printf("Elapsed time = %f\n", texec);

    /* done... */
    return 0;
}
```
And for the insertion sort, something like this for example:

Code:
```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

/* uses the global head and insert_node() from the first post */

int main(void) {
    struct node *current = NULL;
    struct node *next = NULL;
    int i = 0;

    struct timeval tbegin, tend;
    double texec = 0.;

    // Start timer
    gettimeofday(&tbegin, NULL);

    /* insert some numbers into the list */
    for (i = 10000; i > 0; i--)
        insert_node(i);

    /* print the list */
    printf(" before  after\n");
    i = 10000;
    for (current = head; current != NULL; current = current->next)
        printf("%4d\t%4d\n", i--, current->number);

    /* free the list */
    for (current = head; current != NULL; current = next) {
        next = current->next;
        free(current);
    }

    // End timer
    gettimeofday(&tend, NULL);

    // Compute execution time in seconds
    texec = ((double)(1000 * (tend.tv_sec - tbegin.tv_sec)
            + ((tend.tv_usec - tbegin.tv_usec) / 1000))) / 1000.;

    printf("Elapsed time = %f\n", texec);

    return 0;
}
```

5. Originally Posted by manasij7479
Well, both has a complexity of O(n log n).
And (afaik), you can't do sorting much better than that, without knowing what is going to be in the list.
But I read that merge sort is supposed to be a better choice for sorting lists.
Maybe some other factors are coming into play here ? (Don't know what..)

> how can i improve time and CPU performances?
Threading merge sort can be a good thing to do. (And since you have a list, you may not worry about some pitfalls like in arrays).
Thanks for the reply, but I don't get the same performance from the two; I was expecting merge sort to be about O(n log n) and much better than insertion sort, which is O(n^2).

How can I use threads in my case? A sample code would be great, thanks.

Regards.

6. Originally Posted by evariste
Thanks for the reply, but I don't get the same performance from the two; I was expecting merge sort to be about O(n log n) and much better than insertion sort, which is O(n^2).
Insertion sort is O(n^2) only when each insertion is linear.
That is generally not the case in lists. (I'm not very sure about this point, especially in this context.)
> How can I use threads in my case? A sample code would be great, thanks.
Well, just like in any divide-and-conquer algorithm: divide, and you get separate pieces of work that may be run at the same time.
I tried it a few days ago. (Though in C++, but you can easily convert the threading code into pthreads or similar.)
Threaded Merge Sort: No performance gain. :(
Especially look at iMalc's code in #9.
(I've still got too much work to do on that front.)

7. Thanks manasij7479, I will try some threading on the merge sort and see if it's better.

8. Insertion sort is very good for data that is almost in sorted order - or has a very small number of objects to be sorted. I don't know if it's the very fastest for a small number of objects, but it certainly is faster than Merge sort or Quicksort, in my tests.

I use it with Quicksort when the sub-arrays are small (70 or less has proven most effective on an i7 system), and partially sorted already (on integers). It gives Quicksort a nice boost. Since it is a stable sort as well, Insertion sort is more effective than it might appear, at first.
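Roughly, the combination looks like this; just a sketch (the cutoff of 70 is what worked on my i7, and the names and the middle-element pivot are mine):

```c
#define QS_CUTOFF 70  /* small-partition threshold; tune for your machine */

/* plain insertion sort on a[lo..hi] */
static void insertion_sort_array(int *a, long lo, long hi)
{
    long i, j;
    for (i = lo + 1; i <= hi; i++) {
        int key = a[i];
        for (j = i; j > lo && a[j - 1] > key; j--)
            a[j] = a[j - 1];    /* shift larger elements right */
        a[j] = key;
    }
}

/* quicksort that hands small partitions to insertion sort */
void hybrid_quicksort(int *a, long lo, long hi)
{
    while (lo < hi) {
        if (hi - lo < QS_CUTOFF) {
            insertion_sort_array(a, lo, hi);    /* small: finish cheaply */
            return;
        }

        /* middle element as pivot, Hoare-style partition */
        int pivot = a[lo + (hi - lo) / 2];
        long i = lo, j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++; j--;
            }
        }

        /* recurse into the smaller side, iterate on the larger */
        if (j - lo < hi - i) {
            hybrid_quicksort(a, lo, j);
            lo = i;
        } else {
            hybrid_quicksort(a, i, hi);
            hi = j;
        }
    }
}
```

Recursing only into the smaller side keeps the stack depth at O(log n) even on bad inputs.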

9. The initial problem I see with the insertion sort test is that it is always inserting a number that is smaller than the head of the list, so your test (if new_node->number < head->number) always passes, and you never iterate over the list. Remember, insertion sort is O(n**2) on average (and worst-case), but O(n) in the best case, which you have.
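You can see this directly by counting comparisons. A small self-contained demo (all names are mine; `descending` mimics your driver, `pseudo_random` exercises the average case):

```c
#include <stdlib.h>

struct node {
    int number;
    struct node *next;
};

static long comparisons; /* key comparisons made while inserting */

/* insert value into the sorted list *head, counting comparisons */
static void sorted_insert(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);
    struct node *cur;

    n->number = value;
    comparisons++;                      /* the head check counts as one */
    if (*head == NULL || value < (*head)->number) {
        n->next = *head;
        *head = n;
        return;
    }
    cur = *head;
    while (cur->next != NULL) {
        comparisons++;
        if (value < cur->next->number)
            break;
        cur = cur->next;
    }
    n->next = cur->next;
    cur->next = n;
}

static void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}

/* insert n values produced by gen(i); return total comparisons used */
long count_comparisons(int n, int (*gen)(int))
{
    struct node *head = NULL;
    int i;

    comparisons = 0;
    for (i = 0; i < n; i++)
        sorted_insert(&head, gen(i));
    free_list(head);
    return comparisons;
}

/* best case (your driver): each value goes straight to the head */
int descending(int i) { return 1000000 - i; }

/* average case: scrambled values force a walk down the list */
int pseudo_random(int i) { return (int)(((unsigned)i * 2654435761u) % 1000000u); }
```

The descending driver does exactly one comparison per insert; the scrambled one walks half the list on average, which is where the O(n**2) behavior comes from.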

10. Thanks Adak, so you think that quicksort may be a better choice in my case?

11. Thanks cas, you're right; I will try a better example that tests the worst case of insertion sort.

12. After a quick perusal, I see that your merge sort is improperly implemented. Check both head_one and head_two in merge_sort(), after you split the list. Each should contain half of the data. Instead, head_one is almost the entire list.

13. Originally Posted by evariste
Thanks Adak, so you think that quicksort may be a better choice in my case?
No! Quicksort is lousy for lists. I just mentioned quicksort because it was the algorithm I had tested with insertion sort, and found that insertion sort significantly improved it. I believe that insertion sort would also be useful for your list sorting, depending on the data and the merge sort algorithm you use.

The first thing though, is to get your Merge sort code, up to snuff. (see above post).

14. Thanks Adak and cas, I fixed the problem with my merge sort algorithm; the new version is below:

Merge sort algorithm:

Code:
```c
/* insert a node */
struct node *insert_node(int number, struct node *next) {
    struct node *new_node;

    new_node = (struct node *)malloc(sizeof(*new_node));

    if (new_node != NULL) {
        new_node->number = number;
        new_node->next = next;
    }

    return new_node;
}

/* sorting the list */
void merge_sort(struct node **head)
{
    struct node *first_half;
    struct node *second_half;

    if (*head == NULL || (*head)->next == NULL)
    {
        return;
    }

    /* divide list into two sublists */
    list_divide(*head, &first_half, &second_half);

    /* Recursively sort */
    merge_sort(&first_half);
    merge_sort(&second_half);

    /* Merge our lists to get one sorted list */
    *head = merge_sorted_lists(first_half, second_half);
}

/* Merge our lists to get one sorted list */
struct node *merge_sorted_lists(struct node *first_half, struct node *second_half)
{
    struct node *result = NULL;

    if (first_half == NULL)
        return second_half;
    else if (second_half == NULL)
        return first_half;

    if (first_half->number <= second_half->number)
    {
        result = first_half;
        result->next = merge_sorted_lists(first_half->next, second_half);
    }
    else
    {
        result = second_half;
        result->next = merge_sorted_lists(first_half, second_half->next);
    }
    return result;
}

/* Divide the list into two halves (slow/fast pointer walk) */
void list_divide(struct node *original_list,
                 struct node **first_half, struct node **second_half)
{
    struct node *temp_one;
    struct node *temp_two;

    if (original_list == NULL || original_list->next == NULL)
    {
        *first_half = original_list;
        *second_half = NULL;
    }
    else
    {
        temp_two = original_list;          /* slow pointer */
        temp_one = original_list->next;    /* fast pointer */

        while (temp_one != NULL)
        {
            temp_one = temp_one->next;
            if (temp_one != NULL)
            {
                temp_two = temp_two->next;
                temp_one = temp_one->next;
            }
        }

        *first_half = original_list;
        *second_half = temp_two->next;
        temp_two->next = NULL;
    }
}
```
And now I can see that it's very fast; in my tests I got the same execution time as with insertion sort. Any optimizations are welcome.

Regards.

15. Merge sort is certainly very good for lists, better than quicksort really, since quicksort can tend towards O(n^2).
However, the killer is bucket sort. Merge sort is a long way ahead of most other algorithms for linked-list sorting, and then bucket sort is light-years ahead of that.
The catch, of course, is that your key needs to support being split into buckets as well as being directly comparable. In your case you are using ints, for which that is fine.
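For the curious, here is roughly what bucketing integer keys might look like on this list. Just a sketch with all names my own: one distribution pass by value range, then insertion sort within each bucket. It assumes the keys are reasonably uniform; heavily skewed data piles everything into a few buckets and degrades toward plain insertion sort:

```c
#include <stdlib.h>

struct node {
    int number;
    struct node *next;
};

/* insertion sort a (short) list; returns the new head */
static struct node *insertion_sort(struct node *head)
{
    struct node *sorted = NULL;
    while (head != NULL) {
        struct node *cur = head, **pp = &sorted;
        head = head->next;
        while (*pp != NULL && (*pp)->number <= cur->number)
            pp = &(*pp)->next;          /* walk to the insertion point */
        cur->next = *pp;
        *pp = cur;
    }
    return sorted;
}

/* bucket sort: one distribution pass by value range, then sort each bucket */
struct node *bucket_sort(struct node *head, int nbuckets)
{
    struct node **bucket, *p, *result, **tail;
    long long min, max, range;
    int i;

    if (head == NULL || head->next == NULL)
        return head;

    /* find the key range */
    min = max = head->number;
    for (p = head->next; p != NULL; p = p->next) {
        if (p->number < min) min = p->number;
        if (p->number > max) max = p->number;
    }
    range = max - min + 1;

    bucket = calloc((size_t)nbuckets, sizeof *bucket);
    if (bucket == NULL)
        return insertion_sort(head);   /* out of memory: fall back */

    /* distribute nodes into buckets by value */
    while (head != NULL) {
        struct node *next = head->next;
        i = (int)((head->number - min) * nbuckets / range);
        head->next = bucket[i];
        bucket[i] = head;
        head = next;
    }

    /* sort each bucket and concatenate them in order */
    result = NULL;
    tail = &result;
    for (i = 0; i < nbuckets; i++) {
        *tail = insertion_sort(bucket[i]);
        while (*tail != NULL)
            tail = &(*tail)->next;
    }
    free(bucket);
    return result;
}
```

For 200000 ints you would want far more than a handful of buckets (so each bucket stays short); the right count is something to measure.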