# Sorting list performances

Printable View

Show 80 post(s) from this thread on one page
Page 1 of 2 12 Last
• 02-18-2012
evariste
Sorting list performances
Hi all,

I have a list that contain about (200000 elements) and i want to keep them in a sorting order, so i do some testing with insertion sort and merge sort but it seems that insertion sort is better!! is it normal? how can i improve time and CPU performances? Thanks for your help.

My node (for simplicity, i omitted the prev node in the node struct):

Code:

```struct node {     int number;     struct node *next; };```
Insertion sort :

Code:

```void insert_node(int value) {     struct node *new_node = NULL;     struct node *cur_node = NULL;     struct node *last_node = NULL;     int found; /* 1 means found a place to insert the new node in, 0 means not*/     new_node = (struct node *)malloc(sizeof(struct node *));     if(new_node == NULL) {         printf("memory problem\n");     }     new_node->number = value;     /* If the first element */     if (head == NULL) {         new_node->next = NULL;         head = new_node;     }     else if (new_node->number < head->number) {         new_node->next = head;         head = new_node;        }     else {         cur_node = head;         found = 0;         while (( cur_node != NULL ) && ( found == 0 )) {             if( new_node->number < cur_node->number )             {                 found = 1;             }             else             {                 last_node = cur_node;                 cur_node = cur_node->next;             }         }     /* We got the right place to insert our node */     if( found == 1 )     {         new_node->next = cur_node;     }     /* Insert at the tail of the list */     else     {         last_node->next = new_node;         new_node->next = NULL;     }          }```
Merge Sort :

Code:

```/* add a node to the linked list */ struct node *addnode(int number, struct node *next) {     struct node *tnode;     tnode = (struct node*)malloc(sizeof(*tnode));     if(tnode != NULL) {         tnode->number = number;         tnode->next = next;     }     return tnode; } /* preform merge sort on the linked list */ struct node *merge_sort(struct node *head) {     struct node *head_one;     struct node *head_two;     if((head == NULL) || (head->next == NULL))         return head;     head_one = head;     head_two = head->next;     while((head_two != NULL) && (head_two->next != NULL)) {         head = head->next;         head_two = head->next->next;     }     head_two = head->next;     head->next = NULL;     return merge(merge_sort(head_one), merge_sort(head_two)); } /* merge the lists.. */ struct node *merge(struct node *head_one, struct node *head_two) {     struct node *head_three;     if(head_one == NULL)         return head_two;     if(head_two == NULL)         return head_one;     if(head_one->number < head_two->number) {         head_three = head_one;         head_three->next = merge(head_one->next, head_two);     } else {         head_three = head_two;         head_three->next = merge(head_one, head_two->next);     }     return head_three; }```
• 02-18-2012
oogabooga
A proper merge sort does not go "all the way down". It's usual to use a sort with less overhead to sort the lists when they become "small" (small is a relative term, but maybe 10 or maybe even 100 ... other people here know more about that kind of thing).

Also, it would be nice if you could post some runnable code (complete with timing code).
• 02-18-2012
manasij7479
Well, both has a complexity of O(n log n).
And (afaik), you can't do sorting much better than that, without knowing what is going to be in the list.
But I read that merge sort is supposed to be a better choice for sorting lists.
Maybe some other factors are coming into play here ? (Don't know what..)

> how can i improve time and CPU performances?
Threading merge sort can be a good thing to do. (And since you have a list, you may not worry about some pitfalls like in arrays).
• 02-18-2012
evariste
Quote:

Originally Posted by oogabooga
A proper merge sort does not go "all the way down". It's usual to use a sort with less overhead to sort the lists when they become "small" (small is a relative term, but maybe 10 or maybe even 100 ... other people here know more about that kind of thing).

Also, it would be nice if you could post some runnable code (complete with timing code).

Well you can test with this (for the merge sort) :

Code:

```int main(void) {         struct node *head;         struct node *current;         struct node *next;         int i;         struct timeval tbegin,tend;         double texec=0.;           // Start timer         gettimeofday(&tbegin,NULL);           head = NULL;         /* insert some numbers into the list */         for (i = 100000; i > 0; i--)                 head = addnode(i, head);           /* sort the list */         head = merge_sort(head);           /* print the list */         printf(" before after\n"), i = 0;         for(current = head; current != NULL; current = current->next)                 printf("%4d\t%4d\n", i++, current->number);           /* free the list */         for(current = head; current != NULL; current = next)                 next = current->next, free(current);         // End timer         gettimeofday(&tend,NULL);         // Compute execution time       texec=((double)(1000*(tend.tv_sec-tbegin.tv_sec)+((tend.tv_usec-tbegin.tv_usec)/1000)))/1000.;           printf("Elapsed time = %f\n", texec);           /* done... */         return 0; }```
And for the insert sort something like this for example :

struct node *head = NULL;

Code:

```int main(void) {         struct node *current = NULL;         struct node *next = NULL;         int i = 0;         struct timeval tbegin,tend;         double texec=0.;           // Start timer         gettimeofday(&tbegin,NULL);         /* insert some numbers into the list */         for(i = 10000; i > 0; i--)                 insert_node(i);         /* print the list */         printf(" before  after\n"), i = 10000;         while(head->next != NULL) {                   printf("%4d\t%4d\n", i--, head->number);                   head = head->next;         }         /* free the list */         for(current = head; current != NULL; current = next)                   next = current->next, free(current);         // End timer         gettimeofday(&tend,NULL);           // Compute execution time         texec=((double)(1000*(tend.tv_sec-tbegin.tv_sec)+((tend.tv_usec-tbegin.tv_usec)/1000)))/1000.;           printf("Elapsed time = %f\n", texec);         return 0; }```
• 02-18-2012
evariste
Quote:

Originally Posted by manasij7479
Well, both has a complexity of O(n log n).
And (afaik), you can't do sorting much better than that, without knowing what is going to be in the list.
But I read that merge sort is supposed to be a better choice for sorting lists.
Maybe some other factors are coming into play here ? (Don't know what..)

> how can i improve time and CPU performances?
Threading merge sort can be a good thing to do. (And since you have a list, you may not worry about some pitfalls like in arrays).

Thanks for the reply but i don't get the same performances with the two and i was expecting that merge sort is about o(n * log(n)) and is much better than insert sort which is o(n2).

How can i use threads in my case, a sample code will be great thanks :).

Regards.
• 02-18-2012
manasij7479
Quote:

Originally Posted by evariste
Thanks for the reply but i don't get the same performances with the two and i was expecting that merge sort is about o(n * log(n)) and is much better than insert sort which is o(n2).

Insertion sort is O(n^2) only when each insertion is linear.
That is generally not the case in lists. (I'm not very sure about this point, especially in this context)
Quote:

How can i use threads in my case, a sample code will be great thanks :).
Well, just like in any divide and conquer algorithm, divide and you get different works that may be run at the same time.
I tried it a few days ago. (Though in C++, but you can easily convert the threading code into pthreads or similar)
http://cboard.cprogramming.com/cplus...ance-gain.html
Especially look at iMalc's code in #9 .
(I've still got to much work to do on that front )
• 02-18-2012
evariste
Thanks manasij7479, i will try to do somme threading on merge sort and see if it's better.
• 02-18-2012
Adak
Insertion sort is very good for data that is almost in sorted order - or has a very small number of objects to be sorted. I don't know if it's the very fastest for a small number of objects, but it certainly is faster than Merge sort or Quicksort, in my tests.

I use it with Quicksort when the sub-arrays are small (70 or less has proven most effective on an i7 system), and partially sorted already (on integers). It gives Quicksort a nice boost. Since it is a stable sort as well, Insertion sort is more effective than it might appear, at first.

Malcolm will certainly want to reply to your post! XD
• 02-18-2012
cas
The initial problem I see with the insertion sort test is that it is always inserting a number that is smaller than the head of the list, so your test (if new_node->number < head->number) always passes, and you never iterate over the list. Remember, insertion sort is O(n**2) on average (and worst-case), but O(n) in the best case, which you have.
• 02-18-2012
evariste
Thanks Adak, so you think that quicksort maybe a better choice in my case?
• 02-18-2012
evariste
Thanks cas, you're right i will try a better example for testing the worst case in insertion sort.
• 02-18-2012
cas
After a quick perusal, I see that your merge sort is improperly implemented. Check both head_one and head_two in merge_sort(), after you split the list. Each should contain half of the data. Instead, head_one is almost the entire list.
• 02-18-2012
Adak
Quote:

Originally Posted by evariste
Thanks Adak, so you think that quicksort maybe a better choice in my case?

No! Quicksort is lousy for lists. Just mentioned Quicksort, because it was the algorithm that I had tested with Insertion sort, and found it to significantly improve it. I believe that Insertion sort would also be useful to your list sorting, depending on the data and merge sort algorithm you use.

The first thing though, is to get your Merge sort code, up to snuff. (see above post).
• 02-19-2012
evariste
Thanks Adak, cas, i fixed the problem with my merge sort algorithm, the new one is below :

Merge sort algorithm:

Code:

```/* insert a node */ struct node *insert_node(int number, struct node *next) {         struct node *new_node;           new_node = (struct node*)malloc(sizeof(*new_node));           if(new_node != NULL) {                 new_node->number = number;                 new_node->next = next;         }           return new_node; }   /* sorting the list */ void merge_sort(struct node** list_head) {   struct node* head = *list_head;   struct node* first_half;   struct node* second_half;     if ((head == NULL) || (head->next == NULL))   {     return;   }     /* divide list into two sublists */   list_divide(head, &first_half, &second_half);     /* Recursively sort */   merge_sort(&first_half);   merge_sort(&second_half);     *list_head = merge_sorted_lists(first_half, second_half); } /* Merge our lists to get one sorted list */ struct node* merge_sorted_lists(struct node* first_half, struct node* second_half) {   struct node* result = NULL;     if (first_half == NULL)     return(second_half);   else if (second_half==NULL)     return(first_half);     if (first_half->number <= second_half->number)   {     result = first_half;     result->next = merge_sorted_lists(first_half->next, second_half);   }   else   {     result = second_half;     result->next = merge_sorted_lists(first_half, second_half->next);   }   return(result); }   /* Divide the list into two lists */ void list_divide(struct node* original_list,           struct node** first_half, struct node** second_half) {   struct node* temp_one;   struct node* temp_two;     if (original_list ==NULL || original_list->next==NULL)   {     *first_half = original_list;     *second_half = NULL;   }   else   {     temp_two = original_list;     temp_one = original_list->next;       while (temp_one != NULL)     {       temp_one = temp_one->next;       if (temp_one != NULL)       {         temp_two = temp_two->next;         temp_one = temp_one->next;       }     }       *first_half = original_list;     *second_half = temp_two->next;     temp_two->next = NULL;   } }```
And now i can see that it's very fast and in my tests i got the same time of execution as with insertion sort. Any optimizations are welcome :).

Regards.
• 02-20-2012
iMalc
MergeSort is certainly very good for lists, better than quicksort really since that cant tend towards O(n^2).
However the killer is BucketSort. Merge Sort is a long way ahead of most other algorithms for linked-list sorting and then BucketSort is light-years ahead of that.
The catch of course is that your key needs to support being split into buckets, as well as just being directly comparable. In your case you are using ints for which it is fine.
Show 80 post(s) from this thread on one page
Page 1 of 2 12 Last