![]() |
| | #16 | |
| C++ Witch Join Date: Oct 2003 Location: Singapore
Posts: 11,358
| Quote:
__________________ C + C++ Compiler: MinGW port of GCC Build + Version Control System: SCons + Bazaar Look up a C/C++ Reference and learn How To Ask Questions The Smart Way | |
| laserlight is online now | |
| | #17 | ||||||||
| Algorithm Dissector Join Date: Dec 2005 Location: New Zealand
Posts: 2,742
| Quote:
Quote:
Quote:
Quote:
You'll find that it isn't really possible to learn such a huge amount about a various subject without ending up using the same terms commonly used, even though you're at the same time putting it in your own words. I'm sorry, I'm not so good with metaphors or quotes sometimes. Quote:
Quote:
Again, I'm not familiar with the metaphor. Nevermind, this time it's my turn to look something up! Quote:
. It's kind of hard to prove you came up with something independently of someone else when the facts lead you both down the same path though. I guess I should be flattered that any explanations I've given sound like something someone else once wrote. I didn't think I was that good at explaining stuff.Quote:
But I digress. Can we be of any more assistance with the original topic?
__________________ My homepage Advice: Take only as directed - If symptoms persist, please see your debugger Last edited by iMalc; 01-20-2009 at 03:16 AM. | ||||||||
| iMalc is offline | |
| | #18 | |
| critical genius Join Date: Jul 2008 Location: SE Queens
Posts: 5,212
| Quote:
Code: typedef struct _node { /* singly linked list */
char first[64];
char second[64];
int ID;
struct _node *next;
}node;
*there are lots of these kinds of (usually alphabetized) text lists of names and/or words around. But the real purpose of my post now is to shed some further light on the significance of the big O notation and the issues it conceals. Of course, deciding how to count the iterations was not as clear cut for all the sort styles. But here's some sample output: Code: [root~/C] ./a.out -h Usage: ./a.out [-h] [-f -l] [-1] [-2] [-d] -d either 1 or 2: copious output to stderr -h print this message -f filename -l number of records in list, needed with -f -1 primary sort criteria (1=first name, 2=last name[default], 3=number) -2 secondary sort criteria (0=none[default], 1=first name, etc.) Eg. "./a.out -f data.txt -l 27 -1 2 -2 1" Records should have the structure "Some Name:0101". [root~/C] ./a.out Using "", 0 records Primary Criteria offset=64 Command (h for help, Q to quit): h d display list h show this message b (modified) bubblesort i insertion sort m mergesort q quicksort s selection sort S shuffle N new random list R reload from file W write to file Command (h for help, Q to quit): N How many records? 10 New list is 1.41 kb Command (h for help, Q to quit): d 0x1a90010 nonglare,James (1) 0x1a900b0 stroys,James (420) 0x1a90150 enslave,Jim (1024) 0x1a901f0 mirex,Tammy (666) 0x1a90290 peartly,Lorenzo (420) 0x1a90330 soymilks,Bob (1024) 0x1a903d0 weeks,Robert (0) 0x1a90470 addrest,Tammy (69) 0x1a90510 starnose,Leslie (27) 0x1a905b0 acerous,Paul (1024) Command (h for help, Q to quit): N How many records? 10000 New list is 1406.25 kb Command (h for help, Q to quit): b Bubblesorting....... 97760000 Iterations Comparison calls: 97750224 Elapsed time: 0 min 31 sec Command (h for help, Q to quit): S Shuffled. Command (h for help, Q to quit): s Selectionsorting.... 99980002 Iterations Comparison calls: 49995000 Elapsed time: 0 min 19 sec Command (h for help, Q to quit): S Shuffled. Command (h for help, Q to quit): i Insertionsorting....... 24883640 Iterations Comparison calls: 24883640 Elapsed time: 0 min 6 sec So the bubble sort and the selection sort do about what is expected vis. "iterations", (10,000 squared), the insertion sort comes out quite a bit better. Iterations/comparisons remain at a constant ratio for the selection sort, whereas because the bubblesort is "modified" (using a flag as per my discussion with tabstop above), the bubblesort and the insertion sort have a best case of n or n-1 for both iterations and comparisons. The default is a string comparison, which is much more processor intensive than a numerical comparison (eg, selection sort will do 10,000 records by ID number in 8 seconds). The difference between the number of comparisons is what accounts for the speed difference above; while both bubble and selection sort did almost n^2 iterations, selection sort did half as many comparisons. It should be noted that while the iteration count may be subject to criticism, the comparison count is absolute as it is incremented every time a discrete comparison function is called. This discrepancy (that an iteration need not be a comparison) is not referred to in many discussions of sorting efficiency and is not accounted for with big O. A third measure is available with the number of recursive function calls in merge sort and quick sort. Because the sorts are similarly recursive, these counts are always the same: Code: Command (h for help, Q to quit): N How many records? 2097152 New list is 294912.00 kb Command (h for help, Q to quit): m Mergesorting..............2097151 function calls 61865985 Iterations Comparison calls: 41388898 Elapsed time: 0 min 15 sec Command (h for help, Q to quit): S Shuffled. Command (h for help, Q to quit): q Quicksorting..............2097151 function calls 148857828 Iterations Comparison calls: 75517673 Elapsed time: 0 min 32 sec Code: /* set primary criteria to "3" (ID number, of which there are only ten possible) and secondary to first name (of which there are 59 possible). So there will be lots and lots of ties to "resolve". Other cases in this demo use a single criteria (the last name) of which there are 80638 unique possibilities: */ [root~/C] ./a.out -1 3 -2 1 Using "", 0 records Primary Criteria offset=128 Secondary Criteria offset=0 Command (h for help, Q to quit): N How many records? 100000 New list is 14062.50 kb Command (h for help, Q to quit): m Mergesorting..............99999 function calls 2315025 Iterations Comparison calls: 2748000 Elapsed time: 0 min 0 sec Command (h for help, Q to quit): S Shuffled. Command (h for help, Q to quit): q Quicksorting..............99999 function calls 20542647 Iterations Comparison calls: 20123133 Elapsed time: 0 min 2 sec Command (h for help, Q to quit): N How many records? 1000000 New list is 140625.00 kb Command (h for help, Q to quit): m Mergesorting..............999999 function calls 27884993 Iterations Comparison calls: 34081612 Elapsed time: 0 min 6 sec Command (h for help, Q to quit): S Shuffled. Command (h for help, Q to quit): q Quicksorting..............999999 function calls 1836821111 Iterations Comparison calls: 1832407830 Elapsed time: 4 min 4 sec Notice the number of recursions is still the same -- an imbalanced list does not increase the number of recursions, since the short half simply goes through fewer. But the number of iterations and comparisons with quicksort can change radically: Code: Command (h for help, Q to quit): N How many records? 100 New list is 14.06 kb Command (h for help, Q to quit): q Quicksorting..............99 function calls 1256 Iterations Comparison calls: 694 Elapsed time: 0 min 0 sec Command (h for help, Q to quit): q Quicksorting..............99 function calls 9800 Iterations Comparison calls: 4950 Elapsed time: 0 min 0 sec Now, apparently merge sort will perform the same in either case. This is true if we just measure iterations, but if we take more reality into account, the number of comparisons can, will, and must change. This is because when two split lists come back to be merged, if one list already contains all smaller values, all of the values on the left will be compared one at a time to the first element of the right, and then (since the returned lists have an internal order) no further comparisons need be done and the right is simply iterated onto the end. That is the normative and perhaps only way to implement a merge sort: Code: /* the end (merge) of the sort function */
while ((left) && (right) && (lenL>LC) && (lenR>RC)) {
Iterations++;
result=compfunc(offset,left,right);
if ((result==0) && (altfunc)) result=altfunc(altset,left,right);
if (result==2) { ptr->next=right; ptr=right; right=right->next; RC++; }
else { ptr->next=left; ptr=left; left=left->next; LC++; }
} /* if one list is longer or contains more high values, leaving a final remainder */
while ((left) && (lenL>LC)) { ptr->next=left; ptr=left; left=left->next; LC++; Iterations++; }
while ((right) && (lenR>RC)) { ptr->next=right; ptr=right; right=right->next; RC++; Iterations++; }
ptr->next=NULL; /* the new tail */
Code: Command (h for help, Q to quit): N How many records? 100 New list is 14.06 kb Command (h for help, Q to quit): m Mergesorting..............99 function calls 817 Iterations Comparison calls: 545 Elapsed time: 0 min 0 sec Command (h for help, Q to quit): m Mergesorting..............99 function calls 817 Iterations Comparison calls: 316 Elapsed time: 0 min 0 sec To give quicksort one last chance, I decided to add a function which could select a better pivot, despite my misgivings about the additional iterations and comparisons required. So now, the quicksort compares the head with an element about half way thru, and if these are identical, goes thru the whole list until it finds one that isn't. Then we take a mean of the two different values. This helps to ensure that that we will actually have two lists. This clearly helped to divert quicksort (which it should really be contrasted with mergesort as "splitsort", but anyway...) away from it's worst case, the already sorted list: Code: Command (h for help, Q to quit): N How many records? 100 New list is 14.06 kb Command (h for help, Q to quit): q Quicksorting..............99 function calls 1443 Iterations Comparison calls: 882 Elapsed time: 0 min 0 sec Command (h for help, Q to quit): q Quicksorting..............99 function calls 1127 Iterations Comparison calls: 754 Elapsed time: 0 min 0 sec Then I modified the newlist() function to assign random ID's just from rand(), rather than a fixed set of 10. That means billions of possibilities on a 64-bit so I could now compare the sorts on scrambled sets of millions of unique numbers: Code: [root~/C] ./a.out -1 3 Using "", 0 records Primary Criteria offset=128 Command (h for help, Q to quit): N How many records? 4000000 New list is 562500.00 kb Command (h for help, Q to quit): m Mergesorting..............3999999 function calls 123539969 Iterations Comparison calls: 82696100 Elapsed time: 0 min 9 sec Command (h for help, Q to quit): S Shuffled. Command (h for help, Q to quit): q Quicksorting..............4000000 function calls 190179315 Iterations Comparison calls: 100817020 Elapsed time: 0 min 23 sec A final interesting point: both the quicksort and the insertion sort demonstrated the variation inherent in their design; with insertion sort this variation is toward a "best case" (improving on n^2), whereas with quicksort this is toward a "worst case" (away from n*log(n)). A quick sort cannot really do better than a merge sort although the implementation of a quicksort may be easier to tweak closer to the best case, but in any case would be better known as the "split sort" or "quack sort". I put up a nice syntax-colored version of the code which produced this nonsense, in case someone wants to argue some silly point: MK27's linked list comparative sort demo program! I believe this is fairly portable. It could almost be used as a crude API, since the sort functions use external comparison functions with offsets to criteria (meaning you could plug any kind of struct into it without changes). Last edited by MK27; 01-26-2009 at 06:30 PM. | |
| MK27 is online now | |
| | #19 |
| and the Hat of Guessing Join Date: Nov 2007
Posts: 8,854
| Bubblesort is n-1 iterations, each of which makes 1, 2, 3, ..., n comparisons. So the comparison count is n^2, not the iteration count. (Again, big O does not concern itself with constants or non-important terms. 4n^2 is O(n^2), n^2+n is O(n^2), etc.) |
| tabstop is online now | |
| | #20 |
| critical genius Join Date: Jul 2008 Location: SE Queens
Posts: 5,212
| With the bubblesort the ratio of iterations to comparisons is exact (in the demo, comparisons is invariably within 0.5% of iterations). Big O count cannot be based on comparison count and must be based on iteration count, as I pointed out with reference to merge sort, or else merge sort cannot have a constant best/average/worst rating. |
| MK27 is online now | |
| | #21 |
| Registered User Join Date: Sep 2006
Posts: 3,152
| I'm not sure why you belabored yourself with testing the Quicksort algorithm, for linked lists, since it is not designed for any data that must be retrieved (relatively), slowly. Whether it's a linked list, a tape drive, or data that must be accessed over a (relatively slow) network, Quicksort is simply not the algorithm of choice. Mergesort is designed for that purpose. Put your data into an array or static disk drive, give Quicksort some smart tweaks (and watch out because it is "fragile"), and *then* you can watch Quicksort *fly*. |
| Adak is offline | |
| | #22 | |
| critical genius Join Date: Jul 2008 Location: SE Queens
Posts: 5,212
| Quote:
Don't be silly Adak. The weaknesses I concerned myself with vis. "quicksort" are inherit to the algorithm and independent of the implementation. | |
| MK27 is online now | |
| | #23 | ||||
| C++ Witch Join Date: Oct 2003 Location: Singapore
Posts: 11,358
| Quote:
Quote:
Quote:
Quote:
__________________ C + C++ Compiler: MinGW port of GCC Build + Version Control System: SCons + Bazaar Look up a C/C++ Reference and learn How To Ask Questions The Smart Way | ||||
| laserlight is online now | |
| | #24 | ||||||||
| Algorithm Dissector Join Date: Dec 2005 Location: New Zealand
Posts: 2,742
| Fisher-Yates huh? I had to lok that one up. It turns out that I'm using Durstenfeld's version of the algorithm, which is the fast, modern way of implementing that algorithm. But as I've tried to show previously with other stuff I've posted about, I wasn't told about this algorithm, or been asked to implement it. I simply needed to shuffle an array and I wrote code that would do it in such a way that to me would obviously produce no bias. Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Incidentally I've recently become of the opinion that Splay-Tree-Sort is in many ways the ultimate doubly-linked-list sorting algorithm. It's just so amazingly adaptive, and the list links can be reused as tree child pointer links during the sorting. Any initial sort order you can throw at it, other than random order, gives astoundingly low comparison counts. If this interests you, I highly recommend trying it out. Quote:
Mine is implemented in C++ rather than C and this means I don't have to make any modifications whatsoever to the algorithms to animate then and count comparisons etc. The same code you would use in a real program is the exact code I use in my demo app. I recommend you download it and check it out as I'm sure you'll enjoy it. It's linked at the bottom of many of my pages. Last edited by iMalc; 01-27-2009 at 01:08 AM. | ||||||||
| iMalc is offline | |
| | #25 |
| Banned Join Date: Aug 2009
Posts: 43
| you just pick on pointer store its value and search in the rest of the list for the minimal value and swap the values then go next to the next node |
| kakaroto is offline | |
![]() |
| Thread Tools | |
| Display Modes | |
|
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| seg fault for linked list in another linked list--help!! | idshiy | C Programming | 5 | 11-04-2006 06:39 PM |
| Reverse function for linked list | Brigs76 | C++ Programming | 1 | 10-25-2006 10:01 AM |
| sort linked list using BST | Micko | C Programming | 8 | 10-04-2004 02:04 PM |
| Sort Linked List Algorithm. | xddxogm3 | C++ Programming | 7 | 10-13-2003 11:05 PM |
| Template Class for Linked List | pecymanski | C++ Programming | 2 | 12-04-2001 09:07 PM |