Thread: Linked List Sorting

  1. #16
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Sure his question is unclear, but I have clearly stated that I took his question to be about the difference between taking n items and inserting each one one at a time into a list in order, vs appending all items into an unsorted list and then sorting it.
    Inserting in order as you go is O(n*n) because each of the n inserts will at worst require comparing against every item that has already been inserted. It's one of those sums 1 + 2 + 3 + 4 + 5 which comes out to n(n+1)/2 IIRC and so is N-squared, no doubt about it. Yes each insert is O(n), but in order to compare it to the sorting case, you have to consider all the big-Oh notation of all of the n inserts.

    As for sorting the array afterwards, nothing can sort in faster than O(n), so the O(n) inserts into the list to begin with will be irrelevant in terms of Big-Oh notation. Therefore the only thing relevant there is the Big-Oh notation of sorting a list. Merge sort does this in O(n*logn), in fact that's both best and worst case. Bucket sort would likely be even better.

    As for his last question, there is no correct answer due to lack of information. He hasn't defined what "efficient" means for starters. Memory efficient, time efficient? Some function of the two?
    Last edited by iMalc; 05-18-2013 at 04:25 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  2. #17
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    You also need to know N.
    O_o

    No. You don't.

    The situation plays out exactly as I've, twice, explained; as before, you are ignoring very real conditions and possibilities.

    *shrug*

    I just don't care about what reality you choose to ignore.

    The key issue is we don't know what the original poster needs, so all of this is speculation.
    Indeed, and that's what I've been saying from post #4.

    As for sorting the array afterwards, nothing can sort in faster than O(n), so the O(n) inserts into the list to begin with will be irrelevant in terms of Big-Oh notation. Therefore the only thing relevant there is the Big-Oh notation of sorting a list. Merge sort does this in O(n*logn), in fact that's both best and worst case. Bucket sort would likely be even better.
    I was under the impression you had been talking about keeping the list sorted, inserting into the correct position or sorting after each insert; that's a very different best.

    Soma

  3. #18
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by iMalc View Post
    taking n items and inserting each one one at a time
    There's a third sequence, receiving the numbers to be sorted one at a time, but not having to use the sorted data until all numbers have been received. In this case a merge sort that uses an array of lists could be used (this is how hp / microsoft implements linked list sort). Again, we still don't know what the original poster's goal is.

  4. #19
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by rcgldr View Post
    There's a third sequence, receiving the numbers to be sorted one at a time, but not having to use the sorted data until all numbers have been received.
    You're describing the same case as me, so if you consider that a third option, I don't know what you consider the second one is.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  5. #20
    Registered User
    Join Date
    May 2012
    Posts
    505
    Quote Originally Posted by phantomotap View Post
    O_o

    No. You don't.

    The situation plays out exactly as I've, twice, explained; as before, you are ignoring very real conditions and possibilities.
    We do need to know N. Let's say we've got two algorithms, one the best common case, O(log N), and one the worst common case, O(2^N). Which one to choose?
    It depends on N. Is N is 100, then there's no way the O(2^N) algorithm is going to be viable. But what if N ranges from 0-8?
    log 8 is 3, 2^8 is 256. So the O(log N) algorithm will be about 100 times faster, in terms of operations. But what is the constant? It could well be about 100, if the O(2^N) operation is simply a memory write, the O(log N) operation involves allocating memory and freeing it - that will take 100 machine instructions or so.

    Then you've also got to take in account ease of implementation. If you can write the O(2^N) method in five minutes, whilst the O(log N) method will take the whole day to get right and debug (this is quite common, generally high O algorithms are straightforwards), then you've got to save a whole day of computer time to repay the investment. If O(N^2) takes a tenth of a second, O(log N) a tenth of millisecond, the function has to be run 24*60*60*10 times, or 864,000 times, before you're recouped the development time. So what do you want the function for? Is it in a time critical part of the code, is it run just once on program startup or in an inner loop?
    I'm the author of MiniBasic: How to write a script interpreter and Basic Algorithms
    Visit my website for lots of associated C programming resources.
    https://github.com/MalcolmMcLean


  6. #21
    Registered User
    Join Date
    May 2003
    Posts
    1,619
    Quote Originally Posted by Malcolm McLean View Post
    We do need to know N. Let's say we've got two algorithms, one the best common case, O(log N), and one the worst common case, O(2^N). Which one to choose?
    It depends on N. Is N is 100, then there's no way the O(2^N) algorithm is going to be viable. But what if N ranges from 0-8?
    log 8 is 3, 2^8 is 256. So the O(log N) algorithm will be about 100 times faster, in terms of operations. But what is the constant? It could well be about 100, if the O(2^N) operation is simply a memory write, the O(log N) operation involves allocating memory and freeing it - that will take 100 machine instructions or so.

    Then you've also got to take in account ease of implementation. If you can write the O(2^N) method in five minutes, whilst the O(log N) method will take the whole day to get right and debug (this is quite common, generally high O algorithms are straightforwards), then you've got to save a whole day of computer time to repay the investment. If O(N^2) takes a tenth of a second, O(log N) a tenth of millisecond, the function has to be run 24*60*60*10 times, or 864,000 times, before you're recouped the development time. So what do you want the function for? Is it in a time critical part of the code, is it run just once on program startup or in an inner loop?
    There are other examples, too. For example, one common problem I come across at work has two solutions - one an O(N) operation, the other an O(1). However, the O(1) has double the expected disk I/O of the O(N) operation, so for small N, the O(N) is actually more efficient even though it uses more CPU time and more memory reads, because the memory reads are much faster than the disk reads. Above a critical value of N, the O(1) operation becomes the most efficient. When making a choice between the two approaches, it's important to consider what values of N are expected.
    You ever try a pink golf ball, Wally? Why, the wind shear on a pink ball alone can take the head clean off a 90 pound midget at 300 yards.

  7. #22
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by rcgldr View Post
    There's a third sequence, receiving the numbers to be sorted one at a time, but not having to use the sorted data until all numbers have been received.
    Quote Originally Posted by iMalc View Post
    You're describing the same case as me, so if you consider that a third option, I don't know what you consider the second one is.
    What I thought was the second option was receiving the numbers to be sorted one at a time, but also receiving numbers to search for in the current state of the sorted list, so there's a search requirement in addition to the sort requirement while the numbers are being received. (For example the data to sort on could be the "key" portion of some type of database like record).

    Again, we still don't know what the original poster's goal is, and has he ever posted again in this thread? Is anything getting accomplished here (in terms of the original poster)?

  8. #23
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by rcgldr View Post
    What I thought was the second option was receiving the numbers to be sorted one at a time, but also receiving numbers to search for in the current state of the sorted list, so there's a search requirement in addition to the sort requirement while the numbers are being received. (For example the data to sort on could be the "key" portion of some type of database like record).
    Oh okay. Yeah for that case I'd simply go with option 1.

    Quote Originally Posted by Malcolm McLean View Post
    We do need to know N.
    I think Phantomotap has more of a problem with your use of the word "need" there, rather than your argument in general. I can tell because I've made the same mistake before.
    Yes sometimes it might be useful to know the typical or absolute range on N, but it is never actually necessary.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  9. #24
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    So, has a conclusive answer been posted yet? I think this is a good question, actually. We want to build a linked list, correct? Let's assume we're going to be reading numbers from stdin so they can be whatever and we'll do a sort by numbers, from smallest to greatest just for the sake of argument.

    The OP's question is, would it be more efficient to store these numbers as they are read in and then sort or would it be more efficient to read in a number and place it in its proper spot?

    I think a binary tree sorts these inputs the fastest, right? You'd read in a root value and then extend the branches based off of that but does that retain the linked list type of structure? I'm guessing not. That'd be better for something like, "Does this element already exist?"

    But if you examine as you insert, it might be faster, right? And that's because you know your structure is already sorted. You could just start at one side of the list and increment along until you found the proper insertion point. Like if we had 1->2->3->5 and we wanted to insert 4 you could start at the end or the beginning. Or you could examine the read-in value and compare it to the previously added value. If you automatically knew 4<5 then you insert 4 quickly and if you knew 6>5 you could insert quickly. This is assuming a doubly linked list though. Regular linked list would be much slower because you would have to start at 1 each time. That'd be horribly inefficient.

    So I think the OP should just make a doubly linked list and then start at the most end and go backwards. So we'd have 1<->2<->3<->5 and then we read in 7 and go, is 7>5? Yes! so we have 1<->2<->3<->5<->7. Then we want to insert 4. Is 4>7? No! Is 4>5? No! is 4>3? Yes! Then we have 1<->2<->3<->4<->5<->7. I think this should be faster than starting at 1 each time. Of course, this would fail if we had 1<->3<->4<->5<->6<->7 and you wanted to insert 2 because if you started at 7, you'd have to walk through almost the whole list.

    Otherwise, the only sorting algorithm I have/use is something like this, it starts with the second element in the list and increments forwards, looking backwards each time until it finds the proper sort point. How far it looks backwards is entirely dependent upon the number so it'd be a similar thing to what I posted.

  10. #25
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    So, has a conclusive answer been posted yet?
    O_o

    Well, someone hasn't read the thread at all.

    Soma

  11. #26
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    No, it's just a lot of reading and what seems to be bickering and a lot of, "We don't have the specifics yet".

    And you know, not every post has to being with O_o. It really just makes your head look like this: 8====D

  12. #27
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    And you know, not every post has to being with O_o.
    o_O

    a lot of, "We don't have the specifics yet".
    Because, as we've explained, without specifics there is only general guidelines.

    If a conclusive answer is required, more specifics are required.

    What don't you understand?

    Keep that in mind, now actually read the thread.

    Soma

  13. #28
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by MutantJohn View Post
    So, has a conclusive answer been posted yet?
    Maybe for the first question, but the OP is two questions, and for the first question, it's not clear if all the data is available before putting it into a linked list, and it's not clear if the second question has implications for the first question, such as will the list be searched while it's also being created?

    Quote Originally Posted by acho.arnold View Post
    Is it more efficient to sort an linked list after the values have been inserted into the list OR to sort and item before inserting into the list so as to create an ordered list.

    And also Please what is the most efficient data Structure to use for Random insertion and searching of unordered Data?
    For the first question, assuming all the data is available from the start, it would be quicker to sort the data first, then create a linked list using the sorted data, as opposed to creating the linked list first and then sorting the linked list.

    For the second question, it's not clear if the searching is supposed to take place at the same time that some type of data structure (like a list or tree) is being created.

    The OP never posted again in this thread, so it's not clear if the OP was trying to solve a specific problem, or just asking generic questions.
    Last edited by rcgldr; 05-19-2013 at 06:43 PM.

  14. #29
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by MutantJohn View Post
    No, it's just a lot of reading and what seems to be bickering and a lot of, "We don't have the specifics yet".

    And you know, not every post has to being with O_o. It really just makes your head look like this: 8====D
    Sounds like you didn't read my very specific answers yet then. I made what I believe is a very good assumption about what was being asked - the same assumption you made in fact. I then stated that that was the question I was answering, and answered it correctly.

    You however have come to a different conclusion. Without other information, sorting afterwards can and very likely would, be more efficient. Read my answers and you will understand why. Sorting afterwards would be equally as efficient as inserting as you went, if the algorithm chosen for sorting were insertion sort. In fact the processes would then be practically identical. Thus one must only choose an algorithm better than insertion sort to beat it.

    Yes Insertion Sort is about the most efficient algorithm for sorting an already sorted list (provided it checks against the last item first, as you say). But you failed in not stating that as an assumption, and in the absence of that assumption, the conclusion is in fact wrong.

    Name calling (implicit or otherwise) is not welcome on this board.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  15. #30
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    Okay, so you discovered my secret, I'm not a computer scientist. I'm a physicist, both at heart and in mind. I do like programming but for me, I have to stop and do things one at a time until my physics brain can piece it all together. That's why the way I write is very drawn out and taken from examples. I'm good at justifying using theoretical models.

    Nevertheless, I've figured out a way to improve the efficiency of the algorithm (insertion sort!) and this will remedy the situation entirely. Let's just multithread this! Split the array into equally long (or about as equal as you can get it) segments in numbers equal to the number of threads. Or at least, you'd assign those indices to a specific processor and then begin examining elements that way. First thread to find the proper insertion point gets a cookie! And also subsequently inserts the new structure and establishes proper links.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Sorting a linked list
    By Anass in forum C Programming
    Replies: 7
    Last Post: 10-27-2010, 06:56 AM
  2. Linked List Sorting
    By oddworld in forum C Programming
    Replies: 4
    Last Post: 04-27-2007, 10:42 PM
  3. Sorting a linked list
    By thoseion in forum C Programming
    Replies: 6
    Last Post: 11-13-2006, 10:34 AM
  4. sorting linked list
    By Unregistered in forum C++ Programming
    Replies: 1
    Last Post: 11-04-2001, 11:08 PM
  5. Sorting a linked list
    By Unregistered in forum C++ Programming
    Replies: 1
    Last Post: 09-18-2001, 04:49 PM