Thread: List Sorting Question

  1. #16
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    O_o

    It sounds to me like you are using the wrong tool for the job.

    Can you give some more details about your use pattern? (I'm mostly interested in the number of elements you expect to add between using the sorted data, the range of the element values, and the expense of the comparison function.)



    I initially just used the list:ush_back function to append all of the additional data structures to the end of the list and then run sort on the entire list.
    o_O

    Depending on the distribution of your data, you may save a lot of time, depending on the sorting algorithm you need to use, by partially sorting the data as you go. (In effect, making a "skip list" of sorts.)



    Maybe the ++ operator is very fast when you have a list of say integers, but it seems to be really slow when you have a list of data structures.
    The mechanism of the iterators (for std::list<???>) aren't changed by the element type.



    Since my data needs to be sorted so frequently I guess I just have to wait for it to sort.
    Yea. That screams "don't use a linked list" to me. If you are sorting more than you are adding elements, you are using the wrong data structure.



    If you're just trying to pick the shortest guy of the bunch or something, you can also brute force that in linear time, after you compile the list, and that is faster than most sorts.
    If he only needs the "shortest guy of the bunch", he only needs to store the "shortest guy of the bunch".

    Soma

  2. #17
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by Polystyrene View Post
    My question is would it be faster to make a smaller list containing the new data structures, sort it, and then merge it into the larger list.
    Yes I believe so, particularly when the number of items being inserted is a fair bit less than the number of items already inserted. Merging-in a sorted run being faster than merging those items in individually, is in fact the principle behind the Strand-Sort sorting algorithm.
    Before I got to that sentence, this was in fact going to be my exact suggestion. This assumes a std::list is the best container for the job of course.

    If your list is never allowed to have duplicates, then another option is to use a std::set instead of a std::list. The main difference being that it is always sorted. There's also multiset if you do have duplicates.

    Quote Originally Posted by Polystyrene View Post
    Ok so I tried using the lower_bound function to find the iterator where it should be inserted as whiteflags suggested. However, the program becomes very slow and according to the profiler I am spending 70% of my time performing the iterator++ operator.
    As it would. There is not much advantage in using lower_bound with iterators that are not random access. Its first step is to count the number of items in the range, i.e. evaluate std::distance(begin, end) which means iterating over the entire list. So it's not just that it may end up iterating over the entire list, it does explicitly iterate over the entire list before it even starts doing any comparisons. Typically even a linear search is faster than this would be. But if the comparison operator is far slower than iterating and the list is short enough, then it can be a win, but basically it was a poor suggestion.

    Quote Originally Posted by whiteflags View Post
    The whole point is that it is faster to sort all the unsorted data once than to ensure that the list is always sorted.
    I should clarify that. It just boils down to, don't sort the data it you don't need to access it in sorted order just yet. It does not means that it is faster to re-sort the whole thing each time. On the flipside, not all of the effort it sorting some of the data before it is needed sorted is lost, especialy if you dont throw away the information about which portion is already sorted.

    Quote Originally Posted by whiteflags View Post
    To loosely answer your question merging two lists and getting a sorted list out of it is a sort, and would be as fast as most other sorts. The only way to know in this case is to profile both strategies that you mention.
    That's not the only way to know. A Big-Oh notation anaylsis shows that if the number of items being added is sufficiently less than the number of items already present, then it is faster. I've done this previously and submitted it as an optimisation to the Loki library authors some time ago for their assoc_vector class. It tends toward O(n), whereas a full sort would be O(n*logn).
    Last edited by iMalc; 08-28-2010 at 03:06 PM.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. linked list question
    By mikeman in forum C Programming
    Replies: 1
    Last Post: 11-30-2008, 01:56 PM
  2. List Question
    By ConsulVortex in forum C++ Programming
    Replies: 3
    Last Post: 01-14-2006, 04:38 AM
  3. Linked List Help
    By CJ7Mudrover in forum C Programming
    Replies: 9
    Last Post: 03-10-2004, 10:33 PM
  4. compiler build error
    By KristTlove in forum C++ Programming
    Replies: 2
    Last Post: 11-30-2003, 10:16 AM

Tags for this Thread