Thread: Linked List Sorting

  1. #46
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by MutantJohn View Post
    Anyway, what I meant was, we'd multithread the search algorithm.
    That's not entirely stupid, one could certainly do that under some circumstances. E.g. when sorting by one key and looking up via another, one could create a form of partial index in an array, and perform lookups in parallel. (As per the OrderSort algorithm by Matthew Aitkenhead and Mark Richards - 2005)
    However, generally if modifying the data structure was an option, I would change it entirely, and what I chose would probably not have anything that would benefit from threading.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  2. #47
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    I'm learning so much in this topic!!! <3

  3. #48
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    Quote Originally Posted by Cat View Post
    Just because only one thread writes doesn't mean it's thread safe. When you're changing the order of elements in the linked list, you can't be traversing the list in another thread.
    I beg to differ. You'd just be shifting a pointer, which is fine; the only question is where it points. There is a danger area, but so long as I stay away from that, nothing bad would happen. Basically, the prev, curr and next elements would be off-limits at the insertion site (curr = insertion site), but if a pointer is off in la-la land, the links should remain undamaged.

    Also, I was not planning on writing while simultaneously traversing. The idea was, multithread the traversal of the list and as soon as an insertion site is located, kill all other threads, insert the point and then rearrange the pointers based on the new length of the list.

    I don't think this introduces any race conditions.

    But that other guy said something about memory buses and that my code would be effectively single-threaded. I find that dumb because the idea of multiple cores should be to interact with the RAM at independent addresses but if it has to wait one at a time then everything multithreaded would seem to suck. But I don't know anything about this so I need more explanation.

  4. #49
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    I beg to differ.
    O_o

    Well, that doesn't much matter.

    ThreadWrite: TargetIndexFound = ListItem[4]->Next
    ThreadRead: Cursor = ListItem[4]->Next
    ThreadWrite: ListItem[4]->Next = InsertElement(Data = NewData, Next = TargetIndexFound)

    There you have a "race condition". The "ThreadRead" thread has a pointer to "ListItem[5]" instead of the new item inserted by "ThreadWrite" at that position, thus skipping it altogether.
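    That interleaving can be replayed in C. This is a single-threaded simulation of the schedule above (the node type and names are illustrative, not from the thread); the point is that the reader's saved cursor bypasses the writer's new node:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    /* Hypothetical node type for illustration. */
    typedef struct Node {
        int data;
        struct Node *next;
    } Node;

    static Node *make(int data, Node *next) {
        Node *n = malloc(sizeof *n);
        n->data = data;
        n->next = next;
        return n;
    }

    int main(void) {
        /* Build the list 0 -> 1 -> 2 -> 3 -> 4 -> 5. */
        Node *head = NULL;
        for (int i = 5; i >= 0; --i)
            head = make(i, head);
        Node *item4 = head;
        for (int i = 0; i < 4; ++i)
            item4 = item4->next;

        /* The unlucky interleaving, replayed sequentially:             */
        Node *target = item4->next;      /* ThreadWrite reads Next      */
        Node *cursor = item4->next;      /* ThreadRead reads Next       */
        item4->next = make(99, target);  /* ThreadWrite inserts 99      */

        /* ThreadRead's cursor still points at the old item 5, so its
           traversal never visits the freshly inserted node.            */
        int saw_99 = 0;
        for (Node *p = cursor; p; p = p->next)
            if (p->data == 99)
                saw_99 = 1;
        assert(!saw_99);                 /* the new node was skipped    */
        return 0;
    }
    ```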

    I was not planning on writing while simultaneously traversing.
    Well, that actually does prevent the "race condition".

    The idea was, multithread the traversal of the list and as soon as an insertion site is located, kill all other threads, insert the point and then rearrange the pointers based on the new length of the list.
    Inserting at a known position is constant and rearranging the pointer is "O(n)" with the number of threads or amortized constant.

    That's not a terrible idea, but I have to point you to what iMalc keeps saying: you have now involved threads, sequencing, presumably a lock-free work queue or similar, and so significantly complicated the implementation of the code. (You'd want a means of communicating the new positions to existing threads instead of tearing the threads down and building them up again every insertion.) Several of the "binary search tree" data structures always insert in "O(log n)" with a single thread. Also, the majority of programmers have far more trouble reasoning about code using multiple threads.
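    For comparison, here is a minimal single-threaded binary search tree insert (unbalanced, so O(log n) only on average; a real replacement would use a self-balancing variant such as a red-black tree). Names like `Node` and `insert` are illustrative:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    typedef struct Node {
        int key;
        struct Node *left, *right;
    } Node;

    /* Descend from the root in O(height) steps; no other node is
       touched, so there is no list-wide pointer fixup afterwards. */
    static Node *insert(Node *root, int key) {
        if (!root) {
            Node *n = calloc(1, sizeof *n);
            n->key = key;
            return n;
        }
        if (key < root->key)
            root->left = insert(root->left, key);
        else
            root->right = insert(root->right, key);
        return root;
    }

    /* An in-order walk visits the keys in sorted order. */
    static void inorder(const Node *t, int *out, int *i) {
        if (!t) return;
        inorder(t->left, out, i);
        out[(*i)++] = t->key;
        inorder(t->right, out, i);
    }

    int main(void) {
        int keys[] = {42, 7, 99, 23, 1};
        Node *root = NULL;
        for (int i = 0; i < 5; ++i)
            root = insert(root, keys[i]);

        int out[5], n = 0;
        inorder(root, out, &n);
        for (int i = 1; i < 5; ++i)
            assert(out[i - 1] <= out[i]);   /* sorted, no sort pass needed */
        return 0;
    }
    ```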

    But that other guy said something about memory buses and that my code would be effectively single-threaded. I find that dumb because the idea of multiple cores should be to interact with the RAM at independent addresses but if it has to wait one at a time then everything multithreaded would seem to suck.
    That's not what he said.

    Different architectures do things in different ways, but certainly architectures do exist with multiple execution cores sharing the same processor-level cache, system memory, and bus, even though different fractions of these resources can be marked for a specific core. The implied problem comes from data lines and different threads invalidating regions of memory occupying the same page within system memory, forcing other cores to dump and refresh their copy of the cache. If you want to know more about this stuff: Google is your friend.

    You say the idea would be "independent address", but to guarantee that you'd need a custom allocator and separate slices of that allocator for each thread. This, again, complicates the implementation, and again, using the right data structure would still be faster. As I've referenced a couple of times in this thread, even if you are willing to throw that much work at it, why wouldn't you start with a more appropriate data structure?

    Soma

  5. #50
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by MutantJohn View Post
    But that other guy said something about memory buses and that my code would be effectively single-threaded. I find that dumb because the idea of multiple cores should be to interact with the RAM at independent addresses but if it has to wait one at a time then everything multithreaded would seem to suck. But I don't know anything about this so I need more explanation.
    I would be that "other guy". The issue is that for a typical PC, the CPU is much faster than memory, so for a simple operation like a search of an array or list, what is throttling the performance is the fact that the CPU is waiting on memory. Insertion operations into sub-arrays that fit in each core's cache could benefit from multi-threading, but I'm not sure if this could be done using the example processes mentioned in the original post.

    As mentioned, there's only one memory bus, and typically only one outer cache shared between cores. Each core usually has a small cache, but there's not much to be gained here unless the cores are doing a significant number of reads and writes to their local caches without conflicting and invalidating the caches in the other cores.

    Where multi-threading can be useful is when you have cpu intensive operations that are mostly independent of each other, such as rendering an image or frame of video, where the image can be split up into rectangles for each core to operate on (with some special handling for the boundaries). Another case for multi-threading is for concurrent I/O operations where each thread waits for I/O completion, and the program relies on multi-threading to accomplish concurrent I/O without having to implement completion call back type functions which is more complicated.
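    The "split the work into independent chunks" case can be sketched with POSIX threads. Here summing halves of an array stands in for rendering rectangles of an image; the names and the two-way split are illustrative, and the point is that each thread touches only its own slice, so no locking is needed:

    ```c
    #include <assert.h>
    #include <pthread.h>
    #include <stddef.h>

    enum { N = 1000, NTHREADS = 2 };

    static int data[N];

    struct chunk { size_t lo, hi; long sum; };

    /* Each thread reads only its own [lo, hi) slice and writes only its
       own struct chunk, so the threads never race on shared memory. */
    static void *sum_chunk(void *arg) {
        struct chunk *c = arg;
        c->sum = 0;
        for (size_t i = c->lo; i < c->hi; ++i)
            c->sum += data[i];
        return NULL;
    }

    int main(void) {
        for (int i = 0; i < N; ++i)
            data[i] = i + 1;                        /* 1..1000 */

        pthread_t tid[NTHREADS];
        struct chunk c[NTHREADS] = {{0, N / 2, 0}, {N / 2, N, 0}};
        for (int t = 0; t < NTHREADS; ++t)
            pthread_create(&tid[t], NULL, sum_chunk, &c[t]);

        long total = 0;
        for (int t = 0; t < NTHREADS; ++t) {
            pthread_join(tid[t], NULL);
            total += c[t].sum;
        }
        assert(total == 1000L * 1001 / 2);          /* 500500 */
        return 0;
    }
    ```

    Whether this actually beats one thread depends on exactly the memory-bandwidth question being discussed: a plain sum is memory-bound, while per-pixel rendering does enough arithmetic per byte for the split to pay off.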

  6. #51
    Registered User
    Join Date
    May 2003
    Posts
    1,619
    Quote Originally Posted by MutantJohn View Post
    Also, I was not planning on writing while simultaneously traversing. The idea was, multithread the traversal of the list and as soon as an insertion site is located, kill all other threads, insert the point and then rearrange the pointers based on the new length of the list.
    Wait, so you're creating N threads and killing N-1 for every insertion into the list? While that might work, I think it's going to be slower, especially with the overhead of creating threads. Certainly it will be slower than just using a better data structure (like a tree, which is well optimized for random inserts into an ordered data structure). In fact you could make a hybrid tree/linked list data structure if there was some reason you *really* needed a linked list; you often see that in B+ trees for fast in-order traversal.
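    One way to picture that hybrid: a BST whose nodes also carry a "next" pointer threaded in sorted order, in the spirit of the linked leaves of a B+ tree. This is only a sketch (unbalanced tree, illustrative names), using the last right-turn during the descent as the in-order predecessor of the new node:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    /* Hypothetical hybrid: BST for searching/inserting, plus a "next"
       pointer so an in-order traversal is just a list walk. */
    typedef struct Node {
        int key;
        struct Node *left, *right, *next;
    } Node;

    static Node *root, *head;        /* head = node with smallest key */

    static void insert(int key) {
        Node *n = calloc(1, sizeof *n);
        n->key = key;

        Node **link = &root, *pred = NULL;
        while (*link) {
            if (key < (*link)->key)
                link = &(*link)->left;
            else {
                pred = *link;        /* last node we turned right at */
                link = &(*link)->right;
            }
        }
        *link = n;

        if (pred) {                  /* splice after the predecessor */
            n->next = pred->next;
            pred->next = n;
        } else {                     /* new smallest element */
            n->next = head;
            head = n;
        }
    }

    int main(void) {
        int keys[] = {42, 7, 99, 23, 1};
        for (int i = 0; i < 5; ++i)
            insert(keys[i]);

        /* Walking the list visits the keys in sorted order. */
        int prev = -1, count = 0;
        for (Node *p = head; p; p = p->next) {
            assert(p->key >= prev);
            prev = p->key;
            ++count;
        }
        assert(count == 5);
        return 0;
    }
    ```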

    But that other guy said something about memory buses and that my code would be effectively single-threaded. I find that dumb because the idea of multiple cores should be to interact with the RAM at independent addresses but if it has to wait one at a time then everything multithreaded would seem to suck. But I don't know anything about this so I need more explanation.
    It depends on the architecture, but in general, yes you will often see machines that have a single memory bus. It's worse when you have multiple processors touching the same data - multi-channel memory configurations help alleviate some of the problem by simultaneously querying different modules of memory, but each bit lives in only one module. Caches help, but modifying data in one thread can introduce cache coherency considerations.

    Really, the larger point, though, is that RAM access is slow compared to CPU. Multithreading works best when you need to do a large number of operations per each small amount of data. For example, transforming each point in a 3d mesh with multiple matrices, and doing lighting and texture calculations - the time it takes to do all the arithmetic is your limiting factor, not the speed at which you can pull the next piece of data.

    On the other hand, in this problem, the CPU's work is incredibly trivial - determine which of two numbers is larger, and branch. The time it takes to do all the calculation is tiny compared to the time to read the data itself.
    Last edited by Cat; 05-21-2013 at 06:38 PM.
    You ever try a pink golf ball, Wally? Why, the wind shear on a pink ball alone can take the head clean off a 90 pound midget at 300 yards.

  7. #52
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by Cat View Post
    you will often see machines that have a single memory bus. It's worse when you have multiple processors touching the same data - multi-channel memory configurations help alleviate some of the problem by simultaneously querying different modules of memory.
    That would be "unganged" multi-channel memory, which could help with multi-threading, but I'm not sure how a multi-threaded program would be able to allocate its virtual / paged memory so that the physical memory ended up in different banks of unganged multi-channel memory. I suspect that the main advantage of unganged multi-channel memory is for multi-processing, where each process has its own virtual address space, and the operating system could allocate different processes physical memory from different memory banks, assuming that an operating system would know how physical memory was mapped into unganged multi-channel memory.

    As for ganged multi-channel memory, all it does is read memory in wider groups, increasing the bandwidth but not allowing independent access to the different banks of memory.

    Wiki article:

    Multi-channel memory architecture - Wikipedia, the free encyclopedia

