Thread: Linked List Sorting

  1. #46
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by MutantJohn View Post
    Anyway, what I meant was, we'd multithread the search algorithm.
    That's not entirely stupid, one could certainly do that under some circumstances. E.g. when sorting by one key and looking up via another, one could create a form of partial index in an array, and perform lookups in parallel. (As per the OrderSort algorithm by Matthew Aitkenhead and Mark Richards - 2005)
    However, generally if modifying the data structure was an option, I would change it entirely, and what I chose would probably not have anything that would benefit from threading.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  2. #47
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    I'm learning so much in this topic!!! <3

  3. #48
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    2,665
    Quote Originally Posted by Cat View Post
    Just because only one thread writes doesn't mean it's thread safe. When you're changing the order of elements in the linked list, you can't be traversing the list in another thread.
    I beg to differ. You'd just be shifting a pointer, which is fine; the only question is where it points. There is a danger area, but so long as I stay away from that, nothing bad would happen. Basically, the prev, curr and next elements would be off-limits at the insertion site (curr = insertion site), but if a pointer is off in la-la land, the links should remain undamaged.

    Also, I was not planning on writing while simultaneously traversing. The idea was, multithread the traversal of the list and as soon as an insertion site is located, kill all other threads, insert the point and then rearrange the pointers based on the new length of the list.

    I don't think this introduces any race conditions.

    But that other guy said something about memory buses and that my code would be effectively single-threaded. I find that dumb because the idea of multiple cores should be to interact with the RAM at independent addresses but if it has to wait one at a time then everything multithreaded would seem to suck. But I don't know anything about this so I need more explanation.

  4. #49
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    I beg to differ.
    O_o

    Well, that doesn't much matter.

    ThreadWrite: TargetIndexFound = ListItem[4]->Next
    ThreadRead: Cursor = ListItem[4]->Next
    ThreadWrite: ListItem[4]->Next = InsertElement(Data = NewData, Next = TargetIndexFound)

    There you have a "race condition". The "ThreadRead" thread has a pointer to "ListItem[5]" instead of the new item inserted by "ThreadWrite" at that position, thus skipping it altogether.
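    That interleaving can be replayed in C. This is a single-threaded simulation of the schedule above (the node type and names are illustrative, not from the thread); the point is that the reader's saved cursor bypasses the writer's new node:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    /* Hypothetical node type for illustration. */
    typedef struct Node {
        int data;
        struct Node *next;
    } Node;

    static Node *make(int data, Node *next) {
        Node *n = malloc(sizeof *n);
        n->data = data;
        n->next = next;
        return n;
    }

    int main(void) {
        /* Build the list 0 -> 1 -> 2 -> 3 -> 4 -> 5. */
        Node *head = NULL;
        for (int i = 5; i >= 0; --i)
            head = make(i, head);
        Node *item4 = head;
        for (int i = 0; i < 4; ++i)
            item4 = item4->next;

        /* The unlucky interleaving, replayed sequentially:             */
        Node *target = item4->next;      /* ThreadWrite reads Next      */
        Node *cursor = item4->next;      /* ThreadRead reads Next       */
        item4->next = make(99, target);  /* ThreadWrite inserts 99      */

        /* ThreadRead's cursor still points at the old item 5, so its
           traversal never visits the freshly inserted node.            */
        int saw_99 = 0;
        for (Node *p = cursor; p; p = p->next)
            if (p->data == 99)
                saw_99 = 1;
        assert(!saw_99);                 /* the new node was skipped    */
        return 0;
    }
    ```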

    I was not planning on writing while simultaneously traversing.
    Well, that actually does prevent the "race condition".

    The idea was, multithread the traversal of the list and as soon as an insertion site is located, kill all other threads, insert the point and then rearrange the pointers based on the new length of the list.
    Inserting at a known position is constant and rearranging the pointer is "O(n)" with the number of threads or amortized constant.

    That's not a terrible idea, but I have to point you to what iMalc keeps saying: you have now involved threads, sequencing, presumably a lock-free work queue or similar, and so significantly complicated the implementation of the code. (You'd want a means of communicating the new positions to existing threads instead of tearing the threads down and building them up again every insertion.) Several of the "binary search tree" data structures always insert in "O(log n)" with a single thread. Also, the majority of programmers have far more trouble reasoning about code using multiple threads.
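    For comparison, here is a minimal single-threaded binary search tree insert (unbalanced, so O(log n) only on average; a real replacement would use a self-balancing variant such as a red-black tree). Names like `Node` and `insert` are illustrative:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    typedef struct Node {
        int key;
        struct Node *left, *right;
    } Node;

    /* Descend from the root in O(height) steps; no other node is
       touched, so there is no list-wide pointer fixup afterwards. */
    static Node *insert(Node *root, int key) {
        if (!root) {
            Node *n = calloc(1, sizeof *n);
            n->key = key;
            return n;
        }
        if (key < root->key)
            root->left = insert(root->left, key);
        else
            root->right = insert(root->right, key);
        return root;
    }

    /* An in-order walk visits the keys in sorted order. */
    static void inorder(const Node *t, int *out, int *i) {
        if (!t) return;
        inorder(t->left, out, i);
        out[(*i)++] = t->key;
        inorder(t->right, out, i);
    }

    int main(void) {
        int keys[] = {42, 7, 99, 23, 1};
        Node *root = NULL;
        for (int i = 0; i < 5; ++i)
            root = insert(root, keys[i]);

        int out[5], n = 0;
        inorder(root, out, &n);
        for (int i = 1; i < 5; ++i)
            assert(out[i - 1] <= out[i]);   /* sorted, no sort pass needed */
        return 0;
    }
    ```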

    But that other guy said something about memory buses and that my code would be effectively single-threaded. I find that dumb because the idea of multiple cores should be to interact with the RAM at independent addresses but if it has to wait one at a time then everything multithreaded would seem to suck.
    That's not what he said.

    Different architectures do things in different ways, but certainly architectures do exist with multiple execution cores sharing the same processor-level cache, system memory, and bus, even though different fractions of these resources can be marked for a specific core. The implied problem comes from data lines and different threads invalidating regions of memory occupying the same page within system memory, forcing other cores to dump and refresh their copy of the cache. If you want to know more about this stuff: Google is your friend.

    You say the idea would be "independent address", but to guarantee that you'd need a custom allocator and separate slices of that allocator for each thread. This, again, complicates the implementation, and again, using the right data structure would still be faster. As I've referenced a couple of times in this thread, even if you are willing to throw that much work at it, why wouldn't you start with a more appropriate data structure?

    Soma

  5. #50
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by MutantJohn View Post
    But that other guy said something about memory buses and that my code would be effectively single-threaded. I find that dumb because the idea of multiple cores should be to interact with the RAM at independent addresses but if it has to wait one at a time then everything multithreaded would seem to suck. But I don't know anything about this so I need more explanation.
    I would be that "other guy". The issue is that for a typical PC, the CPU is much faster than memory, so for a simple operation like a search of an array or list, what is throttling the performance is the fact that the CPU is waiting on memory. Insertion operations into sub-arrays that fit in each core's cache could benefit from multi-threading, but I'm not sure if this could be done using the example processes mentioned in the original post.

    As mentioned, there's only one memory bus, and typically only one outer cache shared between cores. Each core usually has a small cache, but there's not much to be gained here unless the cores are doing a significant number of reads and writes to their local caches without conflicting and invalidating the caches in the other cores.

    Where multi-threading can be useful is when you have cpu intensive operations that are mostly independent of each other, such as rendering an image or frame of video, where the image can be split up into rectangles for each core to operate on (with some special handling for the boundaries). Another case for multi-threading is for concurrent I/O operations where each thread waits for I/O completion, and the program relies on multi-threading to accomplish concurrent I/O without having to implement completion call back type functions which is more complicated.
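    The "split the work into independent chunks" case can be sketched with POSIX threads. Here summing halves of an array stands in for rendering rectangles of an image; the names and the two-way split are illustrative, and the point is that each thread touches only its own slice, so no locking is needed:

    ```c
    #include <assert.h>
    #include <pthread.h>
    #include <stddef.h>

    enum { N = 1000, NTHREADS = 2 };

    static int data[N];

    struct chunk { size_t lo, hi; long sum; };

    /* Each thread reads only its own [lo, hi) slice and writes only its
       own struct chunk, so the threads never race on shared memory. */
    static void *sum_chunk(void *arg) {
        struct chunk *c = arg;
        c->sum = 0;
        for (size_t i = c->lo; i < c->hi; ++i)
            c->sum += data[i];
        return NULL;
    }

    int main(void) {
        for (int i = 0; i < N; ++i)
            data[i] = i + 1;                        /* 1..1000 */

        pthread_t tid[NTHREADS];
        struct chunk c[NTHREADS] = {{0, N / 2, 0}, {N / 2, N, 0}};
        for (int t = 0; t < NTHREADS; ++t)
            pthread_create(&tid[t], NULL, sum_chunk, &c[t]);

        long total = 0;
        for (int t = 0; t < NTHREADS; ++t) {
            pthread_join(tid[t], NULL);
            total += c[t].sum;
        }
        assert(total == 1000L * 1001 / 2);          /* 500500 */
        return 0;
    }
    ```

    Whether this actually beats one thread depends on exactly the memory-bandwidth question being discussed: a plain sum is memory-bound, while per-pixel rendering does enough arithmetic per byte for the split to pay off.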

  6. #51
    Registered User
    Join Date
    May 2003
    Posts
    1,619
    Quote Originally Posted by MutantJohn View Post
    Also, I was not planning on writing while simultaneously traversing. The idea was, multithread the traversal of the list and as soon as an insertion site is located, kill all other threads, insert the point and then rearrange the pointers based on the new length of the list.
    Wait, so you're creating N threads and killing N-1 for every insertion into the list? While that might work, I think it's going to be slower, especially with the overhead of creating threads. Certainly it will be slower than just using a better data structure (like a tree, which is well optimized for random inserts into an ordered data structure). In fact you could make a hybrid tree/linked list data structure if there was some reason you *really* needed a linked list; you often see that in B+ trees for fast in-order traversal.
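    One way to picture that hybrid: a BST whose nodes also carry a "next" pointer threaded in sorted order, in the spirit of the linked leaves of a B+ tree. This is only a sketch (unbalanced tree, illustrative names), using the last right-turn during the descent as the in-order predecessor of the new node:

    ```c
    #include <assert.h>
    #include <stdlib.h>

    /* Hypothetical hybrid: BST for searching/inserting, plus a "next"
       pointer so an in-order traversal is just a list walk. */
    typedef struct Node {
        int key;
        struct Node *left, *right, *next;
    } Node;

    static Node *root, *head;        /* head = node with smallest key */

    static void insert(int key) {
        Node *n = calloc(1, sizeof *n);
        n->key = key;

        Node **link = &root, *pred = NULL;
        while (*link) {
            if (key < (*link)->key)
                link = &(*link)->left;
            else {
                pred = *link;        /* last node we turned right at */
                link = &(*link)->right;
            }
        }
        *link = n;

        if (pred) {                  /* splice after the predecessor */
            n->next = pred->next;
            pred->next = n;
        } else {                     /* new smallest element */
            n->next = head;
            head = n;
        }
    }

    int main(void) {
        int keys[] = {42, 7, 99, 23, 1};
        for (int i = 0; i < 5; ++i)
            insert(keys[i]);

        /* Walking the list visits the keys in sorted order. */
        int prev = -1, count = 0;
        for (Node *p = head; p; p = p->next) {
            assert(p->key >= prev);
            prev = p->key;
            ++count;
        }
        assert(count == 5);
        return 0;
    }
    ```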

    But that other guy said something about memory buses and that my code would be effectively single-threaded. I find that dumb because the idea of multiple cores should be to interact with the RAM at independent addresses but if it has to wait one at a time then everything multithreaded would seem to suck. But I don't know anything about this so I need more explanation.
    It depends on the architecture, but in general, yes you will often see machines that have a single memory bus. It's worse when you have multiple processors touching the same data - multi-channel memory configurations help alleviate some of the problem by simultaneously querying different modules of memory, but each bit lives in only one module. Caches help, but modifying data in one thread can introduce cache coherency considerations.

    Really, the larger point, though, is that RAM access is slow compared to CPU. Multithreading works best when you need to do a large number of operations per each small amount of data. For example, transforming each point in a 3d mesh with multiple matrices, and doing lighting and texture calculations - the time it takes to do all the arithmetic is your limiting factor, not the speed at which you can pull the next piece of data.

    On the other hand, in this problem, the CPU's work is incredibly trivial - determine which of two numbers is larger, and branch. The time it takes to do all the calculation is tiny compared to the time to read the data itself.
    Last edited by Cat; 05-21-2013 at 06:38 PM.
    You ever try a pink golf ball, Wally? Why, the wind shear on a pink ball alone can take the head clean off a 90 pound midget at 300 yards.

  7. #52
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Quote Originally Posted by Cat View Post
    you will often see machines that have a single memory bus. It's worse when you have multiple processors touching the same data - multi-channel memory configurations help alleviate some of the problem by simultaneously querying different modules of memory.
    That would be "unganged" multi-channel memory, which could help with multi-threading, but I'm not sure how a multi-threaded program would be able to allocate its virtual / paged memory so that the physical memory ended up in different banks of unganged multi-channel memory. I suspect that the main advantage of unganged multi-channel memory is for multi-processing, where each process has its own virtual address space, and the operating system could allocate different processes physical memory from different memory banks, assuming that an operating system would know how physical memory was mapped into unganged multi-channel memory.

    As for ganged multi-channel memory, all it does is read memory in wider groups, increasing the bandwidth but not allowing independent access to the different banks of memory.

    Wiki article:

    Multi-channel memory architecture - Wikipedia, the free encyclopedia

