Thread: new license-free lock-free data structure library published

  1. #1
    Registered User
    Join Date
    Dec 2009
    Posts
    83

    new license-free lock-free data structure library published

    I'm making a post here to let people know about this library, since I think it may be useful; if such a post is inappropriate or offensive, please let me know!

    liblfds is a portable, license-free, lock-free data structure library, written in C.

    The library has already been ported to a number of platforms (Windows (user-mode and kernel), Linux, IA64, x86, x64, ARM, GCC, MS C compiler, etc).

    Release 5 was published late last night :-)

    Right now there's only a few data structures - the queue and ringbuffer are the most useful - but a *lot* of work has been made into making the library accessable; there's a blog, forum, wiki and bugzilla on the web-site and a metric *ton* of documentation.

    http://www.liblfds.org
    Last edited by Toby Douglass; 12-20-2009 at 07:47 AM.

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,412
    Moved to Projects and Job Recruitment.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    Why are you crippling the compilers ability to optimize with all those volatile's everywhere?

    gg

  4. #4
    Malum in se abachler's Avatar
    Join Date
    Apr 2007
    Posts
    3,195
    Quote Originally Posted by Codeplug View Post
    Why are you crippling the compilers ability to optimize with all those volatile's everywhere?

    gg
    probably to ensure multithreaded safeness, although I haven't looked at the code, declaring a variable that may be accessed by different threads as volatile is one of the situations that the volatile keyword is specifically for.

  5. #5
    Registered User
    Join Date
    Dec 2009
    Posts
    83
    Quote Originally Posted by Codeplug View Post
    Why are you crippling the compilers ability to optimize with all those volatile's everywhere?
    Those variables are subject to atomic compare-and-swap operations; the compiler cannot know ahead of time which CAS will cause the destination to be modified and which not, so it must simply treat the variable as volatile.

    Also, those variables are shared between threads running on seperate cores, which means from the POV of any one CPU, they are subject to abrupt and unpredictable change. Memory barriers are needed to propagate those changes in a timely manner (volatile won't do it) but volatile is needed because otherwise the generate code may for example cache a value in a register (or optimise in certain ways which will fail) and so fail to notice the changes made by other threads.

    The compiler is generating code which is valid for multiple threads on one CPU, not multiple threads on multiple CPUs.

  6. #6
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    volatile has nothing to do with multi-threading or its correctness.

    More reading and external links on the subject: Lockless Threaded Queue

    gg

  7. #7
    Registered User
    Join Date
    Dec 2009
    Posts
    83
    Quote Originally Posted by abachler View Post
    probably to ensure multithreaded safeness,
    Multi-core safeness, not multi-thread safeness. I may be wrong, but I think you don't need volatile for multiple threads on a single core.

  8. #8
    Registered User
    Join Date
    Dec 2009
    Posts
    83
    Quote Originally Posted by Codeplug View Post
    volatile has nothing to do with multi-threading or its correctness.
    Even over multiple cores in the presence of memory barriers?

    How will the compiler/CPU know not to cache/optimize access to a particular variable, when that variable could be changing underneath them?

  9. #9
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> but volatile is needed because otherwise the generate code may for example cache a value in a register
    The compiler isn't going to hoist a variable into a register unless it can guarantee that there will be no side effects to that variable across "opaque" function calls. I don't see this issue anywhere in the code.
    The other thing to realize is that this is only an issue for variables in which 2 or more threads may hold a reference. Stack variables are local to a single thread and therefore don't require volatile.

    >> ... (or optimize in certain ways which will fail)
    Preventing optimization is no way to guarantee correctness. What other "certain ways" are your referring to? Using volatile to prevent reordering of reads and writes by the compiler would be pointless, since cache-coherency protocols in SMP's can reorder reads and writes anyway.

    One thing I see in freelist_push() is that you're limiting the scalability by using a static counter. In other words, two threads with two different freelists shouldn't have to compete for the one global 'counter'. Of course, we're just talking about a few extra do/while iterations, but the fewer membar's issued the better.

    I enjoy this type of stuff and will keep poking-at and reviewing the code.

    gg

  10. #10
    Registered User jeffcobb's Avatar
    Join Date
    Dec 2009
    Location
    Henderson, NV
    Posts
    875
    I find this kind of stuff quite interesting as well; I just downed the src and haven't had a chance to look at it much yet so I will be very curious how you deal with concurrent access. Not long ago I was playing with a lockless producer/consumer model and never had the time to finish it but was getting some interesting results from having the consumer thread calculate how much work it did how fast and then the next time it hit the queue (and actually got possession) it will extract up to that many units of work (jobs, data blocks, whatever). If the producer went to add to the queue and the queue was "busy" it had a secondary queue that it would push the work into...then when it did get possession of the main queue it just dumped the secondary queue into it. There was a lot more to it than that; for example it would calculate the load it was putting on the overall machine and so you could set through configuration "never use more than 50% of the CPU time" so that the machine maintained a predictable level of responsiveness....when processes noted that it was approaching CPU saturation it would back off on the work it did until it kept a balance...

    I felt that there was more to this than I was able to complete at the time but had too many plates spinning in the air (so to speak) as it was so this fell off onto the back-burner....
    C/C++ Environment: GNU CC/Emacs
    Make system: CMake
    Debuggers: Valgrind/GDB

  11. #11
    Registered User
    Join Date
    Dec 2009
    Posts
    83
    Quote Originally Posted by Codeplug View Post
    >> but volatile is needed because otherwise the generate code may for example cache a value in a register
    The compiler isn't going to hoist a variable into a register unless it can guarantee that there will be no side effects to that variable across "opaque" function calls. I don't see this issue anywhere in the code.
    Yes; I think I'm wrong about this one.

    The other thing to realize is that this is only an issue for variables in which 2 or more threads may hold a reference. Stack variables are local to a single thread and therefore don't require volatile.
    I think I could be convinced this is so...

    >> ... (or optimize in certain ways which will fail)
    Preventing optimization is no way to guarantee correctness.
    Yes. That was just an example of a way you could fail; eliminating *a* way you can fail does not make you correct. There are lots of other ways you can still fail.

    What other "certain ways" are your referring to? Using volatile to prevent reordering of reads and writes by the compiler would be pointless, since cache-coherency protocols in SMP's can reorder reads and writes anyway.
    What I'm thinking of is where you have a variable shared between threads. The variable begins at 0. One thread is reading the variable repeatedly till it's value reaches say 10. The other thread is incrementing. The two threads run on seperate cores. Now, to get this to work, first you need a memory barrier after the increment to ensure timely visibility (or you might reach 11 or 12, etc, before the first core notices), second, you need a re-ordering barrier, to prevent the the seperate cores re-ordering inappropriately and that barrier also needs to prevent the compiler from doing the same thing (for example, the loop may run exactly 15 times and the incrementing counter could be converted into a single += 15 prior to the loop). Third, I'm thinking you need volatile, because from the POV of the first core, the value is spontaniously changing - how can it know the other core has changed the value? having said this now, as we talk about it, I'm thinking however that the memory barrier handles this problem fully.

    One thing I see in freelist_push() is that you're limiting the scalability by using a static counter. In other words, two threads with two different freelists shouldn't have to compete for the one global 'counter'. Of course, we're just talking about a few extra do/while iterations, but the fewer membar's issued the better.
    Yea Gods, that's a *terrible* mistake on my part! Thank God for peer review - I'll fix this in the morning. (What on earth was my brain doing...!)

    I enjoy this type of stuff and will keep poking-at and reviewing the code.
    I'm just hoping you don't find too many mistakes like that one! :-)

  12. #12
    Registered User jeffcobb's Avatar
    Join Date
    Dec 2009
    Location
    Henderson, NV
    Posts
    875
    After looking at this I noticed that in almost every case the "lock free" behavior is achieved at the expense of being able to delete items in a thread-safe manner....not sure that this is a show-stopper, gotta noodle on it some more.
    C/C++ Environment: GNU CC/Emacs
    Make system: CMake
    Debuggers: Valgrind/GDB

  13. #13
    Registered User
    Join Date
    Dec 2009
    Posts
    83
    Quote Originally Posted by jeffcobb View Post
    After looking at this I noticed that in almost every case the "lock free" behavior is achieved at the expense of being able to delete items in a thread-safe manner....not sure that this is a show-stopper, gotta noodle on it some more.
    There's no memory management algorithm. All of the data structures run off a free-list. You can 'delete', in so far as that operation makes sense, for the various data structures - the freelist has a push, the queue has it's dequeue - but you can't delete in the sense of freeing up an element and returning that memory to the system. All you're actually doing is returning the element to the freelist.

    Of course, all of this would change with the introduction of memory management algorithm. That will come, but I think a few more data structures are needed first.

  14. #14
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    freelist_push() doesn't look correct to me. The purpose of the do/while loop is to detect any asynchronous changes to "fs->top" by other threads. If there is an asynchronous change, then the loop starts over and tries again. You should read a new "snapshot" of "fs->top" at the top of your loop. (Suggest renaming "original_fe_next" to "top_snapshot".)

    I see the same issue in freelist_pop(). Suggest renaming "fe_local" to "top_snapshot" - and refresh your snapshot at the top of the loop.

    gg

  15. #15
    Registered User
    Join Date
    Dec 2009
    Posts
    83
    Quote Originally Posted by Codeplug View Post
    freelist_push() doesn't look correct to me. The purpose of the do/while loop is to detect any asynchronous changes to "fs->top" by other threads. If there is an asynchronous change, then the loop starts over and tries again. You should read a new "snapshot" of "fs->top" at the top of your loop.
    You are correct and this in fact does happen - but not in an obvious way.

    abstraction_dcas() writes the original value of destination into the compare argument ("original_fe_next"). So when the loop goes around, and original_fe_next is loaded into fe_local, we are getting an updated version of fs->top.

    Will give the name change thought when I get time after work!

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Tab Controls - API
    By -KEN- in forum Windows Programming
    Replies: 7
    Last Post: 06-02-2002, 09:44 AM
  2. gcc problem
    By bjdea1 in forum Linux Programming
    Replies: 13
    Last Post: 04-29-2002, 06:51 PM
  3. File Database & Data Structure :: C++
    By kuphryn in forum C++ Programming
    Replies: 0
    Last Post: 02-24-2002, 11:47 AM
  4. Serial Communications in C
    By ExDigit in forum Windows Programming
    Replies: 7
    Last Post: 01-09-2002, 10:52 AM
  5. vector static abstract data structure in C?
    By Unregistered in forum C Programming
    Replies: 2
    Last Post: 11-05-2001, 05:02 PM