Thread: Semaphore review

  1. #16
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by Codeplug View Post
    It's not very useful. As soon as the copy is returned, its value is stale. Making decisions based on that stale value can easily lead to synchronization/data-race bugs. Feel free to post code that tries to make use of it and we'll try to code review it.
    I will definitely take you up on that offer once I get everything sorted out.
    Syncing N threads where mutual exclusion isn't required seems to be rather difficult...

    Down/P and Up/V are the traditional names. I was only trying to point out that you have them backwards. "Up" should "increment" and signal.
    Oh crap. You're right >_<
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  2. #17
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    Sorry for the partial hijack Elysia.

    I was asked to explain the "I almost always break work up into transactions." comment.

    For the sake of understanding, let's consider the example of video codecs and the encoded streams.

    A traditional approach to threading an intra-frame codec would probably pair a work queue with overlapped IO and a simple producer/consumer. (I'm trying to be abstract here.) The manager (or whatever you wish to call it) would spend most of its life asleep waiting on IO completion to add a new job to the queue. The workers would each consume a blob (a part of a frame, a full frame, a second of video) and add the resulting data acting as a client of the producer. The consumer would organize the data into a final form ready for writing to the stream (file).

    This abstract explanation sounds really slow, but many codecs and streams can be threaded with less overhead than I'm implying because the implementation can use multiple queues and break the input up into chunks that are result in components directly used in the stream. So for example, a codec that operates on blocks (say an 8*8 array of pixels) as the smallest unit of operation can be paired with a quality thread pool implementation with a per frame queue resulting in overhead of only a couple dozen very fast operations. (In this case a quality thread pool is one that supports targeted dispatch. In other words, a thread pool smart enough to give all jobs marked "type 1" to the same thread.) The approach essentially breaks work up into very finely grained chunks only to assign multiple sequential chunks to the same thread but potentially interleaved with other chunks. (So, jobs (1,1), (2,1), (1,2), (1,3), (2,2) and so on.)

    *shrug*

    Admittedly, even that is a silly example. It depends on the codec and the stream, but better options exist for different levels of granularity. It does show a decent approach. The limiting factor for scalability, ignoring IO throughput, is almost certainly going to be the bit at the end that sequences frames. Fortunately, that bit is only a few percent of the total algorithm. Increasing real threads will very likely scale performance of the encoding jobs until pushing max IO throughput on both ends long before the cost of sequencing frames becomes an issue.

    Now, that out of the way I an explain how my video archival system works (also an intra-frame codec). I use a very different approach. (Note, I said "different" not "better".) The "breaking source video into chunks" part is performed "on the fly", but how the source is "chunked" is decided in advance. That is, the entire file is split into conceptual chunks and uploaded to different machines "on the fly" before any encoding occurs. (In practice, to prevent starvation do to slow IO throughput I buffer a given number of chunks.) Each "chunk" gets a transaction identifier that is listed in a table. This table is created before any transactions occur. The "encoding the chunks" part is handled in a similar fashion. Each job has a specific task to perform that consumes a specific chunk and writes a specific chunk. That "encoding job" information is recorded to a list (which lives where the job lives), again well before any threading or encoding happens, with transaction identifiers. The "serialize everything into a reasonable stream" part is also handled in a similar fashion. We already have transaction identifiers. That's only part of what we need. We also need to know how large each transaction will be. We can provide that information without needing to protect access to anything. (We don't need to protect access because we aren't going to read anything until everything has been written and everything is written to a unique location.) Finally, the individual streams are reordered into a master stream (the final file in the real format). Even that operation is partially threaded and we know where everything is because of all the "logging".

    *shrug*

    Again, not better, just different.

    Soma

  3. #18
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    I need something that can be used to declare that something holds (ie, water is pouring out of the sky). This primitive must block until the condition is true before continuing on. Lastly, any waiting threads must not be able to clear that condition (eg, a thread must be unable to say, 'Water was just pouring out of the sky, so water is no longer pouring out of the sky').
    A good way to model this is Windows events (very handy!). So I modeled my own class using semaphores (because I have to use them). This is the results (pseudo code):

    Code:
    void wait()
    {
    	while (!m_flag)
    	{
    		m_semaphore.down();
    		m_semaphore.up();
    	}
    }
    
    void signal()
    {
    	m_flag = true;
    	m_semaphore.up();
    }
    
    void release()
    {
    	m_flag = false;
    	m_semaphore.down();
    }
    Do tell me if there is anything wrong with it, please.
    Mostly I am worried that the semaphore will end up in a state where count > 1, hence we'll get a busy loop in wait. That must not happen, but it seems so difficult to ensure that it does not.
    m_flag should obviously be atomic.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  4. #19
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Updated code:
    Code:
    #include <tbb/atomic.h>
    #include "semaphore.hpp"
    
    class XEvent
    {
    public:
    	XEvent(bool initial_state): m_semaphore(initial_state ? 1 : 0), m_threadsafety(1)
    	{
    		if (!initial_state)
    			m_flag = false;
    		else
    			m_flag = true;
    	}
    
    	void wait()
    	{
    		m_threadsafety.down();
    		assert(m_threadsafety.get_count() == 0);
    		while (!m_flag)
    		{
    			m_threadsafety.up();
    			assert(m_threadsafety.get_count() == 1);
    			m_semaphore.down();
    			assert(m_semaphore.get_count() == 0);
    			m_threadsafety.down();
    			assert(m_threadsafety.get_count() == 0);
    			m_semaphore.up();
    			assert(m_semaphore.get_count() == 1);
    		}
    		m_threadsafety.up();
    		assert(m_threadsafety.get_count() == 1);
    	}
    
    	bool get_status()
    	{
    		XSemaphoreLock lock(m_threadsafety);
    		return m_flag;
    	}
    
    	void signal()
    	{
    		XSemaphoreLock lock(m_threadsafety);
    		if (m_flag)
    			return;
    		m_flag = true;
    		m_semaphore.up();
    		assert(m_semaphore.get_count() == 1);
    	}
    
    	void clear()
    	{
    		XSemaphoreLock lock(m_threadsafety);
    		if (!m_flag)
    			return;
    		m_flag = false;
    		m_semaphore.down();
    		assert(m_semaphore.get_count() == 0);
    	}
    
    protected:
    	tbb::atomic<bool> m_flag;
    	XSemaphore m_semaphore;
    	XSemaphore m_threadsafety;
    };
    (I probably have to take into account that signal and clear can be called multiple times in sequence.)
    Last edited by Elysia; 02-19-2012 at 06:41 AM. Reason: Thread safety; bugs
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  5. #20
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> So I modeled my own class using semaphores (because I have to use them).
    So the requirement is to build a synchronization object with the same semantics of a Win32 auto-reset event, but only using semaphores?

    >> tbb::atomic<bool>
    That's not a semaphore (assuming the answer above is "yes"). If you can only use semaphores, then use a regular bool and a binary semaphore to synchronize access to it. (so, I wrote the preceding before reading all the code...) I this case, m_threadsafety seems to be doing just that. So only a "bool" is needed.

    The last assert() in wait() doesn't make sense. Once you no longer have exclusive ownership (count == 0) of m_threadsafety, you can't assert anything about its current count. Eg: T1 context switch between line 30 and 31, T2 acquires m_threadsafety, T1 asserts. Same thing goes for line 22.

    Yes, I don't think your implementation will work right. I would first implement a condition variable object using semaphores. Once you have that, your event solution is in post #15

    gg

  6. #21
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Yeah, I realized all the asserts are probably wrong since they aren't performed as an atomic action.
    Anyway, the while loop in wait is basically supposed to emulate a condition variable, though, I think (although I suppose it doesn't guarantee the same thread won't grab the lock over and over).

    As I understand it, atomic variables shouldn't be a problem. But no mutex, condition variables, etc. It should be a Windows even with manual reset semantics.

    It seems to work rather fine, so I am thinking that, at least on my machine, the race conditions are rare enough to potentially overlook it (considering I have little time left now).
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  7. #22
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> But no mutex, condition variables, etc.
    A binary semaphore is a mutex. A condition variable can be built from semaphores as well. With these two constructs, an Event can be constructed using the algorithm in post #15. Everything is still built from semaphores...

    Here is condition variable pseudo code:
    Code:
    sem: wait_queue(0), mutex(1)     
    int: num_blocked=0
    
    // pre/post - mux is acquired
    condition_wait(sem: mux) 
    {
        P(mutex)
            ++num_blocked
        V(mutex)
        V(mux)
            P(wait_queue)
        P(mux)
    }
    
    condition_notify_one()
    {
        P(mutex)
            if (num_blocked)
            {
                --num_blocked
                V(wait_queue)
            }
        V(mutex)
    }
    
    condition_notify_all()
    {
        P(mutex)
            while (num_blocked)
            {
                --num_blocked
                V(wait_queue)
            }
        V(mutex)
    }
    It's doesn't have "fairness" built-in, but the semantics are correct.

    It's just easier to rationalize correctness of these smaller primitives - making it easier to rationalize correctness of objects built from these primitives.

    gg

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 3
    Last Post: 11-14-2010, 11:14 AM
  2. semaphore
    By menzeli in forum Linux Programming
    Replies: 12
    Last Post: 04-15-2007, 09:26 AM
  3. semaphore
    By menzeli in forum C Programming
    Replies: 1
    Last Post: 03-24-2007, 11:34 AM
  4. Mutex vs. Semaphore
    By cjschw in forum C++ Programming
    Replies: 1
    Last Post: 07-22-2003, 07:03 PM
  5. semaphore
    By laasunde in forum C Programming
    Replies: 0
    Last Post: 11-01-2002, 11:03 AM