Thread: Yet another n00b in pthreads ...

  1. #1
    Registered User
    Join Date
    Mar 2008
    Posts
    16

    Yet another n00b in pthreads ...

    Hi,
    I am new to the pthread library, and perhaps my question is naive. Here is the problem I want to solve. I want to create N >= 2 threads. One of them (from now on the "generator") keeps filling a shared queue with "chunks of data", while the remaining N-1 threads (the "workers") take data from the queue whenever there is something in it. For this purpose I decided to protect access to the queue with a mutex and to use a condition variable as well. My idea is that if the workers can't find data in the queue, they wait for a signal, and the generator sends that signal once new data has been added to the queue. Conceptually, here is how I implemented those functions initially (lines are numbered for clarity):
    Code:
    01: generator(){
    02:   while (!created_required_amount_of_data) {
    03:     data <-- create_data ()
    04:     lock (mutex)
    05:     enqueue (data)
    06:     if (there_are_threads_waiting)
    07:       send_condition_signal ()  // <---- NOT broadcast
    08:     unlock (mutex)
    09:   }
    10: }
    I guess the above is fine. Let's switch to the problematic "worker":
    Code:
    01: worker (){
    02:   while (!processed_required_amount_of_data) {
    03:     lock (mutex)
    04:     if (queue_is_empty()) {
    05:       waiting_threads++
    06:       wait_for_signal ()
    07:       waiting_threads--
    08:     }
    09:     data <-- dequeue_data ()
    10:     unlock (mutex)
    11:     process (data)
    12:   }
    13: }
    Unfortunately, the above scenario does NOT always work. It does seem to *always* work if I change line 4 of the "worker" from an 'if' statement to a 'while' statement.

    Now, here comes my first question:
    1) Is it possible to use signal and wake just a *single one* thread from those waiting for a signal? In the documentation of pthread_cond_signal it is stated that *at least one* thread will wake up. But I only want *exactly* one ...

    The change from 'if' to 'while' saves me almost every time at runtime: a thread that wakes up checks again whether the queue is empty and, if another thread was faster, goes back to sleep. But my impression is that this works only because the scheduling that would break it is very unlikely (I hope I make sense). E.g. two threads wake up; the first asks whether the queue is empty and gets the correct answer (non-empty); now the scheduler stops that thread and runs the other one that woke up as well. The second thread performs the same check and also sees a non-empty queue (because the first one hasn't dequeued anything yet), so both threads will very soon attempt to dequeue. If nothing arrives in between, one of the threads will get a NULL pointer, and a beautiful "Segmentation fault" (or "Bus error" on a Mac) will occur.

    Second question:
    2) Now, how on earth am I going to overcome this problem? :-) I want a deterministic solution to the problem and not something magic like above!
    I thought this was the idea of a mutex and a condition variable ...

    Source code and makefile can be obtained here so that you can reproduce the problem on your machines. The critical 'if' / 'while' statements are on lines 205-206 of random_nums.cpp.

    I would be really thankful if you could shed some light on the above. Really, any comment is appreciated because I am a complete newbie at this.
    Thank you in advance
    Last edited by dimis; 03-28-2008 at 04:39 PM. Reason: minor details

  2. #2
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote: Originally Posted by dimis
    1) Is it possible to use signal and wake just a *single one* thread from those waiting for a signal? In the documentation of pthread_cond_signal it is stated that *at least one* thread will wake up. But I only want *exactly* one ...
    That's exactly why you check the condition in a while loop, not an if statement. More than one thread might wake up -- but only one of them will succeed in locking the mutex, and only that thread will proceed. The others will just go back to sleep.
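    For reference, here is a minimal C sketch of that while-loop pattern. Several things in it are assumptions that do not come from the posts above: the queue is a hypothetical fixed-size ring buffer (`queue_t`, `QUEUE_CAP`), a `done` flag is added so the workers can exit, and `run_queue_demo` is only an illustrative driver. The `waiting_threads` counter from the pseudocode is left out, because signaling a condition variable that has no waiters has no effect.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 64            /* hypothetical capacity for this sketch */

typedef struct {
    int items[QUEUE_CAP];
    int head, count;
    int done;                   /* set once the generator has finished */
    long consumed;              /* items handed to workers so far */
    pthread_mutex_t mutex;
    pthread_cond_t not_empty;   /* signaled when an item is added */
    pthread_cond_t not_full;    /* signaled when an item is removed */
} queue_t;

static void *worker(void *arg)
{
    queue_t *q = arg;
    for (;;) {
        pthread_mutex_lock(&q->mutex);
        /* Re-test the predicate after every wakeup: another worker may
           have emptied the queue before this thread got the mutex back. */
        while (q->count == 0 && !q->done)
            pthread_cond_wait(&q->not_empty, &q->mutex);
        if (q->count == 0) {    /* generator finished and queue drained */
            pthread_mutex_unlock(&q->mutex);
            return NULL;
        }
        assert(q->count > 0);   /* we hold the mutex, so this cannot fail */
        int data = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        q->consumed++;
        pthread_cond_signal(&q->not_full);
        pthread_mutex_unlock(&q->mutex);
        (void)data;             /* process(data) would go here */
    }
}

/* The calling thread acts as the generator; returns how many items the
   workers dequeued. */
long run_queue_demo(int n_workers, int n_items)
{
    queue_t q = { .head = 0, .count = 0, .done = 0, .consumed = 0 };
    pthread_t threads[16];
    assert(n_workers > 0 && n_workers <= 16);
    pthread_mutex_init(&q.mutex, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    pthread_cond_init(&q.not_full, NULL);
    for (int i = 0; i < n_workers; i++)
        pthread_create(&threads[i], NULL, worker, &q);
    for (int i = 0; i < n_items; i++) {
        pthread_mutex_lock(&q.mutex);
        while (q.count == QUEUE_CAP)            /* same rule for the generator */
            pthread_cond_wait(&q.not_full, &q.mutex);
        q.items[(q.head + q.count) % QUEUE_CAP] = i;
        q.count++;
        pthread_cond_signal(&q.not_empty);      /* signal, not broadcast */
        pthread_mutex_unlock(&q.mutex);
    }
    pthread_mutex_lock(&q.mutex);
    q.done = 1;
    pthread_cond_broadcast(&q.not_empty);       /* wake every worker to exit */
    pthread_mutex_unlock(&q.mutex);
    for (int i = 0; i < n_workers; i++)
        pthread_join(threads[i], NULL);
    pthread_mutex_destroy(&q.mutex);
    pthread_cond_destroy(&q.not_empty);
    pthread_cond_destroy(&q.not_full);
    return q.consumed;
}
```

    Calling `run_queue_demo(3, 1000)` from a `main` built with `gcc -pthread` should return 1000: every item is dequeued exactly once, whichever worker happens to win the mutex after each signal.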

  3. #3
    Registered User
    Join Date
    Mar 2008
    Posts
    16

    Still ...

    ... I don't get it. I already gave a "fake" example on how things might go wrong. The problem is that once at least 2 threads wake up, then all of them assume that the mutex is locked for them. Am I wrong on that? I hope not. (In my example the workers lock the mutex before the critical inner while loop.) Maybe my English is not good enough and is causing the ambiguity. A quote from the manual:
    The thread(s) that are unblocked shall contend for the mutex according to the scheduling policy (if applicable), and as if each had called pthread_mutex_lock().
    Can someone paraphrase this?
    Thank you very much brewbuck for participating.

  4. #4
    Registered User
    Join Date
    Mar 2008
    Posts
    16
    Now, on a second thought, I can solve all my (n00bish) ambiguities, by assigning a different queue to each "worker" and even use the 'if' statement version of the code.

    Still, I am interested in the single-queue problem as presented above. Perhaps in the future I won't have as much flexibility as this time.

    I would like to see some comments and suggestions from others.

  5. #5
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    >> then all of them assume that the mutex is locked for them
    That would defeat the purpose of a mutex. A mutex makes sure that only one thread can lock it at a time.

  6. #6
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote: Originally Posted by dimis
    ... I don't get it. I already gave a "fake" example on how things might go wrong. The problem is that once at least 2 threads wake up, then all of them assume that the mutex is locked for them. Am I wrong on that?
    Yes, pretty much wrong. The thread wakes up somewhere inside the threading layer's wait function. The first thing this function will do upon waking up is try to regain the mutex. If another thread beats it, it will block here until that thread releases the mutex.

    At that point, the condition which originally caused the threads to wake up might have changed. Which is why your code must ALWAYS check the condition in a loop, not an if-statement, because you might be woken up even if the condition no longer holds.

  7. #7
    Registered User
    Join Date
    Mar 2008
    Posts
    16
    >> That would defeat the purpose of a mutex. A mutex makes sure that only one thread can lock it at a time.
    I am not saying that more than one thread has locked the mutex. I am saying that more than one may assume that it has locked the mutex. Read my quote from the manual.

    >> The first thing this function will do upon waking up is try to regain the mutex. If another thread beats it, it will block here until that thread releases the mutex.
    Wrong. The thread will not block by default if it hasn't acquired the mutex and has just been signaled (read my quote from the manual). A dummy program can convince you on that (even my code).

    >> At that point, the condition which originally caused the threads to wake up might have changed. Which is why your code must ALWAYS check the condition in a loop, not an if-statement, because you might be woken up even if the condition no longer holds.
    You are right about the loop, but my implementation above is wrong (even with the while loop I suggested). The problem in my code above is that both threads might pass the loop - read my scenario above again.
    The only correct implementation of the while loop that I can think of is with the use of pthread_mutex_trylock in a PTHREAD_MUTEX_RECURSIVE type mutex. In other words, the code between lines 4 and 8 on the worker function should take the form:
    Code:
    04:     while (true) {
    05:       result = try_lock (mutex) // Same mutex as before - must be recursive.
    06:       if (result == success) {
    07:         unlock (mutex) // Decrement it to/by one.
    08:         if (queue_not_empty)
    09:           break  // this thread has locked the mutex and can retrieve something from the queue
    10:       }
    11:       waiting_threads++
    12:       wait_for_signal ()
    13:       waiting_threads--
    14:     }
    And this variation is bulletproof. Only one thread can exit the while loop *no matter what* the scheduler decides.
    But the above is a n00bish approach. Is there a better approach to "talk" to a shared data structure?

    Or to re-write the code so that we can avoid scrolling:
    Code:
    01: worker (){
    02:   while (!processed_required_amount_of_data) {
    03:     lock (mutex)
    04:     while (true) {
    05:       result = try_lock (mutex) // Same mutex as before - must be recursive.
    06:       if (result == success) {
    07:         unlock (mutex) // Decrement it to one.
    08:         if (queue_not_empty)
    09:           break  // this thread has locked the mutex and can retrieve something from the queue
    10:       }
    11:       waiting_threads++
    12:       wait_for_signal ()
    13:       waiting_threads--
    14:     }
    15:     data <-- dequeue_data ()
    16:     unlock (mutex)
    17:     process (data)
    18:   }
    19: }
    Last edited by dimis; 03-29-2008 at 06:18 PM. Reason: spelling / minor correction on pseudocode

  8. #8
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    I don't believe that brewbuck has said anything wrong.
    I think there's just some confusion as to what's really going on. The first thing that should be made clear is this:
    When each thread unblocked as a result of a pthread_cond_broadcast() or pthread_cond_signal() returns from its call to pthread_cond_wait() or pthread_cond_timedwait(), the thread shall own the mutex with which it called pthread_cond_wait() or pthread_cond_timedwait().
    So if you return, you own the mutex - which means only one thread at a time will ever return from pthread_cond_wait() on a single mutex. And no one else will return from pthread_cond_wait() until that mutex is unlocked.
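    One way to check this ownership claim is sketched below in C. It relies on an assumption that is not part of the thread: the mutex is created with the `PTHREAD_MUTEX_ERRORCHECK` type, which makes a second `pthread_mutex_lock()` by the owning thread fail with `EDEADLK` instead of deadlocking. The names `probe_t`, `waiter` and `relock_after_wait` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* for PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

typedef struct {
    pthread_mutex_t mutex;    /* error-checking type, set up below */
    pthread_cond_t  cond;
    int waiting;              /* waiter has entered pthread_cond_wait */
    int ready;                /* predicate the waiter sleeps on */
    int relock_result;        /* result of the second lock attempt */
} probe_t;

static void *waiter(void *arg)
{
    probe_t *p = arg;
    pthread_mutex_lock(&p->mutex);
    while (!p->ready) {
        p->waiting = 1;
        pthread_cond_wait(&p->cond, &p->mutex);
    }
    /* Back from the wait. An error-checking mutex refuses a second lock
       by its current owner, so the result shows whether we hold it. */
    p->relock_result = pthread_mutex_lock(&p->mutex);
    int rc = pthread_mutex_unlock(&p->mutex);
    assert(rc == 0);          /* unlocking succeeds only for the owner */
    (void)rc;
    return NULL;
}

/* Returns what the waiter got when it tried to lock the mutex again
   right after pthread_cond_wait() returned. */
int relock_after_wait(void)
{
    probe_t p = { .waiting = 0, .ready = 0, .relock_result = 0 };
    pthread_mutexattr_t attr;
    pthread_t t;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&p.mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    pthread_cond_init(&p.cond, NULL);
    pthread_create(&t, NULL, waiter, &p);

    for (;;) {                /* poll until the waiter is really blocked */
        pthread_mutex_lock(&p.mutex);
        if (p.waiting)
            break;            /* leave the loop with the mutex held */
        pthread_mutex_unlock(&p.mutex);
        sched_yield();
    }
    p.ready = 1;
    pthread_cond_signal(&p.cond);
    pthread_mutex_unlock(&p.mutex);

    pthread_join(t, NULL);
    pthread_mutex_destroy(&p.mutex);
    pthread_cond_destroy(&p.cond);
    return p.relock_result;
}
```

    `relock_after_wait()` should return `EDEADLK`, which is the error-checking mutex reporting that the thread coming out of `pthread_cond_wait()` already owns it.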

    So now on to the reason why you should always loop on your "condition" being signaled - It's actually in the posted documentation:
    On a multi-processor, it may be impossible for an implementation of pthread_cond_signal() to avoid the unblocking of more than one thread blocked on a condition variable. ...
    The effect is that more than one thread can return from its call to pthread_cond_wait() or pthread_cond_timedwait() as a result of one call to pthread_cond_signal(). This effect is called "spurious wakeup". ...
    An added benefit of allowing spurious wakeups is that applications are forced to code a predicate-testing-loop around the condition wait. This also makes the application tolerate superfluous condition broadcasts or signals on the same condition variable that may be coded in some other part of the application. The resulting applications are thus more robust. Therefore, IEEE Std 1003.1-2001 explicitly documents that spurious wakeups may occur.
    Just keep in mind that one cond_signal will get you one or more returns from a cond_wait. If you do get more than one return from a cond_wait, they will never happen at the same time since each return occurs with the mutex locked.

    Changing your original pseudo code "if" to a "while" will make it "correct pseudo code" - correct real code is something else

    >> The problem in my code above is that both threads might pass the loop - read my scenario above again.
    When you change the "if" to a "while", only one thread will ever be unblocked inside that while loop. Don't confuse "spurious wakeups" as two threads running the loop at the same time. You just have to plan on your "condition" being false, even though you've returned from a cond_wait call.

    Now I don't really see this as a noob-approach. It doesn't really need to be any more complicated than this unless absolutely necessary.

    gg

  9. #9
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote: Originally Posted by dimis
    Wrong. The thread will not block by default if it hasn't acquired the mutex and has just been signaled (read my quote from the manual). A dummy program can convince you on that (even my code).
    Again, wrong. No thread will return from pthread_cond_wait() unless the associated mutex is held by that thread. This is basic threading stuff. Telling me I'm wrong when you come in here claiming to be a "n00b" and I try to answer your question is... bizarre.

  10. #10
    Registered User
    Join Date
    Mar 2008
    Posts
    16

    Thank you guys

    I don't know what to say. You are obviously right and I am obviously wrong. Justification soon. If I offended anyone, I apologize; it was not my intention. Up until this post I was making the mistake that Codeplug pointed out, namely confusing spurious wakeups with two or more threads running the loop at the same time. Thank you for reading my example and spotting my mistake.

    Now, let me give you some source code for the next guy who shows up with the same idea I had. The program creates N threads. One of them (the "signaling" thread, which I also create last) sleeps for 1 second so that the rest are certainly initialized. The other threads acquire the mutex and wait for a signal from the signaling thread. After a second, the signaling thread sends the signal, and ONLY one of the other threads prints a message, exactly as all of you have suggested. Moreover, even after that thread releases the mutex again, no other thread wakes up on my machine, no matter how many threads I create (I've tested several values, up to 2000).

    But now I have another question.
    How can we have an example (even based on chance) where at least two threads wake up by a single signal?

    And another thing: Is there a well-known mechanism for handling a "shared data structure" problem like mine, or the approach with the while loop is simple and fine after all?

    Maybe I have some other questions as well, but I have to think a little bit more.
    Thank you all for replying. Oh, and by the way, I really meant that I want someone to paraphrase this:
    The thread(s) that are unblocked shall contend for the mutex according to the scheduling policy (if applicable), and as if each had called pthread_mutex_lock().
    Now the program that I promised above.
    The output with just three threads:
    Code:
    $ gcc lol.c -pthread -o lol
    $ ./lol 
        1: I own the mutex. I will increment those waiting for a condition.
           I will release the mutex now and call pthread_cond_wait ().
        2: I own the mutex. I will increment those waiting for a condition.
           I will release the mutex now and call pthread_cond_wait ().
    
        3: I own the mutex and I am about to signal.
           2 threads waiting for a condition.
    
    -----------------------------------------------------------------------
    
           Signal sent. Now releasing mutex.
    
    -----------------------------------------------------------------------
    
        1: 1 thread is waiting.
           I will release the mutex to check how many woke up.
        1: mutex released.
    
    ^C
    $

    And the source code:
    Code:
    /*
    ** How many threads wake up?
    */
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <pthread.h>
    #include <unistd.h>	/* sleep() */
    
    typedef struct {
    	int id;
    	int * waiting;
    	pthread_mutex_t *	mutex;
    	pthread_cond_t *	condition;
    } DATA;
    
    #define NUM_THREADS 3
    #define SLEEPING_THREADS (NUM_THREADS - 1)
    
    void * signaling_function (void *);
    void * waiting_function (void *);
    
    int main (void)
    {
    	int				i, rc;
    	void *			status;
    	DATA			data_struct [NUM_THREADS];
    	pthread_t		threads [NUM_THREADS];
    	pthread_attr_t	attr;
    	pthread_mutex_t m;
    	pthread_cond_t	cond;
    	int				those_waiting;
    
    	/* Initializations. */
    
    	/* Make threads Joinable for sure. */
    	pthread_attr_init (&attr);
    	pthread_attr_setdetachstate (&attr, PTHREAD_CREATE_JOINABLE);
    	/* mutex and condition variable. */
    	pthread_mutex_init (&m, NULL);
    	pthread_cond_init (&cond, NULL);
    	those_waiting = 0;	/* nobody is waiting for something at the beginning. */
    	/* Pass the ids to the threads. */
    	for (i = 0; i < NUM_THREADS; i++) {
    		data_struct [i].waiting = &those_waiting;
    		data_struct [i].id = i + 1;
    		data_struct [i].mutex = &m;
    		data_struct [i].condition = &cond;
    	}
    
    	/* Create threads. */
    	for (i = 0; i < SLEEPING_THREADS; i++) {
    		rc = pthread_create(&(threads[i]), &attr, waiting_function, (void *) &(data_struct [i]));
    		if (rc) {
    			printf ("ERROR; return code from pthread_create() is %d.\n", rc);
    			exit (2);
    		}
    	}
    	/* Create the signaling function last. */
    	rc = pthread_create(&(threads[NUM_THREADS - 1]), &attr, 
                                           signaling_function, (void *) &(data_struct [NUM_THREADS - 1]));
    	if (rc) {
    		printf ("ERROR; return code from pthread_create() is %d.\n", rc);
    		exit (1);
    	}
    
    	/* Wait for threads to join. */
    	for (i = 0; i < NUM_THREADS; i++)
    		pthread_join ( threads [i], &status );
    	printf ("Threads joined!\n");
    
    	pthread_exit (NULL);
    }
    
    void * signaling_function (void * input)
    {
    	DATA * myData = (DATA *) input;
    
    	sleep (1);	/* Enough time so that others can "hibernate" ... */
    	pthread_mutex_lock (myData->mutex);
    	printf ("\n%5d: I own the mutex and I am about to signal.\n", myData->id);
    	printf ("       %d threads waiting for a condition.\n", *(myData->waiting));
    	printf ("\n-----------------------------------------------------------------------\n\n");
    	pthread_cond_signal (myData->condition);
    	sleep (1);
    	printf ("       Signal sent. Now releasing mutex.\n");
    	printf ("\n-----------------------------------------------------------------------\n\n");
    	pthread_mutex_unlock (myData->mutex);
    	
    	pthread_exit (NULL);
    }
    
    void * waiting_function (void * input)
    {
    	DATA * myData = (DATA *) input;
    
    	pthread_mutex_lock (myData->mutex);
    	printf ("%5d: I own the mutex. I will increment those waiting for a condition.\n", myData->id);
    	(*(myData->waiting))++;
    	printf ("       I will release the mutex now and call pthread_cond_wait ().\n");
    	pthread_cond_wait (myData->condition, myData->mutex);
    	(*(myData->waiting))--;
    	sleep(1);
    	if (*(myData->waiting) != 1)
    		printf ("%5d: %d threads are waiting.\n", myData->id, *(myData->waiting));
    	else
    		printf ("%5d: %d thread is waiting.\n", myData->id, *(myData->waiting));
    
    	/* You can comment the following 3 lines ... */
    	printf ("       I will release the mutex to check how many woke up.\n");
    	pthread_mutex_unlock (myData->mutex);
    	printf ("%5d: mutex released.\n\n", myData->id);
    
    	pthread_exit (NULL);
    }
    Last edited by dimis; 03-30-2008 at 11:31 AM. Reason: to fold code so that it isn't too wide

  11. #11
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    >> How can we have an example (even based on chance) where at least two threads wake up by a single signal?
    There may not be any way to *cause* a spurious wakeup to occur. You can detect when one occurs:
    Code:
       while (queue_is_empty()) {
          waiting_threads++
          wait_for_signal ()
          waiting_threads--
          if (queue_is_empty())
             log_spurious_wakeup();
       }
    >> ... or the approach with the while loop is simple and fine after all?
    It's a classic text-book example of using condition variables. There are other solutions, but there's no reason not to use condition variables.

    >> The thread(s) that are unblocked shall contend for the mutex according to the scheduling policy (if applicable), and as if each had called pthread_mutex_lock().
    All this is saying is, *if* more than one thread wakes up due to a signal or broadcast, then they will behave *as if* they all called mutex_lock() at the same time - probably because that's exactly what they are doing.
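    A spurious wakeup cannot be forced, but the same situation (several threads woken for a single item) can be produced on purpose by swapping `pthread_cond_signal()` for `pthread_cond_broadcast()`. The C sketch below does that and counts the wakeups that find nothing to take. It is only an illustration under assumptions of its own: `shared_t`, `N_WAITERS`, `lock_when` and `count_empty_wakeups` are hypothetical names, and an integer counter stands in for the queue.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define N_WAITERS 4     /* hypothetical thread count for this sketch */

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int items;          /* stand-in for the queue length */
    int done;           /* tells the waiters to exit */
    int waiting;        /* threads currently inside pthread_cond_wait */
    int wakeups;        /* returns from pthread_cond_wait */
    int empty_wakeups;  /* returns that found nothing to dequeue */
    int taken;          /* items actually dequeued */
} shared_t;

static void *waiter(void *arg)
{
    shared_t *s = arg;
    pthread_mutex_lock(&s->mutex);
    for (;;) {
        while (s->items == 0 && !s->done) {
            s->waiting++;
            pthread_cond_wait(&s->cond, &s->mutex);
            s->waiting--;
            s->wakeups++;
            if (s->items == 0 && !s->done)
                s->empty_wakeups++;     /* awake, but another thread won */
        }
        if (s->items == 0)
            break;                      /* done, nothing left to take */
        s->items--;
        s->taken++;
    }
    pthread_mutex_unlock(&s->mutex);
    return NULL;
}

/* Poll until *field >= target; returns with the mutex held. */
static void lock_when(shared_t *s, const int *field, int target)
{
    for (;;) {
        pthread_mutex_lock(&s->mutex);
        if (*field >= target)
            return;
        pthread_mutex_unlock(&s->mutex);
        sched_yield();
    }
}

int count_empty_wakeups(void)
{
    shared_t s = { .items = 0 };
    pthread_t t[N_WAITERS];
    pthread_mutex_init(&s.mutex, NULL);
    pthread_cond_init(&s.cond, NULL);
    for (int i = 0; i < N_WAITERS; i++)
        pthread_create(&t[i], NULL, waiter, &s);

    lock_when(&s, &s.waiting, N_WAITERS);   /* every waiter is blocked */
    s.wakeups = s.empty_wakeups = 0;
    s.items = 1;
    pthread_cond_broadcast(&s.cond);        /* one item, N_WAITERS wakeups */
    pthread_mutex_unlock(&s.mutex);

    lock_when(&s, &s.wakeups, N_WAITERS);   /* N_WAITERS returns observed */
    s.done = 1;
    pthread_cond_broadcast(&s.cond);        /* let everyone exit */
    pthread_mutex_unlock(&s.mutex);

    for (int i = 0; i < N_WAITERS; i++)
        pthread_join(t[i], NULL);
    assert(s.taken == 1);                   /* the item was dequeued once */
    pthread_mutex_destroy(&s.mutex);
    pthread_cond_destroy(&s.cond);
    return s.empty_wakeups;
}
```

    On a typical run `count_empty_wakeups()` should return `N_WAITERS - 1`: all four threads come back from `pthread_cond_wait()` one after another, the first takes the item, and the other three find the predicate false and go back to waiting. A genuine spurious wakeup could only push the count higher.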

    gg

  12. #12
    Registered User
    Join Date
    Mar 2008
    Posts
    16

    win32 Pthreads

    Ok, a question with a different flavor.
    I have to port to Windows a simple program that uses pthreads. For this purpose, I think this is ideal. I followed the instructions found in the FAQ (mainly question 8), but I have a problem in the linking step. I am using Visual C++ 2005 Express Edition (the 2008 edition is now available). I downloaded pthreads-w32-2-8-0-release.tar.gz (found near the end here). I placed the prebuilt pthreadVC2.dll under my C:\WINDOWS directory, and the three header files under C:\Program Files\Microsoft Visual Studio 8\VC\include, which is the only directory in my INCLUDE variable. Similarly, I placed the pthreadVC2.lib file under the directory C:\Program Files\Microsoft Visual Studio 8\VC\lib. Moreover, under
    Code:
    Tools->Options ...->Projects and Solutions->VC++ Directories
    I also inserted the directory where the header files and the .lib file could be found in the unzipped directory of what I initially downloaded. Yet, although this thing compiles, it doesn't link:
    Code:
    1>------ Build started: Project: MCthreads, Configuration: Debug Win32 ------
    1>Compiling...
    1>math_toolbox.cpp
    1>MonteCarlo.cpp
    1>queue.cpp
    1>Generating Code...
    1>Compiling manifest to resources...
    1>Linking...
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_join referenced in function _main
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_mutex_unlock referenced in function _main
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_cond_signal referenced in function _main
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_mutex_lock referenced in function _main
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_create referenced in function _main
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_cond_init referenced in function _main
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_mutex_init referenced in function _main
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_attr_setdetachstate referenced in function _main
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_attr_init referenced in function _main
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_exit referenced in function "void * __cdecl worker_function(void *)" (?worker_function@@YAPAXPAX@Z)
    1>mc.obj : error LNK2019: unresolved external symbol __imp__pthread_cond_wait referenced in function "void * __cdecl worker_function(void *)" (?worker_function@@YAPAXPAX@Z)
    1>Debug\MCthreads.exe : fatal error LNK1120: 11 unresolved externals
    1>Build log was saved at "file://c:\Documents and Settings\dimis\Desktop\MCthreads\Debug\BuildLog.htm"
    1>MCthreads - 12 error(s), 0 warning(s)
    ========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
    Any suggestions on what might have gone wrong?

  13. #13
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    you need to tell VC to link to the library, just having the files is not enough.

    I am not sure how you would do that on VC, but on gcc it's
    Code:
    gcc ... -lpthread
    or
    Code:
    gcc ... -pthread
    on POSIX.

  14. #14
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    Add "pthreadVC2.lib" in the linker project settings. Or add the following to your "main.cpp"
    Code:
    #ifdef _MSC_VER
    #   pragma comment(lib, "pthreadVC2.lib")
    #endif
    gg

  15. #15
    Registered User
    Join Date
    Mar 2008
    Posts
    16
    Thanks. The #pragma directive was a lifesaver. I don't know what I was doing wrong in the linker project settings, but this certainly works. Thanks again.
