Thread: Semaphores Problems

  1. #1
    Registered User
    Join Date
    Oct 2006
    Posts
    7

    Semaphores Problems

    Hello

    I am using simple synchronization between three threads: one computes, the second sends dependencies to other processors, and the third receives dependencies from other processors. I am using MPI to send/receive between processors, and within each process these three threads handle the computation of the part assigned to that process.

    The problem is in the thread synchronization. I use semaphores: the sending thread waits until there is a dependency that needs to be sent, and the receiving thread waits until the computation requires a dependency and MPI needs to be checked to receive it. I have three semaphores: one between computation and sending, one between computation and receiving, and one between receiving and computation to notify that something was received and should be checked.

    I initialize the three semaphores as :
    Code:
      
    if (sem_init(&icSem, 0, 0) != 0) {
        printf("Error initializing semaphore icSem, exiting\n");
        getSemError();
        return;
    }
    and post like that:

    Code:
     
    if (sem_post(&rcSem) != 0) {
        printf("Error posting semaphore rcSem, exiting\n");
        getSemError();
        return NULL;
    }
    and wait like this:
    Code:
    if (sem_wait(&dsSem) != 0) {
        printf("Error waiting on semaphore dsSem, exiting\n");
        getSemError();
        return NULL;
    }
    In getSemError() I read the error and print the corresponding message from the man pages.

    Then I get this at run time:
    "The call was interrupted by a signal handler." This happens when the computation thread waits for the receiving thread to receive a dependency that I need in order to resume computing. The same problem happens when the sending thread waits on another semaphore: the wait fails with the same error, on practically all sem_wait calls.

    When I read the value of the semaphore, it sometimes exceeds 1; I see values like 2 and 3 as well, even though for every sem_post there is a previously called sem_wait that should consume the post as soon as it arrives.

    I used semaphores because I thought semaphore signals remain in memory, so the order of calls doesn't affect execution. I previously tried pthread condition variables (pthread_cond_wait / pthread_cond_signal) and it didn't work; I learned that if I signal and nothing is waiting, the signal disappears. Semaphores solved this problem, as I can wait either before or after the post, and the wait blocks only if the semaphore value is zero; otherwise (i.e. a post was already received) it decrements the value and returns.
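
    The behaviour described above (a post made before anyone waits is remembered in the count) can be checked in isolation. This is only a minimal sketch on a single thread; post_then_wait is a hypothetical helper name, not part of the original program:

```c
#include <assert.h>
#include <semaphore.h>

/* Hypothetical demo helper (not from the original program).
 * Posts once while nobody is waiting, then waits.
 * Returns the semaphore value observed right after the post, or -1 on error. */
int post_then_wait(void)
{
    sem_t sem;
    int after_post = -1;
    int after_wait = -1;

    /* pshared = 0: the semaphore is shared only between threads of this process. */
    if (sem_init(&sem, 0, 0) != 0)
        return -1;

    if (sem_post(&sem) != 0 ||                  /* no waiter: count goes 0 -> 1 and stays */
        sem_getvalue(&sem, &after_post) != 0 ||
        sem_wait(&sem) != 0 ||                  /* count is 1, so this returns at once: 1 -> 0 */
        sem_getvalue(&sem, &after_wait) != 0) {
        sem_destroy(&sem);
        return -1;
    }

    sem_destroy(&sem);
    assert(after_wait == 0);    /* the single post was consumed by the single wait */
    return after_post;
}
```

    Calling post_then_wait() should return 1 when built with -pthread on Linux. A value above 1 in the real program would then mean that more posts than successful waits have completed at the moment the value is read.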

    Now this understanding doesn't seem to work with semaphores either.

    I appreciate any help with this issue, as it has taken so much of my time, and I am afraid I might be heading in completely the wrong direction.

    I would also appreciate a redirection to another forum (preferably one run by the developers of the semaphore library for ANSI C on Linux and/or the pthreads library) or to other concurrency gurus.

    Thank you very much,
    Manal

  2. #2
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by mhelal View Post
    I used semaphores because I thought semaphore signals remain in memory, so the order of calls doesn't affect execution. I previously tried pthread condition variables (pthread_cond_wait / pthread_cond_signal) and it didn't work; I learned that if I signal and nothing is waiting, the signal disappears. Semaphores solved this problem
    The semaphores solve the problem because they maintain a piece of state, namely the semaphore count. The problem is you weren't using condition variables correctly. There's a reason they are called condition VARIABLES. If you try to use them as a signal, which is basically what you tried to do, then as you see, the signal vanishes if nobody is currently blocked waiting for it. What you need to do is pair the condvar with a real variable, and when you signal a state change through the condvar you also set the value of this variable.

    That way if some other thread comes along later, it locks the mutex of the condvar and checks the state variable, and sees that an event has occurred.
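
    A rough sketch of that pairing, assuming a single boolean flag is enough state. The names event_t, EVENT_INITIALIZER, event_signal and event_wait are made up for illustration and are not a pthreads API:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical type: a condvar paired with the "real variable" it protects. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;   /* the state that survives even if nobody is waiting */
} event_t;

#define EVENT_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Record the state change first, then wake a waiter if there is one. */
void event_signal(event_t *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->ready = true;
    pthread_cond_signal(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

/* Block until the flag is set, then consume it. */
void event_wait(event_t *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->ready)                      /* loop also covers spurious wakeups */
        pthread_cond_wait(&ev->cond, &ev->lock);
    assert(ev->ready);                      /* predicate holds while we own the lock */
    ev->ready = false;
    pthread_mutex_unlock(&ev->lock);
}
```

    With this shape, calling event_signal before event_wait is harmless: the waiter finds ready already set and never blocks. A flag only remembers one event, so if several signals can pile up before a wait, a counter would be needed instead, which is essentially a semaphore.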

    As for the semaphore calls ending early because of signals, you just have to check for that whenever you call a blocking semaphore function such as sem_wait. If it returned -1 with errno set to EINTR, it was interrupted by a signal, so just try again.

    Does any of that help?

  3. #3
    Registered User
    Join Date
    Oct 2006
    Posts
    7
    Hi

    Thank you very much for helping,

    I think waiting on a condition is then not what I need, and I had better stay with semaphores and fix the problem.

    I tried to loop on sem_wait until it succeeds, as follows:

    Code:
    while (sem_wait(&rcSem) != 0) {
        printf("Error waiting on semaphore rcSem error %d, exiting\n", errno);
        getSemError();
        return NULL;
    }
    and it actually didn't work,

    What I suspect is this: because I have three threads and am currently testing with 3 MPI processes simulated on the same machine, when I post the semaphore in one process it increments the value, and when I post the same semaphore in another process it increments the value again and interrupts the sem_wait in another thread in another process.

    My understanding is that a process has an isolated memory space and the state of its semaphores is kept within it, but I think simulating three processes on the same machine violates this, so that all processes update the same semaphore, which would disturb the design completely.

    I am testing that on a 2.6.20-1.2944.fc6 #1 SMP Tue Apr 10 17:27:49 EDT 2007 i686 i686 i386 GNU/Linux

    and the gcc version is: gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-51)
    the MPICH library details are:
    Version: 1.0.4-rc1
    Device: ch3:sock
    Configure Options: '-prefix=/home/mhelal/mpich2-install' '--enable-sharedlibs=gcc' '--enable-mpe'


    I just tried running the same program on a real HPC machine rather than simulated on a single machine, to test the possibility that a semaphore retains its state across processes and not only across threads within the same process. I got the same problem: "The call was interrupted by a signal handler" on sem_wait, and sem_post increases the value to over 1.

    The HPC machine is:
    2.6.5-7.199-sn2 #1 SMP Thu Aug 18 09:17:57 UTC 2005 ia64 ia64 ia64 GNU/Linux

    gcc (GCC) 3.3.3 (SuSE Linux)

    but I actually use icc 8.1 here

    and most probably the MPI is mpiBLAST 1.4.0

    I appreciate your help a lot,

    Kind Regards,

    Manal
