Hello
I am using simple synchronization between three threads in each process: one computes, the second sends dependencies to other processors, and the third receives dependencies from other processors. I use MPI to send and receive between processors, and within each process these three threads handle the inner computation of the part assigned to that process.
The problem is in the thread synchronization. I use semaphores: the sending thread waits until there is a dependency that needs to be sent, and the receiving thread waits until the computation requires a dependency and MPI has to be checked to receive it. I have three semaphores: one between computation and sending, one between computation and receiving, and one between receiving and computation, to notify the computation thread that something was received and should be checked.
I initialize each of the three semaphores like this:
Code:
/* pshared = 0: shared between the threads of this process; initial value 0 */
if (sem_init(&icSem, 0, 0) != 0) {
    printf("Error initializing semaphore icSem, exiting\n");
    getSemError();
    return;
}
I post like this:
Code:
if (sem_post(&rcSem) != 0) {
    printf("Error posting semaphore rcSem, exiting\n");
    getSemError();
    return NULL;
}
and I wait like this:
Code:
if (sem_wait(&dsSem) != 0) {
    printf("Error waiting on semaphore dsSem, exiting\n");
    getSemError();
    return NULL;
}
In getSemError() I read the error and print the corresponding message from the man pages.
At run time I then get this:
The call was interrupted by a signal handler.
This happens when the computation thread waits for the receiving thread to receive a dependency that it needs in order to resume computing. The same problem happens when the sending thread waits on another semaphore: the wait fails with the same error, on practically all sem_wait calls.
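As far as I can tell from the man page, this message corresponds to errno being EINTR, and the usual suggestion is to simply call sem_wait again in that case. Below is a minimal sketch of that idea, assuming the interruption is harmless and retrying is acceptable in my threads. The helper name sem_wait_retry is hypothetical, it is not part of my program or of the semaphore library.

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper (not in my original code): wait on a semaphore and
   retry whenever the call is interrupted by a signal handler (EINTR).
   Returns 0 on success, -1 on any other error (errno is left set). */
static int sem_wait_retry(sem_t *sem)
{
    int rc;

    do {
        rc = sem_wait(sem);
    } while (rc == -1 && errno == EINTR);

    return rc;
}
```

I would then call sem_wait_retry(&dsSem) in place of sem_wait(&dsSem) and treat only the remaining errors as fatal.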
Also, when I read the value of the semaphore, it sometimes exceeds 1. I have seen values like 2 and 3, even though for every sem_post there should already be a previously called sem_wait that consumes the post as soon as it arrives.
I used semaphores because I thought a semaphore post remains in memory, so the order of the calls does not affect execution. I previously tried pthread condition variables (pthread_cond_wait / pthread_cond_signal) and that didn't work: I learned that if I signal while nothing is waiting, the signal is lost. Semaphores seemed to solve this problem, since I can wait either before or after the post, and the wait blocks only if the semaphore value is zero; otherwise (i.e. a post was already received) it decrements the value and returns.
Now this understanding doesn't seem to work with semaphores either.
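To make that understanding concrete, here is a small single-threaded sketch of the behavior I expect from a semaphore: posts made before any wait are remembered and counted. The function name value_after_posts_then_wait is hypothetical and only for illustration, it is not from my program.

```c
#include <semaphore.h>

/* Hypothetical single-threaded check (not from my program): post `posts`
   times before any wait, then wait once. Returns the semaphore value left
   afterwards, or -1 on error. Call it with posts >= 1, otherwise the
   sem_wait would block forever. */
static int value_after_posts_then_wait(int posts)
{
    sem_t sem;
    int value = -1;
    int i;

    if (sem_init(&sem, 0, 0) != 0)
        return -1;

    for (i = 0; i < posts; i++) {
        if (sem_post(&sem) != 0) {      /* each post increments the value */
            sem_destroy(&sem);
            return -1;
        }
    }

    /* The value is nonzero here, so this wait decrements and returns
       without blocking, even though the posts happened first. */
    if (sem_wait(&sem) != 0 || sem_getvalue(&sem, &value) != 0)
        value = -1;

    sem_destroy(&sem);
    return value;
}
```

With one post I expect 0 back, and with three posts I expect 2, which is the kind of value above 1 that I would see whenever posts run ahead of waits.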
I would appreciate any help with this issue, as it has taken a lot of my time, and I am afraid I might be going in a completely wrong direction.
I would also appreciate a redirection to another forum (preferably one run by the developers of the semaphore library for ANSI C on Linux and/or the pthreads library) or to other concurrency gurus.
Thank you very much,
Manal