When a thread or process attempts to lock the mutex, it first acquires the spinlock. Then it checks the lock count -- if the mutex is unlocked, it increments the count, then unlocks the spinlock. On the other hand, if the mutex was already locked, the process puts itself on the wait queue, increments the wait count, then unlocks the spinlock while simultaneously going to sleep. On Linux, that last step is achieved with a futex.
Code:
volatile sig_atomic_t spinlock;
When the process holding the mutex unlocks it, it first locks the spinlock, then decrements the lock_count. If the lock_count becomes zero, it checks the wait_count (while still holding the spinlock). If that is greater than zero, it dequeues all the processes on the wait queue, wakes them up, and sets the wait_count to zero. Either way, it then unlocks the spinlock.
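To make the lock/unlock steps above a bit more concrete, here's a rough Linux-only sketch. It is not how glibc actually implements pthread mutexes, just one way to arrange the same steps. Assumptions on my part: the names (sketch_mutex, sketch_lock, wake_seq, ...) are made up, a C11 atomic_flag stands in for the sig_atomic_t spinlock, the kernel's futex queue plays the role of the wait queue, and unlock does a single decrement of lock_count. The wake_seq word is what lets the waiter "unlock the spinlock and sleep" without missing a wake-up.

```c
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical layout; for cross-process use it would live in shared memory. */
typedef struct {
    atomic_flag spin;     /* spinlock guarding the two counts below */
    int lock_count;       /* 0 = unlocked, 1 = locked */
    int wait_count;       /* sleepers registered since the last wake-up */
    atomic_int wake_seq;  /* futex word, bumped on every wake-up */
} sketch_mutex;

#define SKETCH_MUTEX_INIT { ATOMIC_FLAG_INIT, 0, 0, 0 }

static void spin_acquire(sketch_mutex *m)
{
    while (atomic_flag_test_and_set_explicit(&m->spin, memory_order_acquire))
        ; /* busy-wait; the spinlock is only ever held briefly */
}

static void spin_release(sketch_mutex *m)
{
    atomic_flag_clear_explicit(&m->spin, memory_order_release);
}

void sketch_lock(sketch_mutex *m)
{
    for (;;) {
        spin_acquire(m);
        if (m->lock_count == 0) {       /* unlocked: take it and return */
            m->lock_count++;
            spin_release(m);
            return;
        }
        m->wait_count++;                /* locked: register as a waiter */
        int seq = atomic_load(&m->wake_seq);
        spin_release(m);
        /* FUTEX_WAIT sleeps only if wake_seq still equals seq, so an unlock
         * that slipped in after spin_release() is not missed. On any return
         * (woken, value changed, interrupted) we loop and retry. */
        syscall(SYS_futex, &m->wake_seq, FUTEX_WAIT, seq, NULL, NULL, 0);
    }
}

void sketch_unlock(sketch_mutex *m)
{
    spin_acquire(m);
    m->lock_count--;
    if (m->lock_count == 0 && m->wait_count > 0) {
        m->wait_count = 0;
        atomic_fetch_add(&m->wake_seq, 1);
        /* Wake every sleeper; they all race to retake the mutex. */
        syscall(SYS_futex, &m->wake_seq, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
    }
    spin_release(m);
}
```

Waking everybody is simple but causes a stampede when there are many waiters; real implementations usually wake just one.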
Being able to do this across processes depends on having a method of placing other processes on the wait queue, and telling them to wake up. On Linux, this is done with futexes. Not all implementations of pthreads support inter-process mutexes, but Linux does.
>> Does anyone foresee any problem with this...
Not sure what "this" is at this point :)
You can continue to use signals, just replace the signal handler with "g_time_to_die = 1". Then the controlling loop in dispatcher() would be "while (!g_time_to_die)", or something like that.
If you just call _exit() in your signal handler, that's not a "clean" way to die (no atexit handlers are called, resources aren't manually released, etc...). A clean termination would be ideal.
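If you want to see the difference for yourself, here's a small sketch (all names are invented for the example). The child registers an atexit handler that writes one byte to a pipe, and the parent checks whether that byte ever arrives.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int g_pipe_wr = -1;

/* atexit handler: tells the parent that cleanup ran. */
static void cleanup(void)
{
    if (g_pipe_wr >= 0) {
        ssize_t r = write(g_pipe_wr, "c", 1);
        (void)r;
    }
}

/* Forks a child that terminates with exit() (use_plain_exit != 0) or
 * _exit(). Returns 1 if the child's atexit handler ran, 0 if it did not,
 * -1 on error. */
int child_ran_cleanup(int use_plain_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child */
        close(fds[0]);
        g_pipe_wr = fds[1];
        if (atexit(cleanup) != 0)
            _exit(2);
        if (use_plain_exit)
            exit(0);                /* runs atexit handlers first */
        _exit(0);                   /* skips them entirely */
    }

    close(fds[1]);                  /* parent */
    char c;
    ssize_t n = read(fds[0], &c, 1);    /* 0 = EOF, nothing was written */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : 0;
}
```

child_ran_cleanup(1) should report that the handler ran, and child_ran_cleanup(0) that it did not.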
Yeah... sorry about the vagueness of "this". I was referring to just having the signal handler make the child call _exit when it receives the signal and then resetting the shared counter back to 0.
I know it would be better if they exited normally, but my dispatcher isn't in a loop. Each request that comes in is its own process and then exits... So the dispatcher right now just takes the request (as the child), determines what kind of request it is, calls the correct function to handle the request, and that's about it. So short of checking g_time_to_die at various points in all the functions called after my dispatcher, I'm not seeing another way to truly interrupt the children and have them exit cleanly.
I was also unaware of the atexit function (thanks, that will come in useful later on), and so am not calling any other functions on normal exit. I did notice that _exit closes open file descriptors etc., which should be enough for what the children are doing.
As of now I've been able to successfully load the crap out of my cobbled-together code with no crashes/problems at all (knock on wood). I may still use a semaphore to wait for the number of running processes to drop below the max so there is an open slot, and if it waits too long, send the kill on the assumption that all of the currently running children are hosed. But that's for later.
Thank you both for all of the great information and insight into what was happening. I don't think I ever would have thought it was a problem with my signal handler. Thanks again.