I have a program that uses fork to generate a number of child processes. Without going into too much detail it generates 19 child processes that go off and do their thing and exit when they have finished.
There is a wait statement (below), but it does not seem to be working as the original programmer intended: it leaves behind a number of zombies, and the sysadmins are complaining. (It's not really a problem, though, because the zombies all die when the parent process terminates, so it doesn't actually fill the process table, and they have never had to reboot the box because of it in the 12 years that the program has been running.)
The wait line is:
Now I have been reading up on this, and my understanding of the waitid statement is:
The P_ALL says to wait for a change in state of any of the children spawned by the parent. My understanding of WNOHANG is that it tells waitid not to block: the call returns immediately even if no child has changed state yet.
Unfortunately most of the examples I have found use only one parent and child so the wait is fairly simple for that.
I think the programmer assumed that each of the child processes will have terminated by this stage, and this is where I think the problem is. It only looks for a change in state of one of its child processes, not for all of them, and I assume because it does not test each child process for termination, they sit around waiting for their turn to pass their return code which never happens until the program terminates.
So would one solution be to put the waitid into a loop built on an array of child process IDs and test each ID in turn?
How about rewriting the fork bit to use threads? Would that be a cleaner solution?