Reaping zombies with sigaction() [Archive] - C Board

heras
03-12-2008, 06:41 AM
Hi,
In an example from Beej's guide to network programming (http://beej.us/guide/bgnet/output/html/multipage/clientserver.html#simpleserver) there's the following piece of server code:
void sigchld_handler(int s)
{
    while(waitpid(-1, NULL, WNOHANG) > 0);
}

...

sa.sa_handler = sigchld_handler; // reap all dead processes
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;
if (sigaction(SIGCHLD, &sa, NULL) == -1) {
    perror("sigaction");
    exit(1);
}

while(1) { // main accept() loop
    sin_size = sizeof their_addr;
    if ((new_fd = accept(sockfd, (struct sockaddr *)&their_addr,
                         &sin_size)) == -1) {
        perror("accept");
        continue;
    }
    printf("server: got connection from %s\n",
           inet_ntoa(their_addr.sin_addr));
    if (!fork()) { // this is the child process
        close(sockfd); // child doesn't need the listener
        if (send(new_fd, "Hello, world!\n", 14, 0) == -1)
            perror("send");
        close(new_fd);
        exit(0);
    }
    close(new_fd); // parent doesn't need this
}
I think I kind of understand how the "reap all dead processes" block itself works, but how is it ever triggered from within the while(1) loop where the work is being done? I can understand how any zombie processes already present may get reaped before entering the while(1) loop, but not after.

Also: if sigaction(SIGCHLD, &sa, NULL) fails, the code reports the error and exits the server altogether. Isn't that a bit extreme, or is such a thing only supposed to happen to misconfigured services?
Thanks,
heras

Edit: I think this may be Linux-specific, so I put it here.

brewbuck
03-12-2008, 10:15 AM
I think I kind of understand how the "reap all dead processes" block itself works, but how is it ever triggered from within the while(1) loop where the work is being done? I can understand how any zombie processes already present may get reaped before entering the while(1) loop, but not after.

I'm not sure I entirely understand your question. When a child process dies, its parent receives a SIGCHLD signal, which invokes the signal handler asynchronously. Then it reaps as many processes as possible (probably only the single one that died, although more could die at the same instant, thus the loop). Then it returns to the main program. When another child dies, you get another SIGCHLD and it all happens again.
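To make the asynchronous reaping concrete, here is a minimal sketch. It is not the code from Beej's guide: spawn_and_reap() and the reaped counter are hypothetical names invented for this illustration, and saving and restoring errno in the handler is an extra precaution that the quoted code does not take. The parent never calls waitpid() in its main flow; every child is collected from inside the SIGCHLD handler.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations used below */
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Count of children reaped so far; written only by the signal handler. */
static volatile sig_atomic_t reaped = 0;

static void sigchld_handler(int s)
{
    (void)s;
    int saved_errno = errno;            /* waitpid() may overwrite errno */
    /* One SIGCHLD can stand for several dead children, hence the loop. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Hypothetical helper: fork n children that exit immediately and return how
 * many the handler reaped, or -1 if setup or fork() failed. */
int spawn_and_reap(int n)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    reaped = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0)
            _exit(0);                   /* child: die right away */
    }

    /* Parent: idle in short sleeps (bounded at roughly five seconds) while
     * the kernel delivers SIGCHLD and the handler does the reaping. */
    struct timespec one_ms = { 0, 1000000L };
    for (int tries = 0; reaped < n && tries < 5000; tries++)
        nanosleep(&one_ms, NULL);

    return (int)reaped;
}
```

Calling spawn_and_reap(5) from main() should return 5 once all children have been collected. Nothing in the parent's loop asks for the handler to run; the kernel invokes it whenever a child dies.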

Also: if sigaction(SIGCHLD, &sa, NULL) fails, the code reports the error and exits the server altogether. Isn't that a bit extreme, or is such a thing only supposed to happen to misconfigured services?

It is not extreme at all. What would be extreme would be continuing to run without the reaper handler, because the parent would slowly populate the system with zombie processes. Also, if a call to sigaction() fails, things are so seriously wrong that continuing would be pointless. sigaction() should not fail.

matsp
03-12-2008, 10:20 AM
It is not extreme at all. What would be extreme would be continuing to run without the reaper handler, because the parent would slowly populate the system with zombie processes. Also, if a call to sigaction() fails, things are so seriously wrong that continuing would be pointless. sigaction() should not fail.

sigaction() should not fail under normal circumstances, but it can, otherwise it would not return an error code. It is always a good idea to check for errors and do something "sensible" rather than just blindly hope that the call worked. (Blindly hoping is called the "Ostrich method" of error detection: "Stick your head in the sand and hope the problem goes away".)

--
Mats
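
One case where sigaction() does fail is easy to reproduce: asking it to change the action for SIGKILL, which cannot be caught or ignored, yields EINVAL. The sketch below shows this; try_install() and noop_handler() are hypothetical names made up for the example and do not appear in the original server code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations used below */
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>

static void noop_handler(int s)
{
    (void)s;                            /* deliberately does nothing */
}

/* Hypothetical helper: try to install noop_handler for signo.
 * Returns 0 on success, or the errno value reported by sigaction(). */
int try_install(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(signo, &sa, NULL) == -1)
        return errno;                   /* checked, not ignored */
    return 0;
}
```

try_install(SIGUSR1) should return 0, while try_install(SIGKILL) should return EINVAL. A real server would report the error with perror() and exit, as the code under discussion does.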

brewbuck
03-12-2008, 10:25 AM
sigaction() should not fail under normal circumstances, but it can, otherwise it would not return an error code. It is always a good idea to check for errors and do something "sensible" rather than just blindly hope that the call worked. (Blindly hoping is called the "Ostrich method" of error detection: "Stick your head in the sand and hope the problem goes away".)


I absolutely agree. What I was trying to say is that calling exit() after a failed sigaction() is the only sensible thing. sigaction() can theoretically fail, but it shouldn't. If it does, it is catastrophic and you should terminate.

EDIT: I'm reading the man page for sigaction() and it says the call can fail with EINTR. I would not have imagined that, but if true, the call to sigaction() should be wrapped in a loop that retries until it no longer gets EINTR. All system calls that can return EINTR should be wrapped in such loops.
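
A retry loop of the kind described here could look like the sketch below. install_with_retry() is a hypothetical wrapper written for this illustration. Whether sigaction() can really report EINTR depends on the platform and the man page version, so treat the loop as a defensive pattern rather than a documented requirement. Any other error is passed straight back to the caller.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations used below */
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>

/* Hypothetical wrapper: install handler for signo, retrying only while the
 * call fails with EINTR. Returns 0 on success, or -1 with errno set. */
int install_with_retry(int signo, void (*handler)(int))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;

    int rc;
    do {
        rc = sigaction(signo, &sa, NULL);
    } while (rc == -1 && errno == EINTR);   /* interrupted: try again */

    return rc;                              /* other errors are not retried */
}
```

The same do/while shape applies to calls such as accept() or read() when SA_RESTART is not in effect for the interrupting signal.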

heras
03-12-2008, 01:05 PM
When a child process dies, its parent receives a SIGCHLD signal, which invokes the signal handler asynchronously.
Aha, I didn't realize this 'trigger' I was looking for is inherent to a dying child process and not explicitly in the code as such!

It is not extreme at all. What would be extreme would be continuing to run without the reaper handler, because the parent would slowly populate the system with zombie processes. Also, if a call to sigaction() fails, things are so seriously wrong that continuing would be pointless. sigaction() should not fail.
sigaction() should not fail under normal circumstances, but it can, otherwise it would not return an error code. It is always a good idea to check for errors and do something "sensible" rather than just blindly hope that the call worked. (Blindly hoping is called the "Ostrich method" of error detection: "Stick your head in the sand and hope the problem goes away".)
Well, that actually makes a lot of sense. I guess what users might sometimes call a "crash" a programmer might call a feature :)

Thank you both for your answers.

P.S.: I will study the sigaction() and related man pages more closely over the next few days.