Signal and exception handling

**nts** · 11-15-2007

Originally Posted by brewbuck

Nothing should be calling it. I don't know of any C library routines which call abort(). If you're not calling it, nobody is calling it.

assert() does. But then assert could also easily be replaced by myassert(), which in my code would unfortunately only solve a small part of the problem.
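A minimal sketch of the myassert() idea, assuming the failing check lives in code that can be recompiled. The names MY_ASSERT and checked_divide are hypothetical and only serve as illustration. Instead of calling abort(), which raises SIGABRT, a failed check is reported and turned into an error code for the caller:

```c
#include <stdio.h>

/* Hypothetical replacement for assert(): report the failed condition and
 * return an error code from the enclosing function instead of abort(). */
#define MY_ASSERT(cond)                                          \
    do {                                                         \
        if (!(cond)) {                                           \
            fprintf(stderr, "%s:%d: check failed: %s\n",         \
                    __FILE__, __LINE__, #cond);                  \
            return -1;                                           \
        }                                                        \
    } while (0)

/* Illustrative function: stores a / b in *out and returns 0,
 * or returns -1 (leaving *out untouched) when a check fails. */
int checked_divide(int a, int b, int *out)
{
    MY_ASSERT(out != NULL);
    MY_ASSERT(b != 0);
    *out = a / b;
    return 0;
}
```

This only works in functions that can return an int error code, and it does nothing about crashes that are not guarded by an assertion, which is why it solves only part of the problem.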

Originally Posted by brewbuck

Under UNIX, the result of ignoring a SIGSEGV signal which was not generated with raise() or kill() is undefined. It will probably lead to an infinite loop. Here's what happens:

1. Code attempts to access bogus memory location
2. Processor signals an interrupt, which calls the SIGSEGV handler
3. Handler returns, CPU goes to the same instruction again
4. Go to step 1

Oh, it goes to the same instruction? That may make sense from the point of view of the OS but here it creates trouble. So, in that case any signal handler that sets a flag should not return, or should it? But if it waits and a different thread throws an exception, our thread will not go to the catch either, or will it?!

Nils.

**brewbuck** · 11-15-2007

Originally Posted by nts

Oh, it goes to the same instruction? That may make sense from the point of view of the OS but here it creates trouble.

What other behavior would make any sense? The idea is to trap a bad memory access, do something that hopefully fixes the situation, then try again. If it just moved on to the next instruction, then that memory access acts like it never existed in the first place, which means your program has just become a big mess of "undefined."

So, in that case any signal handler that sets a flag should not return, or should it? But if it waits and a different thread throws an exception, our thread will not go to the catch either, or will it?!

Exceptions only propagate within a single thread. So the exception has to be thrown by the same thread which will ultimately handle it. As for whether the signal handler should return... It must always return. How could it not?

**nts** · 11-15-2007

Originally Posted by brewbuck

What other behavior would make any sense? The idea is to trap a bad memory access, do something that hopefully fixes the situation, then try again. If it just moved on to the next instruction, then that memory access acts like it never existed in the first place, which means your program has just become a big mess of "undefined."

Yes, that's what I meant, from an OS point of view it does make sense. From a programmer's point of view retrying that instruction would mean that the source of the problem must be fixed first. Since we are not dealing with a page fault but with something unknown in an unknown (to the signal handler) context it would be better for us to return elsewhere... But that would require contextual information and cannot be guessed by the compiler, I do realise that.

Originally Posted by brewbuck

Exceptions only propagate within a single thread. So the exception has to be thrown by the same thread which will ultimately handle it. As for whether the signal handler should return... It must always return. How could it not?

Well, since the data is apparently faulty we really do not want it to return to the method where the signal occurred in order to avoid an infinite signal/catcher loop. Instead we really want the IP to point to the catch() clause or some other defined position. Maybe with a long jump we could do that, but in each stack state (i.e. method, possibly in each for loop etc?) we would then need to register the address to jump to so that the signal handler would only jump within the "same stack state" location. Sounds like our own try-catch construct, too bad the real one does not do it :-(

OK, I will try something but tomorrow -- right now my head is spinning...

**CornedBee** · 11-15-2007

How can a signal handler that was called by abort() return?

Let me reword: they must return normally or not at all.

GCC's exception implementation uses DWARF information, the instruction pointer and a stack deconstruction for unwinding. So yes, the stack definitely needs to be clean.

Can't you write code that checks the datum's correctness (as in, won't cause the lib to crash) before passing it in?

**brewbuck** · 11-15-2007

Originally Posted by nts

OK, I will try something but tomorrow -- right now my head is spinning...

You say that the model here is: Test piece of data. If fault caught, discard data and try something else. Why not test each piece of data in its own thread? If the thread crashes, who cares? You don't have to create a new thread for every single piece of data you test. Just make one, go into your loop, and run until you crash, logging your progress all the while. When the crash inevitably occurs, spawn a new thread and start your loop again where you left off.

**nts** · 11-15-2007

Originally Posted by CornedBee

GCC's exception implementation uses DWARF information, the instruction pointer and a stack deconstruction for unwinding. So yes, the stack definitely needs to be clean.

I suspected so, looking at the backtrace (posted above). That's unfortunate for us.

Originally Posted by CornedBee

Can't you write code that checks the datum's correctness (as in, won't cause the lib to crash) before passing it in?

The thing is, the datum really is OK but the lib code is not. The lib code will work with most data but in 0.5% of the cases there will be an otherwise valid datum that makes it crash.

Thinking about my possibilities I may try to modify the lib to at least throw exceptions (directly!) instead of doing [things that lead to] aborts, even if I do not have the time to debug it completely.

Thanks again everyone for your input!!!

Kind regards,

Nils.

**nts** · 11-15-2007

Originally Posted by brewbuck

You say that the model here is: Test piece of data. If fault caught, discard data and try something else. Why not test each piece of data in its own thread? If the thread crashes, who cares? You don't have to create a new thread for every single piece of data you test. Just make one, go into your loop, and run until you crash, logging your progress all the while. When the crash inevitably occurs, spawn a new thread and start your loop again where you left off.

Brewbuck, you're a genius! That sounds like a great idea. It may even run faster if I parallelise it :-)

Thanks!

**brewbuck** · 11-15-2007

Originally Posted by nts

Brewbuck, you're a genius! That sounds like a great idea. It may even run faster if I parallelise it :-)

Thanks!

The way signals work in POSIX threads means you have to do some trickery to make the SIGSEGV actually kill the proper thread. Without some special code it will simply kill the entire process including all its threads. I wrote up a small example which shows how it might work.

Code:

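#include <signal.h>
#include <pthread.h>
#include <stdio.h>

/* Handle of the worker thread; global so the signal handler can reach it. */
pthread_t thr;

void do_segv(int x);

void *thread_func(void *arg)
{
   /* volatile keeps the compiler from optimising the faulting load away */
   volatile int *null = NULL;
   int x;

   (void)arg;

   /* Allow cancellation at any instruction, not only at cancellation points */
   pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
   pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);

   x = *null;   /* deliberate fault: raises SIGSEGV */
   (void)x;
   return NULL;
}

/* SIGSEGV handler: cancel the worker instead of returning to the bad load.
   Note: pthread_cancel() is not on POSIX's async-signal-safe list. */
void do_segv(int x)
{
   (void)x;
   pthread_cancel(thr);
}

int main(void)
{
   signal(SIGSEGV, do_segv);

   for(;;)
   {
      /* Spawn a worker, wait for it to die, then start over */
      pthread_create(&thr, NULL, thread_func, NULL);
      pthread_join(thr, NULL);
      printf("Thread died\n");
   }
}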

This loops forever, spawning new threads then waiting for them to die. The thread which is spawned immediately causes a segfault by dereferencing a NULL pointer. This triggers a SIGSEGV, which is delivered to do_segv(). do_segv() calls pthread_cancel() to kill the thread.

In order for cancellation to work as we want here, the two calls to pthread_setcancelstate() and pthread_setcanceltype() are necessary. Otherwise delivery of the cancellation will only occur at a "cancellation point." But we want it to die right away.

Note the need for a global variable which holds the thread handle. This is so the signal handler can access it. It doesn't need to be volatile, since the signal handler doesn't change its value.

There are drawbacks. If some other part of your app triggers a SIGSEGV, the handler will still cancel the thread stored in thr, which is the wrong thread. You can probably imagine ways to deal with that.

**CornedBee** · 11-15-2007

Or don't use threads. Just fork a new process and test the exit code.
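A minimal sketch of the fork approach, assuming a POSIX system. The helper run_isolated and the three demo jobs are hypothetical names. The risky call runs in a child process, and the parent inspects the wait status to see whether the child exited normally or was killed by a signal such as SIGSEGV or SIGABRT:

```c
#include <errno.h>
#include <signal.h>     /* SIGSEGV, SIGABRT for callers comparing *term_signal */
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs job(arg) in a child process. Returns 0 if the child was started and
 * reaped, -1 otherwise. On success, *exit_code is the child's exit code
 * (or -1 if it did not exit normally) and *term_signal is the signal that
 * killed it (or 0 if none did). */
int run_isolated(int (*job)(void *), void *arg, int *exit_code, int *term_signal)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: a crash in here cannot damage the parent. */
        _exit(job(arg) & 0xff);
    }
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    *exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    *term_signal = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    return 0;
}

/* Hypothetical jobs standing in for calls into the unreliable library. */
int ok_job(void *arg)       { (void)arg; return 3; }
int crashing_job(void *arg) { int *volatile p = NULL; (void)arg; return *p; }
int aborting_job(void *arg) { (void)arg; abort(); }
```

Calling run_isolated(crashing_job, NULL, &code, &sig) should leave sig equal to SIGSEGV, and aborting_job should yield SIGABRT, so failed assertions inside the library are caught the same way as bad memory accesses. Results larger than an exit code would need a pipe or shared memory to get back to the parent.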
