exit() call pthread_mutex_lock()

**tux_mind** · 09-10-2012

hi all,
in my C program, when I call exit() it calls pthread_mutex_lock() for no apparent reason,
which sends the process into a deadlock.
For the backtrace, see:
c - exit() call pthread_mutex_lock() - Stack Overflow

thanks in advance.

**grumpy** · 09-10-2012

The most likely explanation is that some (apparently unrelated) function in your code (or in the "network checker" code you refer to) has molested a pointer, and screwed up the internal workings of exit() or the startup/termination code generated by your compiler.

In other words, I agree with the answer you got in that stackoverflow thread.

These sorts of things are rarely due to bugs in compilers or libraries (despite the fact that is the first possibility many programmers consider). They are usually due to bugs in user code.

**tux_mind** · 09-11-2012

Originally Posted by grumpy

The most likely explanation is that some (apparently unrelated) function in your code (or in the "network checker" code you refer to) has molested a pointer

As you can see in the source code, the network check is never spawned if there are no valid options.
So all the code being executed is in main & option_handler, where I cannot find anything that could write through bad pointers or anything like that.
Do you see anything bad in these 2 functions??

**oogabooga** · 09-11-2012

Originally Posted by tux_mind

Do you see anything bad in these 2 functions??

Those aren't the only functions! Before option_handler is called, you call:
report_error() (a.k.a., w_report_error())
and P_path() (a.k.a., parser_path())
So maybe you're stomping on memory in one of those.

And, strangely, autocrack.h includes .c files for some reason:

Code:

#include "common/common.c"
#include "parser/parser.c"
#include "engine/engine.c"

Presumably you mean those to be .h.

**anduril462** · 09-11-2012

Also, you call destroy_all() in autocrack.c, on line 301, right before you call exit. That function messes with threads, mutexes, linked lists and god knows what else. I think it's a likely candidate for trashing memory. Try commenting it out and simply calling exit to see if the immediate problem goes away. Then at least you can start narrowing it down.

If you can create something this complex, you should really start getting intimate with gdb if you haven't yet.

**rags_to_riches** · 09-11-2012

If you can create something this complex, you should really start getting intimate with gdb if you haven't yet.

valgrind too

**Salem** · 09-11-2012

An example.

Code:

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

pthread_mutex_t mymutex = PTHREAD_MUTEX_INITIALIZER;

void *blocker ( void *p ) {
    printf("Blocker=%lx\n", (unsigned long)pthread_self() );
    pthread_mutex_lock(&mymutex);
    // Naughty, bombing out with a lock!
    pthread_exit(NULL);
}

void *waiter ( void *p ) {
    printf("Waiter=%lx\n", (unsigned long)pthread_self() );
    pthread_mutex_lock(&mymutex);
    pthread_mutex_unlock(&mymutex);
    pthread_exit(NULL);
}

int main ( ) {
    pthread_t   t1, t2;
    void        *result;
    
    pthread_create(&t1,NULL,blocker,NULL);
    pthread_join(t1, &result);
    pthread_create(&t2,NULL,waiter,NULL);
    pthread_join(t2, &result);
    pthread_exit(NULL);
    return 0;
}

And a gdb session.

Code:

(gdb) run
Starting program: /home/sc/Documents/a.out 
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff781d700 (LWP 2788)]
Blocker=7ffff781d700
[Thread 0x7ffff781d700 (LWP 2788) exited]
[New Thread 0x7ffff781d700 (LWP 2789)]
Waiter=7ffff781d700
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7bc81f8 in pthread_join (threadid=140737345869568, thread_return=0x7fffffffe188) at pthread_join.c:89
89	pthread_join.c: No such file or directory.
	in pthread_join.c
(gdb) info threads
  Id   Target Id         Frame 
  3    Thread 0x7ffff781d700 (LWP 2789) "a.out" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
* 1    Thread 0x7ffff7fd2720 (LWP 2785) "a.out" 0x00007ffff7bc81f8 in pthread_join (threadid=140737345869568, thread_return=0x7fffffffe188) at pthread_join.c:89
(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff781d700 (LWP 2789))]
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
136	../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
	in ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
(gdb) bt
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007ffff7bc91e5 in _L_lock_883 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007ffff7bc903a in __pthread_mutex_lock (mutex=0x601080) at pthread_mutex_lock.c:61
#3  0x00000000004007b4 in waiter (p=0x0) at foo.c:16
#4  0x00007ffff7bc6efc in start_thread (arg=0x7ffff781d700) at pthread_create.c:304
#5  0x00007ffff790159d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()
(gdb) frame 2
#2  0x00007ffff7bc903a in __pthread_mutex_lock (mutex=0x601080) at pthread_mutex_lock.c:61
61	pthread_mutex_lock.c: No such file or directory.
	in pthread_mutex_lock.c
(gdb) print *mutex
$1 = {__data = {__lock = 2, __count = 0, __owner = 2788, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, 
  __size = "\002\000\000\000\000\000\000\000\344\n\000\000\001", '\000' <repeats 26 times>, __align = 2}
(gdb)

Here you can see that the mutex object itself contains a field indicating which thread currently owns the mutex. When you know this, you can either locate a thread which has already exited (as in this case), or a still living thread which still has the lock. Either way, you'll know the locking thread and should be closer to figuring out WHY it still has the lock.

**tux_mind** · 09-14-2012

Originally Posted by Salem

Here you can see that the mutex object itself contains a field indicating which thread currently owns the mutex. When you know this, you can either locate a thread which has already exited (as in this case), or a still living thread which still has the lock. Either way, you'll know the locking thread and should be closer to figuring out WHY it still has the lock.

Sorry for the late reply, but I just took an exam and I have another on 18/09.
So, I'm quite familiar with GDB and valgrind, but I cannot analyze the locked mutex since I don't know which one it is.
I've tried to analyze the local variables and arguments of the frame that calls pthread_mutex_lock, but I cannot find anything, not even an address.
And valgrind freezes when the deadlock occurs.

Originally Posted by anduril462

Also, you call destroy_all() in autocrack.c, on line 301, right before you call exit. That function messes with threads, mutexes, linked lists and god knows what else. I think it's a likely candidate for trashing memory. Try commenting it out and simply calling exit to see if the immediate problem goes away. Then at least you can start narrowing it down.

If you can create something this complex, you should really start getting intimate with gdb if you haven't yet.

I will try this evening; really, sorry for the late reply.

Anyway, thanks to all for the replies. I asked on many mailing lists and they simply told me to use return or the _exit() function. I want to know WHY this "bug" happens, not a workaround!

PS:
sorry for my doggish english! lol

**Salem** · 09-14-2012

> (gdb) print *mutex
You do this, to find out who owns it.

> (gdb) info threads
You do this, to find out if the owner is alive or dead.

If the owner is alive, then you can do
(gdb) thread n
(gdb) bt
to find out what it is up to at the moment.
Since it is deadlocked, it would seem that it's blocked waiting for something to happen.

If the owner is dead (doesn't appear in info threads), then you need to do some more detective work.
Are thread IDs consistent across runs? If so, then perhaps you can break on the creation of the interesting thread, and find out where it locks this mutex.
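When the owner is dead and thread IDs are not stable across runs, one way to shortcut the detective work is to have the program record where each lock was taken. The following is a minimal sketch under that assumption; `struct tracked_mutex`, `tracked_lock` and `tracked_unlock` are hypothetical helpers invented for this illustration and are not part of pthreads.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper: a mutex plus a record of who locked it last and where. */
struct tracked_mutex {
    pthread_mutex_t mutex;
    int held;               /* 1 while some thread holds the lock */
    pthread_t owner;        /* valid only while held == 1 */
    const char *file;       /* source file of the last successful lock */
    int line;               /* source line of the last successful lock */
};

/* Remaining members are zero-initialized. */
#define TRACKED_MUTEX_INITIALIZER { .mutex = PTHREAD_MUTEX_INITIALIZER }

static int tracked_lock_at(struct tracked_mutex *m, const char *file, int line)
{
    int rc = pthread_mutex_lock(&m->mutex);
    if (rc == 0) {
        /* We hold the lock, so writing the record is race-free. */
        m->held = 1;
        m->owner = pthread_self();
        m->file = file;
        m->line = line;
    }
    return rc;
}

static int tracked_unlock(struct tracked_mutex *m)
{
    /* Clear the record while still holding the lock. */
    m->held = 0;
    m->file = NULL;
    m->line = 0;
    return pthread_mutex_unlock(&m->mutex);
}

/* Records the call site automatically. */
#define tracked_lock(m) tracked_lock_at((m), __FILE__, __LINE__)
```

With something like this in place, printing the struct in gdb during a hang would show the file and line of the lock call that was never matched by an unlock, even if the owning thread no longer appears in `info threads`.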

**tux_mind** · 09-15-2012

so, changing these lines

Code:

            if(option_index < NET_CHK_TIMEOUT)
            {
                report(debug,"I've wait internet check for %d ms.",option_index);
                pthread_join(globals.tpool->thread, (void **) &option_index);
            }
            else
                option_index = ETIMEDOUT;

to

Code:

            pthread_join(globals.tpool->thread, (void **) &option_index);
            if(option_index < NET_CHK_TIMEOUT)
                report(debug,"I've wait internet check for %d ms.",option_index);
            else
                option_index = ETIMEDOUT;

in autocrack.c makes the bug tens of times harder to reproduce.
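A side note on the `pthread_join` call in both versions: the declaration of `option_index` is not shown, but the `%d` in `report()` suggests it is a plain `int`. If so, casting `&option_index` to `void **` makes `pthread_join` store a pointer-sized value into an int-sized object, which on a typical 64-bit Linux system writes 8 bytes into a 4-byte variable and overwrites whatever sits next to it. A minimal sketch of a safer pattern, with `net_check`, `join_as_int` and `run_check` as hypothetical names for illustration only:

```c
#include <pthread.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical worker: packs a small integer result into the void* return value. */
static void *net_check(void *arg)
{
    (void)arg;
    return (void *)(intptr_t)42;
}

/* Joins the thread into a real void* and only then narrows the value to int. */
int join_as_int(pthread_t thread, int *out)
{
    void *result = NULL;                 /* pointer-sized slot for pthread_join */
    int rc = pthread_join(thread, &result);
    if (rc == 0)
        *out = (int)(intptr_t)result;
    return rc;
}

/* Starts the worker, joins it, and returns its integer result (-1 on failure). */
int run_check(void)
{
    pthread_t t;
    int ms = -1;

    if (pthread_create(&t, NULL, net_check, NULL) != 0)
        return -1;
    if (join_as_int(t, &ms) != 0)
        return -1;
    return ms;
}
```

Whether this is related to the hang is not established here; it is simply one place where memory next to a local variable could be overwritten.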

I will try to recreate the same network state as when the bug first appeared (router connected to the modem, but the modem not connected to the ADSL signal), since the backtrace has changed:

Code:

Program received signal SIGINT, Interrupt.
0x00007ffff6e6b3ae in __lll_lock_wait_private () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff6e6b3ae in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x00007ffff6e012bf in _L_lock_6442 () from /lib64/libc.so.6
#2  0x00007ffff6dff6d1 in free () from /lib64/libc.so.6
#3  0x0000000000413e8f in option_handler (argc=2, argv=0x7fffffffdba8) at autocrack.c:289
#4  0x00000000004135a7 in main (argc=2, argv=0x7fffffffdba8) at autocrack.c:34
(gdb) i thr
  Id   Target Id         Frame 
* 1    Thread 0x7ffff7fc9700 (LWP 5525) "autocrack" 0x00007ffff6e6b3ae in __lll_lock_wait_private () from /lib64/libc.so.6
(gdb) up
#1  0x00007ffff6e012bf in _L_lock_6442 () from /lib64/libc.so.6
(gdb) i locals
No symbol table info available.
(gdb) i args
No symbol table info available.
(gdb) up
#2  0x00007ffff6dff6d1 in free () from /lib64/libc.so.6
(gdb) i args
No symbol table info available.
(gdb) i locals
No symbol table info available.
(gdb) x/dw &pool_lock.__data.__owner
0x61c2a8 <pool_lock+8>: 0
(gdb) x/dw &pool_lock.__data.__lock 
0x61c2a0 <pool_lock>:   0
(gdb) x/xw &globals.hash_list
0x61c278 <globals+120>: 0x00000000
(gdb) x/xw &globals.wpa_list
0x61c268 <globals+104>: 0x00000000
(gdb) x/xw &globals.tpool->lock.__data.__owner
0x61e0d8:       0x00000000
(gdb) x/xw &globals.tpool->lock.__data.__lock 
0x61e0d0:       0x00000000
(gdb) x/xw &globals.tpool->next                    
0x61e0f8:       0x00000000

The only pthread_mutex_t objects in this program are "pool_lock" as a global variable, "lock" in the "_hash" struct and "lock" inside the "t_info" struct.
As you can see in the GDB session, the mutexes have no owner, hash_list is empty and the mutex of the only item in the thread pool isn't locked.
As you can see in line 130:

Code:

globals.tpool = malloc(sizeof(struct t_info));

tpool is set by malloc, which (thanks to macros) is replaced with a function of mine that sets all bytes to '0'. So the mutex is initialized too, right?!

I will update you this afternoon.
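On the question of whether zeroed memory counts as an initialized mutex: POSIX only treats a mutex as initialized after `pthread_mutex_init()` or static initialization with `PTHREAD_MUTEX_INITIALIZER`. All-zero bytes may happen to work on some implementations, but that is not something the standard promises. A minimal sketch of explicit initialization follows; the `struct t_info` here is a hypothetical stand-in containing only the members visible in the gdb session above, and `t_info_new` / `t_info_free` are made-up helper names.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the program's struct t_info. */
struct t_info {
    pthread_t thread;
    pthread_mutex_t lock;
    struct t_info *next;
};

/* Allocates a zeroed node and explicitly initializes its mutex.
 * Returns NULL if either step fails. */
struct t_info *t_info_new(void)
{
    struct t_info *t = calloc(1, sizeof *t);
    if (t == NULL)
        return NULL;
    if (pthread_mutex_init(&t->lock, NULL) != 0) {
        free(t);
        return NULL;
    }
    return t;
}

/* Destroys the mutex before releasing the memory that contains it. */
void t_info_free(struct t_info *t)
{
    if (t == NULL)
        return;
    pthread_mutex_destroy(&t->lock);
    free(t);
}
```

Pairing the init with a matching `pthread_mutex_destroy()` before `free()` also keeps the cleanup path well defined.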
