Hi all,
In my C program, when I call exit() it calls pthread_mutex_lock() for no apparent reason,
which sends the process into a deadlock.
For the backtrace, see:
c - exit() call pthread_mutex_lock() - Stack Overflow
Thanks in advance.
The most likely explanation is that some (apparently unrelated) function in your code (or in the "network checker" code you refer to) has molested a pointer, and screwed up the internal workings of exit() or the startup/termination code generated by your compiler.
In other words, I agree with the answer you got in that stackoverflow thread.
These sorts of things are rarely due to bugs in compilers or libraries (despite the fact that is the first possibility many programmers consider). They are usually due to bugs in user code.
As you can see in the source code, the network check is never spawned if there aren't valid options.
So all the code being executed is in main and option_handler, where I cannot find anything that could write through a bad pointer or anything like that.
Do you see anything wrong in these two functions?
Those aren't the only functions! Before option_handler is called, you call:
report_error() (a.k.a., w_report_error())
and P_path() (a.k.a., parser_path())
So maybe you're stomping on memory in one of those.
And, strangely, autocrack.h includes .c files for some reason:
Code:
#include "common/common.c"
#include "parser/parser.c"
#include "engine/engine.c"
Presumably you mean those to be .h.
Last edited by oogabooga; 09-11-2012 at 02:03 AM.
The cost of software maintenance increases with the square of the programmer's creativity. - Robert D. Bliss
Also, you call destroy_all() in autocrack.c, on line 301, right before you call exit. That function messes with threads, mutexes, linked lists and god knows what else. I think it's a likely candidate for trashing memory. Try commenting it out and simply calling exit to see if the immediate problem goes away. Then at least you can start narrowing it down.
If you can create something this complex, you should really start getting intimate with gdb if you haven't yet.
valgrind too
An example.
Code:
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

pthread_mutex_t mymutex = PTHREAD_MUTEX_INITIALIZER;

void *blocker ( void *p ) {
  printf("Blocker=%lx\n", (unsigned long)pthread_self() );
  pthread_mutex_lock(&mymutex);
  // Naughty, bombing out with a lock!
  pthread_exit(NULL);
}

void *waiter ( void *p ) {
  printf("Waiter=%lx\n", (unsigned long)pthread_self() );
  pthread_mutex_lock(&mymutex);
  pthread_mutex_unlock(&mymutex);
  pthread_exit(NULL);
}

int main ( ) {
  pthread_t t1, t2;
  void *result;
  pthread_create(&t1,NULL,blocker,NULL);
  pthread_join(t1, &result);
  pthread_create(&t2,NULL,waiter,NULL);
  pthread_join(t2, &result);
  pthread_exit(NULL);
  return 0;
}
And a gdb session.
Code:
(gdb) run
Starting program: /home/sc/Documents/a.out
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff781d700 (LWP 2788)]
Blocker=7ffff781d700
[Thread 0x7ffff781d700 (LWP 2788) exited]
[New Thread 0x7ffff781d700 (LWP 2789)]
Waiter=7ffff781d700
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7bc81f8 in pthread_join (threadid=140737345869568, thread_return=0x7fffffffe188) at pthread_join.c:89
89      pthread_join.c: No such file or directory.
        in pthread_join.c
(gdb) info threads
  Id   Target Id         Frame
  3    Thread 0x7ffff781d700 (LWP 2789) "a.out" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
* 1    Thread 0x7ffff7fd2720 (LWP 2785) "a.out" 0x00007ffff7bc81f8 in pthread_join (threadid=140737345869568, thread_return=0x7fffffffe188) at pthread_join.c:89
(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff781d700 (LWP 2789))]
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
136     ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
        in ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S
(gdb) bt
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007ffff7bc91e5 in _L_lock_883 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007ffff7bc903a in __pthread_mutex_lock (mutex=0x601080) at pthread_mutex_lock.c:61
#3  0x00000000004007b4 in waiter (p=0x0) at foo.c:16
#4  0x00007ffff7bc6efc in start_thread (arg=0x7ffff781d700) at pthread_create.c:304
#5  0x00007ffff790159d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()
(gdb) frame 2
#2  0x00007ffff7bc903a in __pthread_mutex_lock (mutex=0x601080) at pthread_mutex_lock.c:61
61      pthread_mutex_lock.c: No such file or directory.
        in pthread_mutex_lock.c
(gdb) print *mutex
$1 = {__data = {__lock = 2, __count = 0, __owner = 2788, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\344\n\000\000\001", '\000' <repeats 26 times>, __align = 2}
(gdb)
Here you can see that the mutex object itself has a field (__owner) indicating which thread currently owns the mutex; in this session it holds 2788, the LWP of the blocker thread. When you know this, you can either locate a thread which has already exited (as in this case), or a still living thread which still has the lock. Either way, you'll know the locking thread and should be closer to figuring out WHY it still has the lock.
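If you want to check the same thing from inside the program rather than from gdb, here is a minimal sketch of a non-blocking probe. It is only an illustration: lock_and_leave and probe_abandoned_mutex are made-up names, not anything from autocrack. It uses pthread_mutex_trylock(), which returns EBUSY instead of blocking when the mutex is already held, so an abandoned lock shows up as an error code rather than a hang.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker: takes the lock and returns without unlocking,
 * like blocker() in the example above. */
static void *lock_and_leave(void *arg)
{
    pthread_mutex_lock((pthread_mutex_t *)arg);
    return NULL; /* the mutex stays locked after this thread is gone */
}

/* Returns what pthread_mutex_trylock() reports once the locking thread
 * has exited, or -1 if the setup itself failed. */
int probe_abandoned_mutex(void)
{
    pthread_mutex_t m;
    pthread_t t;

    if (pthread_mutex_init(&m, NULL) != 0)
        return -1;
    if (pthread_create(&t, NULL, lock_and_leave, &m) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;

    /* Non-blocking probe: 0 means we got the lock, EBUSY means some
     * thread (alive or dead) still holds it. The abandoned mutex is
     * deliberately neither unlocked nor destroyed here. */
    return pthread_mutex_trylock(&m);
}
```

On Linux with a default mutex I would expect this to return EBUSY. It tells you that the lock is held, not by whom; for the owner you still need the gdb approach above.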
If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
If at first you don't succeed, try writing your phone number on the exam paper.
Sorry for the late reply, but I just took an exam and I have another one on 18/09...
So, I'm quite familiar with GDB and valgrind, but I cannot analyze the locked mutex since I don't know which one it is.
I've tried to inspect the local variables and arguments of the frame that calls pthread_mutex_lock, but I cannot find anything, not even an address...
And valgrind freezes when the deadlock occurs.
I will try this evening. Really, sorry for the late reply.
Anyway, thanks to all for the replies. I asked on many mailing lists and they simply told me to use return or the _exit() function... I want to know WHY this "bug" happens, not a workaround!
PS:
Sorry for my doggish English! lol
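On the WHY: one general difference between exit() and _exit() is that exit() still runs user-space cleanup (handlers registered with atexit(), stdio flushing), while _exit() skips it. Any of that cleanup can touch a lock. The sketch below only demonstrates that difference and says nothing about what actually happens inside autocrack; exit_handler_ran and on_exit_handler are made-up names. A forked child registers a handler and then leaves through one of the two calls, and the parent checks over a pipe whether the handler ran.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1; /* write end of the pipe, set in the child */

/* Handler registered with atexit(): leaves one byte in the pipe. */
static void on_exit_handler(void)
{
    const char mark = 'x';
    if (write(report_fd, &mark, 1) != 1) {
        /* nothing useful to do about a failed write in this sketch */
    }
}

/* Forks a child that registers the handler and terminates with exit()
 * or _exit(). Returns 1 if the handler ran, 0 if not, -1 on error. */
int exit_handler_ran(int use_underscore_exit)
{
    int fds[2];
    char buf;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(on_exit_handler);
        if (use_underscore_exit)
            _exit(0); /* skips atexit handlers */
        exit(0);      /* runs atexit handlers first */
    }

    close(fds[1]);
    n = read(fds[0], &buf, 1); /* 0 bytes (EOF) if the handler never wrote */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1;
}
```

So switching to _exit() can hide a deadlock in the cleanup path without explaining it, which is why it is only a workaround.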
> (gdb) print *mutex
You do this, to find out who owns it.
> (gdb) info threads
You do this, to find out if the owner is alive or dead.
If the owner is alive, then you can do
(gdb) thread n
(gdb) bt
to find out what it is up to at the moment.
Since it is deadlocked, it would seem that it's blocked waiting for something to happen.
If the owner is dead (doesn't appear in info threads), then you need to do some more detective work.
Are thread IDs consistent across runs? If so, then perhaps you can break on the creation of the interesting thread, and find out where it locks this mutex.
So, changing these lines
Code:
if(option_index < NET_CHK_TIMEOUT)
{
    report(debug,"I've wait internet check for %d ms.",option_index);
    pthread_join(globals.tpool->thread, (void **) &option_index);
}
else
    option_index = ETIMEDOUT;
to
Code:
pthread_join(globals.tpool->thread, (void **) &option_index);
if(option_index < NET_CHK_TIMEOUT)
    report(debug,"I've wait internet check for %d ms.",option_index);
else
    option_index = ETIMEDOUT;
in autocrack.c makes the bug tens of times harder to reproduce.
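One thing worth checking in that snippet, assuming option_index is a plain int (the %d in the report call suggests so, but its declaration is not shown here): pthread_join() stores a whole void * through its second argument. On x86_64 that is 8 bytes written into a 4-byte variable, which would overwrite whatever sits next to it on the stack. A minimal sketch of a safer pattern follows; checker_stub, join_as_int and demo_join are made-up names for illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the network checker: returns a small
 * integer packed into the thread's void * result. */
static void *checker_stub(void *arg)
{
    (void)arg;
    return (void *)(intptr_t)42;
}

/* Joins the thread into a real void * and only then narrows to int,
 * so pthread_join() never writes past the end of an int. */
int join_as_int(pthread_t thread, int *out)
{
    void *raw = NULL;
    int rc = pthread_join(thread, &raw);

    if (rc == 0)
        *out = (int)(intptr_t)raw;
    return rc;
}

/* Usage: returns the value produced by the stub, or -1 on failure. */
int demo_join(void)
{
    pthread_t t;
    int value = -1;

    if (pthread_create(&t, NULL, checker_stub, NULL) != 0)
        return -1;
    if (join_as_int(t, &value) != 0)
        return -1;
    return value;
}
```

If option_index turns out to be pointer-sized already, this does not apply, but it is the kind of stray write the earlier replies were pointing at.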
I will try to recreate the same network state as when the bug first appeared (router connected to the modem, but modem not connected to the ADSL signal), since the backtrace has changed:
Code:
Program received signal SIGINT, Interrupt.
0x00007ffff6e6b3ae in __lll_lock_wait_private () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff6e6b3ae in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x00007ffff6e012bf in _L_lock_6442 () from /lib64/libc.so.6
#2  0x00007ffff6dff6d1 in free () from /lib64/libc.so.6
#3  0x0000000000413e8f in option_handler (argc=2, argv=0x7fffffffdba8) at autocrack.c:289
#4  0x00000000004135a7 in main (argc=2, argv=0x7fffffffdba8) at autocrack.c:34
(gdb) i thr
  Id   Target Id         Frame
* 1    Thread 0x7ffff7fc9700 (LWP 5525) "autocrack" 0x00007ffff6e6b3ae in __lll_lock_wait_private () from /lib64/libc.so.6
(gdb) up
#1  0x00007ffff6e012bf in _L_lock_6442 () from /lib64/libc.so.6
(gdb) i locals
No symbol table info available.
(gdb) i args
No symbol table info available.
(gdb) up
#2  0x00007ffff6dff6d1 in free () from /lib64/libc.so.6
(gdb) i args
No symbol table info available.
(gdb) i locals
No symbol table info available.
(gdb) x/dw &pool_lock.__data.__owner
0x61c2a8 <pool_lock+8>: 0
(gdb) x/dw &pool_lock.__data.__lock
0x61c2a0 <pool_lock>: 0
(gdb) x/xw &globals.hash_list
0x61c278 <globals+120>: 0x00000000
(gdb) x/xw &globals.wpa_list
0x61c268 <globals+104>: 0x00000000
(gdb) x/xw &globals.tpool->lock.__data.__owner
0x61e0d8: 0x00000000
(gdb) x/xw &globals.tpool->lock.__data.__lock
0x61e0d0: 0x00000000
(gdb) x/xw &globals.tpool->next
0x61e0f8: 0x00000000
The only pthread_mutex_t objects in this program are "pool_lock" (a global variable), "lock" in the "_hash" struct and "lock" inside the "t_info" struct.
As you can see in the GDB session, the mutexes have no owner, hash_list is empty, and the mutex of the only item in the thread pool isn't locked.
As you can see at line 130:
Code:
globals.tpool = malloc(sizeof(struct t_info));
tpool is set by malloc, which (thanks to macros) is replaced with a function of mine that sets all bytes to '0'. So the mutex is initialized too, right?!
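On the "initialized, right?!" question: as far as I know POSIX does not promise that an all-zero pthread_mutex_t is a valid mutex. It may happen to work on a given libc, but the portable way for a heap-allocated mutex is an explicit pthread_mutex_init(). A minimal sketch follows; struct pool_item is a made-up stand-in, since the real struct t_info is not shown here.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct t_info: a mutex plus a list link. */
struct pool_item {
    pthread_mutex_t lock;
    struct pool_item *next;
};

/* Allocates a zeroed item and initializes its mutex explicitly rather
 * than relying on the zero fill. Returns NULL on failure. */
struct pool_item *pool_item_new(void)
{
    struct pool_item *item = calloc(1, sizeof *item);

    if (item == NULL)
        return NULL;
    if (pthread_mutex_init(&item->lock, NULL) != 0) {
        free(item);
        return NULL;
    }
    return item;
}

/* Destroys the (unlocked) mutex before releasing the memory. */
void pool_item_free(struct pool_item *item)
{
    if (item == NULL)
        return;
    pthread_mutex_destroy(&item->lock);
    free(item);
}
```

Pairing init with destroy also makes it easier to spot an item being freed while its lock is still held.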
I will update you this afternoon.