[torqueusers] 4.1.x leftover problems

Joerg Blank j.blank at fz-juelich.de
Thu Mar 21 10:26:36 MDT 2013


Hello everyone,

we are still seeing 0-10 crashes per day, from two causes:

1.) There is a double free in the handling of attrlists (the backtrace
below crashes inside free_attrlist, attr_func.c:422).
2.) The mywork variable in work_thread (u_threadpool.c) sometimes seems
to get corrupted, which then crashes the free call when a thread is
shut down. I suspect the thread shutdown has to be guarded by a mutex.
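
The mutex idea from point 2 could look roughly like the sketch below. It
is not code from u_threadpool.c: the pool and work_item types and the
functions release_work, worker_shutdown and run_shutdown_demo are made up
for illustration. It only shows the general pattern of freeing shared
per-thread state under a lock and clearing the pointer, so that a second
thread reaching the shutdown path skips the free.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

enum { NUM_WORKERS = 8 };

/* Hypothetical stand-in for a work item; not TORQUE's real type. */
typedef struct work_item {
    void *payload;
} work_item;

/* Hypothetical pool state: one mutex protects the slot and the counter. */
typedef struct pool {
    pthread_mutex_t lock;
    work_item *current;  /* shared slot several threads may try to release */
    int frees;           /* how many times the item was actually freed */
} pool;

/* Release the shared item at most once, however many threads call this. */
static void release_work(pool *p)
{
    pthread_mutex_lock(&p->lock);
    if (p->current != NULL) {
        free(p->current->payload);
        free(p->current);
        p->current = NULL;  /* later callers see NULL and skip the free */
        p->frees++;
    }
    pthread_mutex_unlock(&p->lock);
}

/* Thread body: every worker goes straight to the shutdown path. */
static void *worker_shutdown(void *arg)
{
    release_work((pool *)arg);
    return NULL;
}

/* Let all workers shut down concurrently; return the number of frees. */
int run_shutdown_demo(void)
{
    pool p;
    pthread_t tids[NUM_WORKERS];
    int i, started = 0;

    pthread_mutex_init(&p.lock, NULL);
    p.frees = 0;
    p.current = malloc(sizeof(*p.current));
    if (p.current == NULL) {
        pthread_mutex_destroy(&p.lock);
        return -1;
    }
    p.current->payload = malloc(64);

    for (i = 0; i < NUM_WORKERS; i++) {
        if (pthread_create(&tids[started], NULL, worker_shutdown, &p) == 0)
            started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    release_work(&p);  /* covers the case where no thread could be started */
    pthread_mutex_destroy(&p.lock);
    return p.frees;
}
```

Built with -pthread, run_shutdown_demo() returns 1, because the item is
freed exactly once even though every worker enters the shutdown path.
Without the lock and the NULL reset, two workers could both pass the
pointer to free.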

Regards,
Jörg Blank


#0  0x00007f44d3e34b23 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) () from /usr/lib/libtcmalloc.so
(gdb) bt
#0  0x00007f44d3e34b23 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) () from /usr/lib/libtcmalloc.so
#1  0x00007f44d3e34f67 in tcmalloc::ThreadCache::Scavenge() () from /usr/lib/libtcmalloc.so
#2  0x00007f44d3e41685 in tc_free () from /usr/lib/libtcmalloc.so
#3  0x000000000046ef6c in free_attrlist (pattrlisthead=0xaa2df38) at attr_func.c:422
#4  0x0000000000431542 in reply_free (prep=0x8802e88) at reply_send.c:300
#5  0x000000000042f269 in free_br (preq=0x8802a00) at process_request.c:1080
#6  0x0000000000431378 in reply_send_svr (request=0x8802a00) at reply_send.c:197
#7  0x00000000004504a4 in sel_step3 (cntl=0xb1e0d80) at req_select.c:670
#8  0x000000000044fbe7 in req_selectjobs (preq=0x8802a00) at req_select.c:351
#9  0x000000000042ee51 in dispatch_request (sfds=7, request=0x8802a00) at process_request.c:869
#10 0x000000000042e942 in process_request (chan=0x7a48ba0) at process_request.c:662
#11 0x0000000000429f54 in process_pbs_server_port (sock=7, is_scheduler_port=0) at pbsd_main.c:402
#12 0x000000000042a1b3 in start_process_pbs_server_port (new_sock=0x6a1fbc0) at pbsd_main.c:533
#13 0x000000000047373e in work_thread (a=0x7fff7d872480) at u_threadpool.c:307
#14 0x00007f44d29e48ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#15 0x00007f44d2543b6d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#16 0x0000000000000000 in ?? ()
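
For the double free seen in the backtrace above, one common defensive
pattern is to unlink each entry from the list before freeing it, so that
a second free call on the same list head finds nothing left to free. The
sketch below only illustrates that pattern. The attr_entry and attr_list
types and all function names are hypothetical and are not the structures
used by free_attrlist in attr_func.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked attribute entry; not TORQUE's real layout. */
typedef struct attr_entry {
    struct attr_entry *next;
    char *value;
} attr_entry;

typedef struct attr_list {
    attr_entry *head;
} attr_list;

/* Prepend one entry with a value buffer of value_size bytes. */
static int attr_list_push(attr_list *list, size_t value_size)
{
    attr_entry *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->value = malloc(value_size);
    if (e->value == NULL) {
        free(e);
        return -1;
    }
    e->next = list->head;
    list->head = e;
    return 0;
}

/* Free all entries, detaching each one first. Returns the number freed. */
static int attr_list_free(attr_list *list)
{
    int freed = 0;
    while (list->head != NULL) {
        attr_entry *e = list->head;
        list->head = e->next;  /* unlink first: the list never points at freed memory */
        free(e->value);
        free(e);
        freed++;
    }
    return freed;
}

/* Free the same list twice; return 1 if the second call was a no-op. */
int run_attr_list_demo(void)
{
    attr_list list = { NULL };
    int i, pushed = 0, first, second;

    for (i = 0; i < 3; i++) {
        if (attr_list_push(&list, 16) == 0)
            pushed++;
    }
    first = attr_list_free(&list);   /* frees every entry that was pushed */
    second = attr_list_free(&list);  /* head is NULL now, nothing to free */
    return first == pushed && second == 0;
}
```

run_attr_list_demo() returns 1. This only protects against freeing the
same list head twice from one thread. If two heads share entries, or two
threads walk the list at the same time, additional ownership rules or
locking would be needed.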



