[torquedev] Deadlock in 2.3.0

Åke Sandgren ake.sandgren at hpc2n.umu.se
Tue May 20 01:56:48 MDT 2008


Hi!

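I just discovered a deadlock in 2.3.0. The signal handler toolong calls
log_record while the interrupted code is already inside log_record, so the
handler waits on a libc lock that can never be released (see the stack
trace below).
I suggest either blocking SIGALRM in log_record
or not calling log_record in toolong.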

Which one is the preferred solution?
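
For the first option, here is a minimal sketch of what blocking SIGALRM
around the time conversion could look like. The helper name
format_timestamp_blocked and the timestamp format are assumptions made for
illustration, not the actual pbs_log.c code; only sigprocmask, localtime_r
and strftime are standard POSIX/C calls.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not TORQUE code): format a timestamp while SIGALRM
 * is blocked, so an alarm handler cannot run in the middle of libc's
 * timezone conversion.
 * Returns 0 on success, -1 on failure.
 */
static int format_timestamp_blocked(time_t now, char *buf, size_t buflen)
{
    sigset_t block;
    sigset_t saved;
    struct tm tmbuf;
    struct tm *ptm;
    int rc = -1;

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);

    /* Defer SIGALRM; remember the previous mask so it can be restored. */
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    ptm = localtime_r(&now, &tmbuf);
    if (ptm != NULL && strftime(buf, buflen, "%m/%d/%Y %H:%M:%S", ptm) > 0)
        rc = 0;

    /* Restore the old mask; a pending SIGALRM is delivered at this point. */
    sigprocmask(SIG_SETMASK, &saved, NULL);

    return rc;
}
```

Called as format_timestamp_blocked(time(NULL), buf, sizeof buf), an alarm
that arrives during the conversion stays pending until the mask is
restored, so the handler does not run while the interrupted code holds the
timezone lock. In a threaded program pthread_sigmask would be the
equivalent call.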


The stack trace is:
#0  0x00007fb79fff223e in __lll_lock_wait_private () from /lib/libc.so.6
#1  0x00007fb79ff9d37d in _L_lock_1911 () from /lib/libc.so.6
#2  0x00007fb79ff9d156 in __tz_convert () from /lib/libc.so.6
#3  0x00007fb7a029344d in log_record (eventtype=-1467024352, objclass=0,
    objname=0x75007fb700000001 <Address 0x75007fb700000001 out of bounds>,
    text=0x447172 "d") at ../Liblog/pbs_log.c:442
#4  0x0000000000414365 in toolong (sig=-1467024328)
    at /afs/hpc2n.umu.se/lap/torque/2.3.0/src/torque-2.3.0/amd64_ubuntu804/src/resmom/mom_main.c:3678
#5  <signal handler called>
#6  0x00007fb79ffd6a75 in _xstat () from /lib/libc.so.6
#7  0x00007fb79ff9dbe8 in __tzfile_read () from /lib/libc.so.6
#8  0x00007fb79ff9cf5a in tzset_internal () from /lib/libc.so.6
#9  0x00007fb79ff9d177 in __tz_convert () from /lib/libc.so.6
#10 0x00007fb7a029344d in log_record (eventtype=-1467023360, objclass=4265871,
    objname=0x440380 "nusers", text=0x7fffa88f0048 "Ì\r\217¨ÿ\177")
    at ../Liblog/pbs_log.c:442
#11 0x0000000000432bed in totmem (attrib=0x4407f2)
    at /afs/hpc2n.umu.se/lap/torque/2.3.0/src/torque-2.3.0/amd64_ubuntu804/src/resmom/linux/mom_mach.c:3248
#12 0x0000000000411842 in dependent (res=0x7fb7a029af30 "", attr=0x22)
    at /afs/hpc2n.umu.se/lap/torque/2.3.0/src/torque-2.3.0/amd64_ubuntu804/src/resmom/mom_main.c:1305
#13 0x000000000041a634 in gen_gen (name=0x0, BPtr=0x4832714c, BSpace=0x0)
    at /afs/hpc2n.umu.se/lap/torque/2.3.0/src/torque-2.3.0/amd64_ubuntu804/src/resmom/mom_server.c:848
#14 0x000000000041a705 in generate_server_status (
    buffer=0x4832714b <Address 0x4832714b out of bounds>, buffer_size=0)
    at /afs/hpc2n.umu.se/lap/torque/2.3.0/src/torque-2.3.0/amd64_ubuntu804/src/resmom/mom_server.c:922
#15 0x000000000041a90e in mom_server_all_update_stat ()
    at /afs/hpc2n.umu.se/lap/torque/2.3.0/src/torque-2.3.0/amd64_ubuntu804/src/resmom/mom_server.c:1014
#16 0x000000000041948d in main_loop ()
    at /afs/hpc2n.umu.se/lap/torque/2.3.0/src/torque-2.3.0/amd64_ubuntu804/src/resmom/mom_main.c:7331
#17 0x00000000004198ab in main (argc=-1467023296, argv=0x0)
    at /afs/hpc2n.umu.se/lap/torque/2.3.0/src/torque-2.3.0/amd64_ubuntu804/src/resmom/mom_main.c:7471
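
In the trace, frames #9 and #2 are both in __tz_convert: the interrupted
call and the one made from the handler. The second option avoids this by
keeping the handler free of logging. A minimal sketch, assuming the handler
only needs to report the event: the names toolong_deferred,
install_alarm_handler and drain_alarm_flag are hypothetical and not part of
TORQUE, and the real toolong may need to do more than log.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Set by the handler, read and cleared by the main loop. */
static volatile sig_atomic_t alarm_fired = 0;

/* Hypothetical replacement handler: only records that the alarm fired. */
static void toolong_deferred(int sig)
{
    (void)sig;
    alarm_fired = 1;
}

/* Install the handler for SIGALRM. Returns 0 on success, -1 on failure. */
static int install_alarm_handler(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    act.sa_handler = toolong_deferred;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;

    return sigaction(SIGALRM, &act, NULL);
}

/*
 * Called from normal (non-signal) context, e.g. once per main-loop pass.
 * Writes the deferred message if the alarm fired since the last call.
 * Returns 1 if a message was written, 0 otherwise.
 */
static int drain_alarm_flag(FILE *log)
{
    if (!alarm_fired)
        return 0;

    alarm_fired = 0;
    fprintf(log, "alarm fired: operation took too long\n");
    return 1;
}
```

The handler touches only a volatile sig_atomic_t, so it takes no libc
locks. The message is written later from the main loop, where calling a
logging function is safe.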


-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: ake at hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se


