[torquedev] pbs_server [torque-2.1.8] crash on amd64

Rajiv Chittajallu rajive at ieee.org
Thu Jun 28 10:45:45 MDT 2007


The pbs_server on one of our amd64 boxes crashes occasionally. Some of the
nodes are running a 32-bit pbs_mom. Everything works fine after a restart.

Has anyone seen similar failures? Here are the kernel log line and the backtrace.

Jun 28 13:09:52 node0  pbs_server[16596]: segfault at 0000000000000000 rip 0000002a9582a513 rsp 0000007fbffff0e8 error 4

(gdb) bt
#0  0x0000002a9582a513 in strstr () from /lib64/tls/libc.so.6
#1  0x000000000040b042 in sync_node_jobs (np=0x12bfca0, jobstring_in=Variable "jobstring_in" is not available.
) at node_manager.c:828
#2  0x000000000040b4d6 in is_stat_get (np=0x12bfca0) at node_manager.c:1169
#3  0x000000000040c36c in is_request (stream=1348, version=Variable "version" is not available.
) at node_manager.c:1940
#4  0x0000000000410330 in do_rpp (stream=1348) at pbsd_main.c:317
#5  0x00000000004103e2 in rpp_request (fd=0) at pbsd_main.c:363
#6  0x0000002a95689c41 in wait_request (waittime=Variable "waittime" is not available.
) at ../Libnet/net_server.c:320
#7  0x00000000004113eb in main (argc=Variable "argc" is not available.
) at pbsd_main.c:1123
(gdb) 
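
A note on reading the trace: the fault address is 0 and the rip value matches
frame #0, so the crash is a read of address 0 inside strstr(). That is
consistent with one of the arguments passed to strstr() from sync_node_jobs()
being NULL, although the trace alone does not show which one or why. The
sketch below only illustrates the kind of guard that avoids such a fault. The
helper name safe_strstr is hypothetical and is not part of TORQUE, and nothing
here is taken from node_manager.c.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a TORQUE function): behaves like strstr(),
 * but returns NULL instead of dereferencing a NULL haystack or needle.
 */
const char *safe_strstr(const char *haystack, const char *needle)
{
    if (haystack == NULL || needle == NULL)
        return NULL;   /* treat a missing string as "no match" */

    return strstr(haystack, needle);
}
```

A caller would then treat a NULL result as "job not listed" rather than
crashing. Whether silently skipping is the right behaviour for the server is a
separate question, since a NULL here may point to a malformed status message
from a pbs_mom.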

Thanks,
Rajiv

