[torqueusers] pbs_server segfaults

Jason Allen jallen at fnal.gov
Wed Nov 3 11:27:59 MST 2004


We just upgraded our 225-node cluster from torque-1.1.0p1 to
torque-1.1.0p4, and we are now seeing the pbs_server process crash
intermittently. We currently have about 300 jobs in the queue, and the
server dies every 2 to 30 minutes.

After running pbs_server under gdb, it looks like there is a problem
handling job requests. The crash is in req_register(), at line 498 of
req_register.c, where pjob is dereferenced:
 
Program received signal SIGSEGV, Segmentation fault.
0x0805f204 in req_register (preq=0x96fd0c8) at req_register.c:498
498         if (pjob->ji_modified)
(gdb) quit
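
For illustration only: the faulting line reads a field through pjob, so
one possible cause (an assumption, not confirmed against the torque
source) is that the job lookup returned NULL or a stale pointer before
that line ran. The standalone sketch below shows that failure mode and a
defensive check. The struct layout, lookup_job(), and job_needs_save()
are hypothetical stand-ins, not torque code; only the field name
ji_modified is taken from the trace above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the server's job record; only the field
 * seen in the backtrace (ji_modified) is modelled here. */
struct job {
    char ji_jobid[64];
    int  ji_modified;
};

/* Hypothetical lookup: returns NULL when no job matches the id. */
static struct job *lookup_job(struct job *jobs, size_t njobs,
                              const char *jobid)
{
    for (size_t i = 0; i < njobs; i++) {
        if (strcmp(jobs[i].ji_jobid, jobid) == 0)
            return &jobs[i];
    }
    return NULL;
}

/* Returns 1 if the job has unsaved changes, 0 if it does not,
 * and -1 if the job no longer exists. */
static int job_needs_save(struct job *jobs, size_t njobs,
                          const char *jobid)
{
    struct job *pjob = lookup_job(jobs, njobs, jobid);

    /* Guard: without this check, pjob->ji_modified would fault
     * whenever the lookup fails. */
    if (pjob == NULL)
        return -1;

    return pjob->ji_modified ? 1 : 0;
}
```

Printing pjob in gdb at the point of the crash, before quitting, would
show whether the pointer is NULL or points at freed memory, which would
confirm or rule out this guess.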


Has anyone else seen this? 

Thanks!

Jason Allen
Fermilab 

