[torqueusers] Undead job on node

Joerg Blank j.blank at fz-juelich.de
Fri Sep 16 08:43:27 MDT 2011


Hello everyone,

One of my colleagues submitted an array job with 1300 tasks to our
Torque 2.5.8/Maui cluster. That nearly brought the scheduler to a halt and
also left 2 nameless zombie jobs on 2 nodes.
These 2 jobs do not show up in qstat or in Maui, but each seems to occupy
a processor slot, so Maui defers every job it schedules onto that slot.

See the jobs line in this pbsnodes output:

$ pbsnodes c-14
c-14
     state = offline
     np = 8
     properties = barcelona,bigmem
     ntype = cluster
     jobs = 6/
     status = rectime=1316183711,varattr=,jobs=,state=free,netload=30221252289,gres=,loadave=0.27,ncpus=8,physmem=66180812kb,availmem=131971012kb,totmem=133289668kb,idletime=705610,nusers=0,nsessions=? 0,sessions=? 0,uname=Linux c-14 2.6.32-5-amd64 #1 SMP Tue Jun 14 09:42:28 UTC 2011 x86_64,opsys=linux
     gpus = 0
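
For anyone chasing the same symptom, here is a minimal Python sketch that
scans pbsnodes output for slots listed without a job ID, like the "6/"
entry above. The function name find_orphan_slots is hypothetical, and the
"slot/jobid" entry format is an assumption inferred from the output shown
here, not taken from the Torque documentation. It expects raw pbsnodes
output in which each attribute sits on a single indented line.

```python
import re

# One entry of a "jobs =" line. Assumption: entries are comma-separated
# and look like "<slot>/<jobid>"; an orphan has nothing after the slash.
ENTRY = re.compile(r"^\s*(\d+)/(.*?)\s*$")


def find_orphan_slots(pbsnodes_text):
    """Return {node_name: [slot, ...]} for slots with an empty job ID."""
    orphans = {}
    node = None
    for line in pbsnodes_text.splitlines():
        # An unindented, non-empty line starts a new node record.
        if line and not line[0].isspace():
            node = line.strip()
            continue
        # Attribute lines look like "     key = value".
        key, _, value = line.partition("=")
        if node is None or key.strip() != "jobs":
            continue
        for entry in value.split(","):
            match = ENTRY.match(entry)
            if match and match.group(2) == "":
                orphans.setdefault(node, []).append(int(match.group(1)))
    return orphans


if __name__ == "__main__":
    import sys

    # Example: pbsnodes -a | python find_orphans.py
    for name, slots in sorted(find_orphan_slots(sys.stdin.read()).items()):
        print(name, "orphan slots:", slots)
```

This only reports the affected nodes and slots; it does not attempt to
clear them.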

Regards,
Jörg Blank


