[torqueusers] Nodes that pbs reports are busy which are actually running a job

Garrick Staples garrick at usc.edu
Wed Aug 11 16:13:09 MDT 2010


On Wed, Aug 11, 2010 at 04:59:07PM -0500, Rahul Nabar alleged:
> On Wed, Aug 11, 2010 at 4:53 PM, Garrick Staples <garrick at usc.edu> wrote:
> >
> > Nope, it doesn't have a job. What you have are stale processes from an old job.
> 
> Thanks! I killed them. Does PBS clean up processes automatically after a
> job ends? Or is there a suitable flag? These are non-shared nodes, so
> there is no risk of stepping on another job's processes. All 8 cores are
> always assigned to the same user.
> 
> If not, is it an OK fix to put a pkill in the epilogue for all normal
> usernames? Any caveats? Or better ideas?

PBS will kill the processes that it knows about. This includes any children of
the batch script and any processes launched through the TM interface. Processes
started on other nodes through a remote shell (rsh or ssh) are unknown to PBS,
so it can't kill them. It is up to your epilogue to figure out what else needs
to be killed.
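
For nodes that are never shared, a rough sketch of such an epilogue cleanup
might look like the script below. It is only an illustration under several
assumptions that are not from this thread: that the epilogue receives the job
owner's user name as its second argument, that it runs as root on a Linux node
with /proc mounted, and that ordinary user accounts have UIDs of 1000 or more
(the MIN_UID cutoff is a made-up site policy, as are the function names).

```python
#!/usr/bin/env python3
"""Sketch: kill leftover processes of the job owner from a TORQUE epilogue.

Assumptions: argv[2] is the job owner's user name, the node is never
shared between users, and /proc is available (Linux).
"""
import os
import pwd
import signal
import sys

MIN_UID = 1000  # assumed cutoff: never touch system accounts below this


def list_processes(proc_dir="/proc"):
    """Return (pid, uid) pairs for every process visible under proc_dir."""
    procs = []
    for name in os.listdir(proc_dir):
        if not name.isdigit():
            continue  # not a process directory
        try:
            uid = os.stat(os.path.join(proc_dir, name)).st_uid
        except OSError:
            continue  # process exited while we were scanning
        procs.append((int(name), uid))
    return procs


def select_stale_pids(procs, uid, protected=()):
    """Pick the PIDs owned by uid, skipping system users and protected PIDs."""
    if uid < MIN_UID:
        return []
    return sorted(pid for pid, owner in procs
                  if owner == uid and pid not in protected)


def main(argv):
    if len(argv) < 3:
        return 0  # not invoked with epilogue arguments; do nothing
    try:
        uid = pwd.getpwnam(argv[2]).pw_uid
    except KeyError:
        return 0  # unknown user; nothing safe to do
    protected = {os.getpid()}
    for pid in select_stale_pids(list_processes(), uid, protected):
        try:
            os.kill(pid, signal.SIGKILL)
        except OSError:
            pass  # already gone, or not permitted
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

The selection step is kept separate from the killing step so the policy can be
checked without sending any signals. This is only safe when every core on the
node belongs to the same user; on shared nodes it would kill other jobs'
processes.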

-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

Life is Good!