[torqueusers] Nodes that pbs reports are busy which are actually running a job
Garrick Staples
garrick at usc.edu
Wed Aug 11 16:13:09 MDT 2010
On Wed, Aug 11, 2010 at 04:59:07PM -0500, Rahul Nabar alleged:
> On Wed, Aug 11, 2010 at 4:53 PM, Garrick Staples <garrick at usc.edu> wrote:
> >
> > Nope, it doesn't have a job. What you have are stale processes from an old job.
>
> Thanks! I killed them. Does PBS clean up processes automatically after a
> job ends, or is there a suitable flag? These are non-shared nodes, so there
> is no risk of stepping on another job's processes. All 8 cores are always
> assigned to the same user.
>
> If not, is it an OK fix to put a pkill in the epilogue for all normal
> usernames? Any caveats? Or better ideas?
PBS will kill the processes that it knows about. This includes any children
of the batch script and any processes launched through the TM interface. Any
processes started on other nodes through a remote shell are unknown to PBS
and can't be killed by it. It is up to your epilogue to figure out what else
needs to be killed.
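
As a rough illustration of the epilogue idea discussed above, here is a minimal
sketch in Python. It is not part of TORQUE and rests on several assumptions:
the node runs Linux with /proc mounted, the node is dedicated to one user at a
time (as described in the question), the epilogue receives the job owner's
user name as its second argument, and "normal" accounts have a UID of at least
1000. The names MIN_UID, pids_owned_by and kill_leftovers are hypothetical.

```python
#!/usr/bin/env python3
"""Sketch of an epilogue that kills a job owner's leftover processes."""
import os
import pwd
import signal
import sys

MIN_UID = 1000  # assumption: ordinary user accounts start at this UID


def pids_owned_by(uid, proc_root="/proc"):
    """Return the sorted PIDs under proc_root whose directory is owned by uid."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "cpuinfo"
        try:
            if os.stat(os.path.join(proc_root, entry)).st_uid == uid:
                pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were scanning
    return sorted(pids)


def kill_leftovers(username, proc_root="/proc", kill=os.kill):
    """SIGKILL every process owned by username; return the PIDs signalled.

    Does nothing for unknown users or for accounts below MIN_UID, so
    system daemons are never touched.
    """
    try:
        uid = pwd.getpwnam(username).pw_uid
    except KeyError:
        return []
    if uid < MIN_UID:
        return []
    killed = []
    for pid in pids_owned_by(uid, proc_root):
        if pid == os.getpid():
            continue  # never signal the epilogue itself
        try:
            kill(pid, signal.SIGKILL)
            killed.append(pid)
        except OSError:
            pass  # already gone, or not permitted
    return killed


if __name__ == "__main__":
    # Assumption: argv[1] is the job id and argv[2] is the job owner's name.
    if len(sys.argv) > 2:
        kill_leftovers(sys.argv[2])
```

This is only safe when a node never runs two jobs, or an interactive login,
for the same user at the same time; on shared nodes it would kill processes
that belong to other jobs.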
--
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California
Life is Good!