[torqueusers] Cleaning up stray processes from defunct jobs
David Singleton
David.Singleton at anu.edu.au
Thu Sep 27 16:04:15 MDT 2012
On 09/28/2012 07:27 AM, Dave Ulrick wrote:
> On occasion I see a user run an MPI job via TORQUE that doesn't shut down
> cleanly and as a result leaves running processes behind to interfere with
> subsequent jobs that are assigned to its nodes. Any suggestions on how I
> might go about simplifying the task of finding and killing these
> processes?
>
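Only support MPI implementations that launch their remote processes through
TORQUE's tm (task manager) API, so that pbs_mom starts every process itself
and can kill them all when the job ends. You'll have to block ssh between
nodes to enforce this; otherwise users can still start MPI ranks that
pbs_mom never sees.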
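
For nodes that already have leftovers, a rough way to list candidates is to
compare the process table against the users who currently have a job on the
node. The following is only a sketch under assumptions, not anything that
ships with TORQUE: the names Proc, parse_ps and find_strays are hypothetical,
the input is assumed to be the output of `ps -eo pid=,uid=,comm=`, the set of
UIDs with active jobs has to be supplied by you (for example from your own
qstat wrapper), and the cutoff of 1000 for ordinary user accounts is a guess
that varies between distributions.

```python
from typing import List, NamedTuple, Set


class Proc(NamedTuple):
    pid: int
    uid: int
    command: str


def parse_ps(output: str) -> List[Proc]:
    """Parse the output of `ps -eo pid=,uid=,comm=` into Proc records."""
    procs = []
    for line in output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        pid, uid, command = fields
        procs.append(Proc(int(pid), int(uid), command.strip()))
    return procs


def find_strays(procs: List[Proc], active_uids: Set[int],
                min_uid: int = 1000) -> List[Proc]:
    """Return processes owned by ordinary users with no active job here.

    min_uid is an assumed boundary between system and user accounts.
    """
    return [p for p in procs
            if p.uid >= min_uid and p.uid not in active_uids]


if __name__ == "__main__":
    # Hypothetical ps output: UID 1002 has a running job, UID 1001 does not.
    sample = (
        "    1     0 init\n"
        " 4321  1001 mpi_worker\n"
        " 4400  1002 mpi_worker\n"
        "  900    81 dbus-daemon\n"
    )
    for proc in find_strays(parse_ps(sample), active_uids={1002}):
        print(proc.pid, proc.uid, proc.command)
```

This only reports candidates; it kills nothing, and it cannot tell a stray
from a legitimate process when the same user has both a finished and a
running job on the node, so review the list before acting on it.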
Cheers
David