[torqueusers] Cleaning up stray processes from defunct jobs
Burkhard Bunk
bunk at physik.hu-berlin.de
Thu Sep 27 15:42:52 MDT 2012
Hi,
you should mention which implementation of MPI you are using.
"Dirty termination" of parallel runs was normal with mpich1, and there
was much discussion about possible ways (scripts) for cleanup.
With OpenMPI, however, this problem seems to be gone.
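
For the cases where cleanup is still needed, here is a minimal Python
sketch of the kind of script that was discussed. It is only an
illustration, not a tested tool. It assumes a Linux node where
"ps -eo pid=,uid=,comm=" is available, that ordinary users have
UIDs >= 1000, and that you can get the set of UIDs that currently own
jobs on the node from somewhere (active_uids below is a hypothetical
input; the function names are made up too). Any process of an ordinary
user who has no job on the node is treated as a stray.

```python
import os
import signal
import subprocess


def parse_ps(output):
    """Parse the output of `ps -eo pid=,uid=,comm=` into (pid, uid, comm) tuples."""
    procs = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3:
            pid, uid, comm = parts
            procs.append((int(pid), int(uid), comm.strip()))
    return procs


def find_strays(procs, active_uids, min_uid=1000):
    """Return processes of ordinary users who own no job on this node.

    min_uid=1000 is an assumption about where non-system accounts start.
    """
    return [p for p in procs if p[1] >= min_uid and p[1] not in active_uids]


def list_processes():
    """Run ps on the local node and return the parsed process list."""
    result = subprocess.run(
        ["ps", "-eo", "pid=,uid=,comm="],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)


def kill_strays(strays, dry_run=True):
    """Report each stray; send SIGTERM only when dry_run is False."""
    for pid, uid, comm in strays:
        print(f"stray: pid={pid} uid={uid} comm={comm}")
        if dry_run:
            continue
        try:
            os.kill(pid, signal.SIGTERM)
        except (ProcessLookupError, PermissionError):
            # Process already gone, or we are not allowed to signal it.
            pass


# Hypothetical usage on a compute node, with UID 1002 owning the only job:
#   kill_strays(find_strays(list_processes(), active_uids={1002}))
```

The dry-run default is deliberate: check the list before killing
anything. Something like this could be run by hand on the affected
nodes or called from a job epilogue script.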
Regards,
Burkhard Bunk.
----------------------------------------------------------------------
bunk at physik.hu-berlin.de Physics Institute, Humboldt University
fax: ++49-30 2093 7628 Newtonstr. 15
phone: ++49-30 2093 7980 12489 Berlin, Germany
----------------------------------------------------------------------
On Thu, 27 Sep 2012, Dave Ulrick wrote:
> On occasion I see a user run an MPI job via TORQUE that doesn't shut down
> cleanly and as a result leaves running processes behind to interfere with
> subsequent jobs that are assigned to its nodes. Any suggestions on how I
> might go about simplifying the task of finding and killing these
> processes?
>
> Thanks,
> Dave
> --
> Dave Ulrick
> d-ulrick at comcast.net
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>