[torqueusers] jobs completing with processes still running - SOLVED

Chris Samuel csamuel at vpac.org
Thu May 8 21:04:57 MDT 2008


----- "Jerry Smith" <jdsmit at sandia.gov> wrote:

> I would have to second this thought (OpenMPI, as well as OSC's mpiexec
> for your current setup).

I've started seeing the occasional OpenMPI "mpirun" get
left behind on nodes after its job has finished, for no
apparent reason. Dunno why it's not getting killed by PBS.

It's not the children, just the original mpirun command.

It doesn't use any CPU, so it doesn't bother us much; we
just get nagged nightly when our housekeeping cron job
notices user processes on nodes they don't have jobs on.
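
A check of that kind could be sketched roughly as follows in
Python. This is only an illustration, not the actual cron job
described above. The names parse_ps, find_strays, job_users and
system_users are all hypothetical, and working out which users
have jobs on a node (job_users) is assumed to be done elsewhere,
since that depends on the local Torque setup.

```python
import subprocess


def parse_ps(output):
    """Parse `ps` output with user, pid, command columns into tuples."""
    procs = []
    for line in output.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) == 3:
            user, pid, comm = parts
            procs.append((user, int(pid), comm))
    return procs


def find_strays(procs, job_users, system_users):
    """Return processes whose owner has no job here and is not a system user.

    job_users: assumed to be the set of users with a job on this node.
    system_users: assumed local allow-list (root, daemon accounts, ...).
    """
    return [p for p in procs
            if p[0] not in job_users and p[0] not in system_users]


def list_processes():
    """Run ps with empty column headers so no header line is printed."""
    result = subprocess.run(
        ["ps", "-e", "-o", "user=", "-o", "pid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)


if __name__ == "__main__":
    # Placeholder sets; a real check would fill job_users from the
    # batch system rather than hard-coding it.
    job_users = set()
    system_users = {"root", "daemon"}
    for user, pid, comm in find_strays(list_processes(),
                                       job_users, system_users):
        print(f"stray process: user={user} pid={pid} cmd={comm}")
```

Run from cron on each node, something like this would list a
leftover mpirun whose owner no longer has a job there, which is
the situation described above.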

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

