[torqueusers] pbs_mom request, was Re: PBS_MOM kills running jobs when restarted

Chris Samuel csamuel at vpac.org
Mon Dec 14 20:01:45 MST 2009


----- "Douglas Wade Needham" <dneedham at cmu.edu> wrote:

> I am thinking that the vast majority of the hardware
> faults did not behave that way.

It only happened to us once, but the impact was severe
enough that we decided it wasn't worth taking the risk
in future.

We've never found that to be a problem. If we ever
do need to start pbs_mom on all the nodes, we just
have to run:

# pdsh -a pbs start

on the management node and that's it.

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

