[torquedev] Should a communication error between pbs_mom's kill a job ?

Chris Samuel csamuel at vpac.org
Tue May 5 18:38:33 MDT 2009


----- "Michael Barnes" <barnes at jlab.org> wrote:

> I've always modified the code so that a mom could not kill a job
> whenever I install PBS/TORQUE. I cannot think of a reason why one mom
> should terminate a job unless the job has actually gone over resource
> limits as the function name implies.

Thanks - nice to know I'm not the only one who feels that way!

Does anyone else have any thoughts on this?

Any objections to me submitting a patch to revert
this behaviour?

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

