[torquedev] Should a communication error between pbs_moms kill a job?

Bas van der Vlies basv at sara.nl
Wed May 6 03:13:26 MDT 2009


Chris Samuel wrote:
> ----- "Michael Barnes" <barnes at jlab.org> wrote:
> 
>> I've always modified the code so that a mom could not kill a job
>> whenever I install PBS/TORQUE. I cannot think of a reason why one mom
>> should terminate a job unless the job has actually gone over resource
>> limits as the function name implies.
> 
> Thanks - nice to know I'm not the only one who feels that way!
> 
> Does anyone else have any thoughts on this ?
> 
> Any objections to me submitting a patch to revert
> this behaviour ?
> 

Could this be made an option in the mom config, so that the behaviour
can be turned on or off?

Regards


-- 
********************************************************************
*  Bas van der Vlies                    e-mail: basv at sara.nl       *
*  SARA - Academic Computing Services   Amsterdam, The Netherlands *
********************************************************************

