[torquedev] Should a communication error between pbs_mom's kill a job ?
Bas van der Vlies
basv at sara.nl
Wed May 6 03:13:26 MDT 2009
Chris Samuel wrote:
> ----- "Michael Barnes" <barnes at jlab.org> wrote:
>
>> I've always modified the code so that a mom could not kill a job
>> whenever I install PBS/TORQUE. I cannot think of a reason why one mom
>> should terminate a job unless the job has actually gone over resource
>> limits as the function name implies.
>
> Thanks - nice to know I'm not the only one who feels that way!
>
> Does anyone else have any thoughts on this ?
>
> Any objections to me submitting a patch to revert
> this behaviour ?
>
Could this be made an option in the mom config, so that it can be turned on or off?
Regards
--
********************************************************************
* Bas van der Vlies e-mail: basv at sara.nl *
* SARA - Academic Computing Services Amsterdam, The Netherlands *
********************************************************************