[torqueusers] torque-1.2.0p6 - massive emails and job nanny
tonyv at sdsc.edu
Wed Sep 21 17:39:48 MDT 2005
There are no special configure options required. It worked fine for
me when I issued "set server job_nanny=true" in qmgr.
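
For anyone who would rather script this than type it into an interactive
qmgr session, here is a minimal sketch. It assumes qmgr is on your PATH and
that you have manager privileges on the server. The nanny_directive helper
is a made-up name used only for illustration, not part of Torque.

```shell
#!/bin/sh
# Hypothetical helper (not part of Torque): builds the qmgr directive text
# for the job_nanny server attribute from a value such as "true" or "false".
nanny_directive() {
    printf 'set server job_nanny=%s' "$1"
}

if command -v qmgr >/dev/null 2>&1; then
    # Apply the attribute non-interactively with qmgr's -c option.
    qmgr -c "$(nanny_directive true)"
    # Dump the server configuration and look for the attribute.
    qmgr -c 'print server' | grep job_nanny
else
    # qmgr is not installed here; just show the command that would be run.
    echo "qmgr not found; would run: qmgr -c \"$(nanny_directive true)\""
fi
```

Since the attribute is dynamic, no server restart should be needed
afterward.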
HPC Systems Engineer
San Diego Supercomputer Center
tonyv at sdsc.edu
On Sep 21, 2005, at 3:16 PM, Simon Gao wrote:
> When I tried enabling job_nanny, I got the following error:
> Qmgr: set server job_nanny=T
> qmgr: Syntax error - cannot locate attribute
> set server job_nanny=T
> The torque version is 1.2.0p6. Are there parameters required while
> compiling torque to add the attribute?
> Simon Gao
> Garrick Staples wrote:
>> On Mon, Sep 19, 2005 at 04:27:56PM -0700, Tony Vu alleged:
>>> Like some people on this list, our users have received multiple
>>> emails in the past when their jobs completed. We just recently
>>> upgraded to patch 6 and we are still seeing this problem. After
>>> browsing through this list, I read that the attribute
>>> "job_nanny" needs to be turned on to alleviate this problem
>>> since by default it is not set.
>>> From what I understand, Torque will continually send multiple
>>> kill/cancel/delete signals to an exiting job if for some reason
>>> it cannot communicate with the mother superior node on the
>>> initial try. Is this correct? If I set the job_nanny attribute
>>> to true will only the initial job delete signal be acknowledged
>>> and subsequent ones be ignored? Is this an option that needs to
>>> be turned on before compiling Torque in the configure script or
>>> is support for it compiled in by default?
>>> Also, is a server restart required if this server attribute is
>>> set or is it dynamic?
>> Yes, job_nanny causes subsequent deletes to be ignored. It's not a
>> compile-time option, just type 'set server job_nanny=T' into
>> qmgr. It does not require a server restart.
>> torqueusers mailing list
>> torqueusers at supercluster.org