[torqueusers] torque-1.2.0p6 - massive emails and job nanny

Garrick Staples garrick at usc.edu
Tue Sep 20 09:27:08 MDT 2005


On Mon, Sep 19, 2005 at 04:27:56PM -0700, Tony Vu alleged:
> Hello,
> 
> Like some people on this list, our users have received multiple  
> emails in the past when their jobs completed.  We just recently  
> upgraded to patch 6 and we are still seeing this problem.  After  
> browsing through this list, I read that the attribute "job_nanny"  
> needs to be turned on to alleviate this problem since by default it  
> is not set.
> 
> From what I understand, Torque will continually send multiple kill/ 
> cancel/delete signals to an exiting job if for some reason it cannot  
> communicate with the mother superior node on the initial try.  Is  
> this correct?  If I set the job_nanny attribute to true will only the  
> initial job delete signal be acknowledged and subsequent ones be  
> ignored?  Is this an option that needs to be turned on before  
> compiling Torque in the configure script or is support for it  
> compiled in by default?
> 
> Also, is a server restart required if this server attribute is set or  
> is it dynamic?

Yes, job_nanny causes subsequent deletes to be ignored.  It's not a
compile-time option; just type 'set server job_nanny=T' into qmgr.  The
setting is dynamic and does not require a server restart.
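
If you would rather script it than type it at the qmgr prompt, the same
command can be passed with qmgr's -c option.  This is only a rough sketch:
it assumes qmgr is on your PATH and that you run it as a TORQUE manager on
the server host.  The QMGR_CMD variable name and the grep check at the end
are my own illustration, not anything TORQUE requires.

```shell
#!/bin/sh
# Sketch: enable job_nanny on a running pbs_server without a restart.
# QMGR_CMD is a hypothetical variable name used only for this example.
QMGR_CMD='set server job_nanny=T'

if command -v qmgr >/dev/null 2>&1; then
    # Apply the server attribute non-interactively.
    qmgr -c "$QMGR_CMD"
    # Illustrative check: look for the attribute in the server config dump.
    qmgr -c 'print server' | grep job_nanny
else
    # qmgr is not installed here, so just show what would be run.
    echo "qmgr not found; would run: qmgr -c \"$QMGR_CMD\""
fi
```

On a host without qmgr the script only prints the command it would have
run, so it is safe to try on a workstation first.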

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California