[torqueusers] ulimit -l

David Golden dgolden at cp.dias.ie
Wed Mar 28 06:45:20 MDT 2007


On Wednesday 28 March 2007 00:16, Brock Palen wrote:

> >
> > Then the next thing is you have to restart the torque daemons on
> > each system.  The reason for this is if they were started before
> > you made these changes they hold the old limits and all processes
> > they spawn will inherit those limits.  So restarting will get
> > Torque going with the "unlimited" mode so child processes inherit
> > that.
> >

I find it safest to also add explicit ulimit calls to the
etc/init.d/torque_mom startup script (or equivalent). Otherwise,
because the pam_limits module only applies limits at a login, a mom
autostarted when you reboot a node can still come up with lower limits:
init is getting the system's, er, default defaults. When you then
restart the mom interactively, it gets the pam-set limits from root's
login session and appears to work fine (i.e. it gets large limits, and
its children inherit them). Confusing the first time it happens...
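
The inheritance behaviour described above can be seen with a small
sketch like the following. It is only an illustration, not part of any
Torque script: the value 256 is arbitrary, and the open-files soft limit
is used simply because any user can lower it.

```shell
#!/bin/sh
# Soft limit on open files in the current shell (stands in for the
# limits that init hands to an autostarted daemon).
parent_limit=$(ulimit -S -n)

# In a subshell, change the soft limit and then spawn a child process.
# The child reports the limit it inherited from its parent.
child_limit=$( (ulimit -S -n 256 && sh -c 'ulimit -S -n') )

# The change made in the subshell does not leak back to the outer shell.
after_limit=$(ulimit -S -n)

echo "parent: $parent_limit"
echo "child of modified shell: $child_limit"
echo "parent afterwards: $after_limit"
```

A daemon behaves the same way: whatever limits are in force in the
process that starts it are the ones it, and everything it spawns, will
carry.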

e.g.
...
# Set some batch job limits
ulimit -l 4194304
ulimit -n 65535
ulimit -s 4194304
..
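
To check which limits a running daemon actually ended up with, something
along these lines may help. It is a sketch that assumes a Linux kernel
recent enough to provide /proc/<pid>/limits; the helper name show_limits
is made up for this example. Note that /proc reports some values in
bytes, while "ulimit -l" and "ulimit -s" work in kilobytes.

```shell
#!/bin/sh
# show_limits PID: print the locked-memory, open-files and stack limits
# of a running process, as recorded in /proc/PID/limits (Linux only).
show_limits() {
    grep -E '^Max (locked memory|open files|stack size)' "/proc/$1/limits"
}

# Example: inspect the current shell. For the mom, one might instead
# pass its PID, e.g. "$(pgrep -o pbs_mom)" (assumes pgrep is installed
# and the daemon is named pbs_mom on your system).
show_limits "$$"
```

Running this against the mom right after a reboot, and again after an
interactive restart, shows whether the two starts got different limits.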

Note also that if you're running multithreaded code and set a stack
limit of "unlimited", the most common Linux threading implementation
may at present fall back to a teeny tiny stack limit for every thread
except the primary one. For now it's better to use a large but finite
limit, which the threading implementation then applies to all threads.

(n.b. the above large finite limits are a bit too large for 32-bit 
platforms...)
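
One way to guard against an inherited "unlimited" stack in a startup
script is sketched below. The variable name FINITE_STACK_KB and its
1 GiB value are assumptions for illustration only; pick whatever suits
your platform and jobs.

```shell
#!/bin/sh
# Hypothetical finite stack size in kilobytes (1 GiB); tune per site.
FINITE_STACK_KB=1048576

# If the inherited stack limit is "unlimited", replace it with a large
# but finite value before the daemon is started.
if [ "$(ulimit -s)" = "unlimited" ]; then
    ulimit -s "$FINITE_STACK_KB"
fi

echo "stack limit now: $(ulimit -s)"
```

If the limit is already finite the script leaves it alone, so it should
be safe to run before an explicit "ulimit -s" line as well.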



