[torqueusers] ulimit -l
David Golden
dgolden at cp.dias.ie
Wed Mar 28 06:45:20 MDT 2007
On Wednesday 28 March 2007 00:16, Brock Palen wrote:
> >
> > Then the next thing is you have to restart the torque daemons on
> > each system. The reason for this is if they were started before
> > you made these changes they hold the old limits and all processes
> > they spawn will inherit those limits. So restarting will get
> > Torque going with the "unlimited" mode so child processes inherit
> > that.
> >
I find it safest to add explicit ulimit calls to the etc/init.d/torque_mom
startup script (or equivalent) too. Otherwise, because the pam_limits module
only applies limits at login, a rebooted node can autostart the mom with lower
limits, since init only gets the system's built-in defaults. When you then
restart the mom interactively, it picks up the PAM-set limits from root's login
session and appears to work fine (i.e. it gets large limits, and its children
inherit them). That is confusing the first time it happens.
e.g.
...
# Set some batch job limits
ulimit -l 4194304
ulimit -n 65535
ulimit -s 4194304
...
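To check which limits a running mom actually ended up with (after a reboot
versus after an interactive restart), something like the following may help.
It is only a sketch: it assumes a Linux kernel recent enough to provide
/proc/<pid>/limits, and show_limits is a made-up helper name, not part of
Torque.

```shell
#!/bin/sh
# show_limits is a hypothetical helper, not a Torque command.
# It prints the locked-memory, open-files and stack-size rows from
# /proc/<pid>/limits (assumes the kernel provides that file).
show_limits() {
    grep -E '^Max (locked memory|open files|stack size)' "/proc/$1/limits"
}

# Demonstrate on the current shell; substitute the mom's PID on a real node.
show_limits "$$"
```

On a compute node you would pass the mom's PID instead, e.g.
show_limits "$(pidof pbs_mom)", assuming the daemon runs under the name
pbs_mom. Comparing the output right after a reboot with the output after a
manual restart should show the difference described above.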
Note also that if you're running multithreaded code and set a stack limit of
"unlimited", at present the most common Linux threading implementation may fall
back to a very small stack limit for every thread except the primary one. For
now it's better to use a large but finite limit, which the threading
implementation then applies to all threads.
(N.B. the large finite limits above are a bit too large for 32-bit
platforms.)
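As a rough sketch of the "large but finite" stack advice, a startup script
could guard against an inherited "unlimited" soft stack limit like this. The
variable name and the 1048576 value are illustrative assumptions of mine, not
values from the script above; pick a size that suits your platform.

```shell
#!/bin/sh
# Illustrative value only (units as reported by "ulimit -s" in your shell,
# normally kilobytes); not taken from the original startup script.
finite_stack_kb=1048576

# If the soft stack limit is "unlimited", lower it to a finite value so
# that threads other than the primary one also get a large stack.
if [ "$(ulimit -s)" = "unlimited" ]; then
    ulimit -S -s "$finite_stack_kb"
fi

echo "stack limit now: $(ulimit -s)"
```

Only the soft limit is touched here, so a job can still raise it again up to
the hard limit if it really needs to.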