[torqueusers] consensus on memory enforcement?
dgolden at cp.dias.ie
Tue Jun 6 04:28:51 MDT 2006
On 2006-06-02 16:59:16 -0400, garrick at speculation.org wrote:
> I've got users that are abusing memory usage on Linux nodes and I'd
> like to clear this up in TORQUE.
> What's the consensus on what changes should be made? Linux should use
> RLIMIT_AS and nothing else? How should the definitions of mem, pmem,
> pvmem, and vmem be clarified? Ake seems to be our resident expert on
> these matters.
Well, on a slightly related note, can I raise the
stack thing? Until fairly recently, stack limits were adjusted
as part of the mem stuff, IIRC. That was probably "wrong",
but it allowed a workaround for stack-hogs without code
modification (e.g. fortran compiled with a certain compiler).
While it is possible to work around this with an in-job
setrlimit, an in-job user-reset ulimit only applies to the
mother superior's daughter unless you take pains to do it in
every parallel process (things like mpiexec are spawning via TM).
And setting the ulimits of the moms themselves for inheritance
by children applies to all jobs, not just known-stackhoggy ones.
So:
How about a separate stackmem / pstackmem, that would change
RLIMIT_STACK per-job? Note that e.g. aborting the job on overrun
might well be the Wrong Thing to do though - that might
be overloading resource tracking (note that RLIMIT_STACK
counts towards RLIMIT_AS) for what should be just a job
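
For illustration, the in-job workaround mentioned above could be
sketched as a small wrapper that raises the soft RLIMIT_STACK and
then execs the real program, so every process launched through it
gets the bigger stack. This is only a sketch under assumptions: the
name stackwrap.py and the helper raise_stack_soft_limit are made up
for this example and are not part of TORQUE, and it can only raise
the soft limit up to whatever hard limit was inherited from the mom.

```python
import os
import resource
import sys


def raise_stack_soft_limit(wanted_bytes):
    """Raise the soft RLIMIT_STACK toward wanted_bytes (a finite byte count).

    The soft limit is never lowered and never pushed past the hard limit.
    Returns the (soft, hard) pair in force afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    # An unprivileged process cannot exceed the hard limit it inherited.
    if hard == resource.RLIM_INFINITY:
        new_soft = wanted_bytes
    else:
        new_soft = min(wanted_bytes, hard)

    # Nothing to do if the stack is already unlimited or already big enough.
    if soft == resource.RLIM_INFINITY or new_soft <= soft:
        return soft, hard

    try:
        resource.setrlimit(resource.RLIMIT_STACK, (new_soft, hard))
    except (ValueError, OSError):
        # The kernel refused the change; carry on with the old limits.
        pass
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    # Hypothetical usage: stackwrap.py <stack-bytes> <program> [args...]
    if len(sys.argv) < 3:
        sys.exit("usage: stackwrap.py <stack-bytes> <program> [args...]")
    raise_stack_soft_limit(int(sys.argv[1]))
    # Replace this process with the real program; it inherits the new limit.
    os.execvp(sys.argv[2], sys.argv[2:])
```

In a job script the parallel launcher would then start the wrapper
instead of the binary, e.g. "mpiexec python stackwrap.py 536870912
./a.out" (binary name and size are placeholders), which is exactly
the "take pains in every parallel process" approach and shows why a
per-job resource would be more convenient.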