[torquedev] memory limit enforcement by pbs_mom - REQUEST FOR
ake.sandgren at hpc2n.umu.se
Tue Feb 7 04:51:57 MST 2006
On Mon, 2006-02-06 at 15:42 +0100, Åke Sandgren wrote:
> On Tue, 2006-01-31 at 13:18 -0700, Dave Jackson wrote:
> > Greetings,
> > Currently, the pbs_mom enforces memory limits specified with '-l
> > pmem=X' but does not enforce memory limits specified with '-l mem=X'.
> > This is confusing for some users. I propose that we modify
> > mom_set_limits() to enforce stack and data segment limits if pmem is
> > specified or mem is specified and the job is serial.
> > This should have the impact that serial jobs now have mem limits
> > enforced. Are there any concerns with this change?
> After having read linux/mom_mach.c a couple of times I would suggest
> that pxxx limits get enforced with setrlimit whenever the corresponding
> xxx limit has been set, since if any process exceeds limit xxx the mom
> should kill the job anyway.
> Then we have the question of what (p)mem should really limit.
> As far as I know this could potentially be slightly different things on
> different architectures depending on what is actually possible.
> On Linux the only thing you can poll from outside is rss, which means mem
> should limit rss and nothing else. This would then mean that pmem
> shouldn't try to enforce anything (since the kernel doesn't enforce
> RLIMIT_RSS) and pmem and mem should be polled for together with
> walltime, cput and vmem. Then if any limit is requested, RLIMIT_DATA and
> RLIMIT_STACK should be raised (but probably not lowered) to the limit.
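
To make the "raised but not lowered" idea concrete, here is a minimal sketch in C. It is not taken from mom_mach.c or from any patch; the helper name raise_rlimit_to() is hypothetical, and it assumes the requested value is a finite byte count. It also clamps to the hard limit instead of raising it, which a mom running as root would not strictly have to do.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper (not part of TORQUE): make sure the soft limit for
 * `resource` is at least `wanted` bytes. A soft limit that is already
 * unlimited or already large enough is left untouched, so the limit is
 * only ever raised, never lowered.
 * Returns 0 on success, -1 if getrlimit() or setrlimit() fails.
 */
int raise_rlimit_to(int resource, rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;

    /* Already unlimited or high enough: nothing to do. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;

    /* Assumption: stay within the existing hard limit rather than raise it. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;

    rl.rlim_cur = wanted;
    return setrlimit(resource, &rl);
}
```

With a helper like this, a requested limit of N bytes would translate into raise_rlimit_to(RLIMIT_DATA, N) and raise_rlimit_to(RLIMIT_STACK, N) before the job is started.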
Attached is a first try (not tested, not even compiled) of what such a
change could look like.
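
Separate from the attachment, here is a minimal sketch of what polling rss from outside a process can look like on Linux. The helper names rss_bytes() and rss_over_limit() are hypothetical, and reading /proc/<pid>/statm is my assumption for illustration; it is not necessarily how the mom collects the value.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: resident set size of `pid` in bytes, taken from the
 * second field of /proc/<pid>/statm (Linux only). Returns -1 on error.
 */
long long rss_bytes(pid_t pid)
{
    char path[64];
    unsigned long size_pages, resident_pages;
    FILE *fp;
    int n;

    snprintf(path, sizeof(path), "/proc/%ld/statm", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    /* statm reports page counts: total program size, then resident pages. */
    n = fscanf(fp, "%lu %lu", &size_pages, &resident_pages);
    fclose(fp);
    if (n != 2)
        return -1;

    return (long long)resident_pages * sysconf(_SC_PAGESIZE);
}

/* Returns 1 if rss exceeds limit_bytes, 0 if not, -1 if rss is unreadable. */
int rss_over_limit(pid_t pid, long long limit_bytes)
{
    long long rss = rss_bytes(pid);

    if (rss < 0)
        return -1;
    return rss > limit_bytes ? 1 : 0;
}
```

A poll loop would call rss_over_limit() for each process of the job (or on the summed value for mem) at the same point where walltime, cput and vmem are checked.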
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: ake at hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 10233 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torquedev/attachments/20060207/09da6400/xxmem_limit-0001.bin