[torqueusers] Memory resource limits and rlimits on Linux

David Chin chindw at wfu.edu
Thu Oct 14 15:17:24 MDT 2010


2010/10/14 "Mgr. Šimon Tóth" <SimonT at mail.muni.cz>:
>> 1. mom_over_limit()  in src/resmom/linux/mom_mach.c does NOT check
>> "mem", only vmem and pvmem. The patch that Anton Starikov attached to
>> the old thread did not make it into the source tree.
>
> What system? Linux node does check mem. See mom_set_limits().


This is for torque-2.5.2.

a) I am looking at src/resmom/linux/mom_mach.c mom_over_limit().
   The function mom_over_limit() does not check for "mem".

b) In mom_set_limits(), various rlimits are set based on
   Torque resources:

   line 1352 and onwards:

   The mem and/or pmem resource causes the system limit RLIMIT_DATA
   to be set. Linux 2.6 effectively IGNORES RLIMIT_DATA: the limit
   covers only the brk()/sbrk() data segment, and malloc() can get
   its memory through mmap() instead. Try it. I used the attached
   glom.c program, adding in appropriate setrlimit() calls. glom takes
   one integer argument, the number of MiB to allocate.

   As far as memory usage goes, the only limit that actually
   restricts how much a program can allocate on Linux is RLIMIT_AS:
   allocations that would exceed it fail. In /etc/security/limits.conf,
   this is known as "as". In Bash's ulimit, it is "virtual memory"
   (the -v option); in tcsh's limit, it is "vmemoryuse". I found
   various mailing list and forum discussions that confirm this, but
   only after days of hunting: I had been wondering why setting the
   "data" limit in /etc/security/limits.conf did nothing to restrict
   the growth of user programs.

   src/resmom/linux/mom_mach.c has a couple of places where
   RLIMIT_AS is set, but the code is either commented out,
   or ifdef'ed out.

   Here's an example interactive run of glom, from tcsh. The data
   limit is 700 MiB, yet the 900 MiB allocation succeeds:
   $ ./glom 900
   Memory allocation test
   ======================

   Current system limits:
       cur as = 15734530048 = 15005 MiB
       max as = 15734530048 = 15005 MiB
       cur data = 734003200 = 700 MiB
       max data = 734003200 = 700 MiB

   Allocating 900 MiB
        x[0] = 0
        x[14745600] = 14745600
        x[29491200] = 29491200
        x[44236800] = 44236800
        x[58982400] = 58982400
        x[73728000] = 73728000
        x[88473600] = 88473600
        x[103219200] = 103219200


   Another test, this time with RLIMIT_AS lowered to 700 MiB as well:

   Memory allocation test
   ======================

   Current system limits:
       cur as = 734003200 = 700 MiB
       max as = 734003200 = 700 MiB
       cur data = 734003200 = 700 MiB
       max data = 734003200 = 700 MiB

   Allocating 900 MiB
   allocation failed: Cannot allocate memory
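
The two runs above can be approximated with a sketch like the one below. This is not the attached glom.c; the helper names set_soft_limit() and try_alloc_mib() are made up for illustration. It lowers a soft rlimit and then tries to allocate and touch a given number of MiB. Note that the behaviour of RLIMIT_DATA depends on the kernel: the runs above are from a 2.6 kernel, and newer kernels (4.7 and later, according to the setrlimit(2) man page) also apply RLIMIT_DATA to mmap(), so results may differ there.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Hypothetical helper: lower only the soft limit for `resource` to
 * `bytes`, clamped to the current hard limit. Returns 0 on success,
 * -1 on error (errno is set by getrlimit/setrlimit). */
static int set_soft_limit(int resource, rlim_t bytes)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
        bytes = rl.rlim_max;
    rl.rlim_cur = bytes;
    return setrlimit(resource, &rl);
}

/* Hypothetical helper: allocate `mib` MiB and write to every byte so
 * the pages are really used. Returns 1 on success, 0 if malloc() fails. */
static int try_alloc_mib(size_t mib)
{
    size_t bytes = mib * 1024 * 1024;
    char *p = malloc(bytes);

    if (p == NULL) {
        fprintf(stderr, "allocation failed: %s\n", strerror(errno));
        return 0;
    }
    memset(p, 1, bytes);
    free(p);
    return 1;
}
```

Calling set_soft_limit(RLIMIT_DATA, ...) with 700 MiB and then try_alloc_mib(900) from main() corresponds to the first run; additionally calling set_soft_limit(RLIMIT_AS, ...) with 700 MiB corresponds to the second.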


>> 2. When setting mem, pmem, vmem, pvmem in the Torque script, only
>> "pmem" actually gets translated into an rlimit ("data"). The other
>> three resources (mem, vmem, and pvmem) are ignored. If I understand
>> correctly, that's correct behavior for mem and vmem, which are summed
>> limits over all processes in the job. But I would have thought setting
>> pvmem would have set the address space (aka virtual memory) limit.
>
> On Linux all four should be enforced. Limit gets stored into vmem_limit
> and mem_limit and then enforced.

That's not what I see. If I do not configure Maui to cancel jobs
for violating memory limits, a job will continue running even if
Torque is aware that it violates the memory limit. Doing a
"diagnose -j job_id" shows a message like this:

Name    State    Par  Proc  QOS  WCLimit     R  Min  User  Group  Account  QueuedTime  Network  Opsys   Arch    Mem    Disk  Procs  Class       Features

438594  Running  DEF  1     DEF  2:00:30:30  1  1    user  group  -        00:36:23    [NONE]   [NONE]  x86_64  >=512  >=0   NC0    [x86_64:1]  [ethernet]
WARNING:  job '438594' utilizes more memory than dedicated (1309 > 512)
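
The comparison behind that warning (memory in use versus memory requested) can be sketched as below. This is neither Torque nor Maui code: resident_bytes() and over_mem_limit() are hypothetical names, and the sketch looks at a single process through the Linux-specific /proc/<pid>/statm file, whereas "mem" is a limit summed over all processes in the job.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: resident set size of `pid` in bytes, read from
 * the second field of /proc/<pid>/statm (counted in pages).
 * Returns -1 if the file cannot be read or parsed. */
static long long resident_bytes(pid_t pid)
{
    char path[64];
    unsigned long long size_pages, resident_pages;
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%ld/statm", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%llu %llu", &size_pages, &resident_pages) != 2) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return (long long)resident_pages * sysconf(_SC_PAGESIZE);
}

/* Hypothetical check: 1 if the process is using more resident memory
 * than `limit_bytes`, 0 otherwise (including when usage is unknown). */
static int over_mem_limit(pid_t pid, long long limit_bytes)
{
    long long rss = resident_bytes(pid);

    return rss >= 0 && rss > limit_bytes;
}
```

A job-level check would add up resident_bytes() over every process in the job before comparing against the request.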


>> 3. While torque does cancel a job if it runs over its walltime
>> request, torque does nothing about jobs which run over their mem
>> request. It leaves that to the scheduler to cancel.
>
> Well, they shouldn't run over the limits in the first place.

True. But I have users whose code has memory leaks, or who haven't
profiled their code to measure how much memory it requires.

As I have demonstrated above, memory limits which get translated into
RLIMIT_DATA in Linux are ineffectual: glom allocated 900 MiB under a
700 MiB data limit.
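
For comparison, a per-process address-space cap, which is what I would have expected pvmem to map to, has to be applied in the child between fork() and exec(). The sketch below shows the idea only. It is not how Torque launches jobs, and run_with_as_limit() is a hypothetical name.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with RLIMIT_AS (soft and hard)
 * set to `as_bytes`. Returns the child's exit status, or -1 if the
 * child could not be started or was killed by a signal.
 * Exit status 126 means setrlimit() failed (for example because the
 * current hard limit is lower than as_bytes); 127 means exec failed. */
static int run_with_as_limit(char *const argv[], rlim_t as_bytes)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the limit set here is inherited across exec(). */
        struct rlimit rl;

        rl.rlim_cur = as_bytes;
        rl.rlim_max = as_bytes;
        if (setrlimit(RLIMIT_AS, &rl) != 0)
            _exit(126);
        execvp(argv[0], argv);
        _exit(127);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the limit is per process, it says nothing about the total used by a job with several processes; that still needs a summed check like the one "mem" and "vmem" call for.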

Cheers,
Dave

--
David Chin, Ph.D.
chindw at wfu.edu                  High Performance Computing Systems Analyst
Office: 336-758-2964            Wake Forest University
Mobile: 336-608-0793            Winston-Salem, NC
Email-to-txt: 3366080793 at mms.att.net
Google Talk: chindw at wfu.edu
Web: http://www.wfu.edu/~chindw/
http://www.google.com/profiles/chindw.wfu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glom.c
Type: text/x-csrc
Size: 2018 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torqueusers/attachments/20101014/ccd331e5/attachment.bin 

