[torqueusers] job exceed memory limit without been killed

Anton Starikov ant.starikov at gmail.com
Mon Mar 22 11:44:01 MDT 2010


I understand perfectly well what pmem and pvmem do.
The problem here is mem and vmem. They don't, and shouldn't, set ulimits, because they limit the total memory consumed by all processes of a job, which can be distributed over a bunch of nodes.

If I have 16 processes on a 16-core node, I don't really want to set pmem when the program's memory usage is not uniform across processes. And setting the per-process ulimit to the value of "mem" makes no sense either.
But tracking the combined memory of all processes against the "mem" limit is perfectly OK.

And if these 16 processes are distributed over 2 nodes, the "mem" limit works as it should. But if they all reside on one node, it doesn't.
Regardless of which behavior you believe to be correct, this is a bug. It should either ignore the total mem limit in both cases or enforce it in both cases, but not behave as it does now.
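
To make the two points above concrete, here is a minimal Python sketch of job-wide accounting. It is not pbs_mom's actual code: the function names, the node names, the 16 GiB limit and the per-process figures are all made up for illustration. It shows that a uniform per-process share of "mem" rejects an uneven job that is well under the total, and that a check on the summed usage gives the same answer whether the processes sit on one node or two.

```python
GiB = 1024 ** 3

def total_usage(rss_by_node):
    """Sum resident memory over every process of the job, on every node."""
    return sum(sum(procs) for procs in rss_by_node.values())

def over_mem_limit(rss_by_node, mem_limit):
    """Job-wide check: where the processes run must not change the answer."""
    return total_usage(rss_by_node) > mem_limit

def over_uniform_pmem(rss_by_node, mem_limit, nprocs):
    """Per-process check with pmem = mem / nprocs, as a ulimit would enforce it."""
    pmem = mem_limit // nprocs
    return any(rss > pmem for procs in rss_by_node.values() for rss in procs)

# Hypothetical 16-process job: one process uses 4 GiB, the other 15 use 0.5 GiB each.
usage = [4 * GiB] + [GiB // 2] * 15
one_node = {"node01": usage}
two_nodes = {"node01": usage[:8], "node02": usage[8:]}
mem_limit = 16 * GiB  # made-up job-wide "mem" request

print(total_usage(one_node) / GiB)                 # 11.5
print(over_mem_limit(one_node, mem_limit))         # False
print(over_mem_limit(two_nodes, mem_limit))        # False
print(over_uniform_pmem(one_node, mem_limit, 16))  # True
```

The job stays under its total in both layouts, yet a 1 GiB per-process ulimit would stop the 4 GiB process. That is why only the summed check fits a job with uneven memory usage.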

Anton.


On Mar 20, 2010, at 1:18 PM, Chris Samuel wrote:

> On Fri, 19 Mar 2010 12:26:05 am Anton Starikov wrote:
> 
>> Which means that PBS_MOM already registered memory usage above limit and
>> even updated this information on server, but didn't react and kill the
>> job.
>> 
>> What can be wrong? Do I miss something in the config?
> 
> I think you are misunderstanding what the mem/vmem/pmem/pvmem limits in Torque 
> actually do - they apply resource limits (ulimits in the shell, RLIMIT's in 
> terms of kernel APIs) to the processes that are launched by pbs_mom.
> 
> The problem is that in the old days malloc() in glibc just called brk() and in 
> the Linux kernel brk() obeys the RLIMIT_DATA limit which pbs_mom sets for mem 
> and pmem.
> 
> But then glibc changed and now calls mmap() for allocations over a certain 
> size and mmap() in the Linux kernel does not observe RLIMIT_DATA.
> 
> Perhaps the simplest fix is to translate any reference to mem or pmem into vmem 
> or pvmem, as they will set the RLIMIT_AS limit, which is observed by 
> mmap(), or use the Maui/Moab tricks which use the data reported by the 
> node to decide whether or not to kill the job.
> 
> For more information on the various RLIMIT's see the setrlimit() manual page.
> 
> cheers!
> Chris
> -- 
> Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
> 
> This email may come with a PGP signature as a file. Do not panic.
> For more info see: http://en.wikipedia.org/wiki/OpenPGP
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
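
The brk()/mmap() point in the quoted message can be checked directly. Below is a rough Python sketch, assuming Linux and Python 3. It forks a child, lowers one soft rlimit in the child only, and attempts an anonymous private mmap(), which is what glibc's malloc() uses for large requests. The helper name and the sizes are arbitrary choices. Note that the setrlimit() manual page says RLIMIT_DATA also applies to mmap() since Linux 4.7, so the RLIMIT_DATA result depends on the kernel you run it on.

```python
import mmap
import os
import resource

MiB = 1024 * 1024

def mmap_succeeds(alloc_bytes, limit=None, limit_bytes=None):
    """Fork a child, optionally lower one soft rlimit there, then try mmap().

    Returns True if the anonymous mapping was created, False if it was refused.
    """
    pid = os.fork()
    if pid == 0:
        # Child: any limit set here never affects the parent process.
        status = 1
        try:
            if limit is not None:
                _, hard = resource.getrlimit(limit)
                resource.setrlimit(limit, (limit_bytes, hard))
            # Anonymous private mapping: memory obtained without brk().
            region = mmap.mmap(-1, alloc_bytes,
                               flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
            region.close()
            status = 0
        except (OSError, ValueError, MemoryError):
            pass  # the limit could not be set or the mapping was refused
        finally:
            os._exit(status)
    # Parent: collect the child's exit status.
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0

print("no limit:   ", mmap_succeeds(64 * MiB))
print("RLIMIT_AS:  ", mmap_succeeds(512 * MiB, resource.RLIMIT_AS, 256 * MiB))
print("RLIMIT_DATA:", mmap_succeeds(512 * MiB, resource.RLIMIT_DATA, 256 * MiB))
```

On Linux the RLIMIT_AS attempt should be refused. The RLIMIT_DATA line shows whether the running kernel lets mmap() slip past the data limit, which is the behavior described for the kernels of that time.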


