[torqueusers] How to enforce pmem requirements

David Singleton David.Singleton at anu.edu.au
Wed Feb 18 13:44:36 MST 2009


Strictly speaking, pmem limits can't always stop nodes running out of
memory, even if enforced.

  a. A job can start an arbitrary number of processes none of which
     exceed the pmem limit.

  b. It is conceivable for apparently reasonable pmem limits to never
     be hit by a job that fills swap.  Consider a 4-CPU node with
     4GB of memory.  A reasonable pmem limit would apparently be
     1GB.  However, 4 processes growing their memory use at the same
     rate will never reach that limit.  They will start paging at
     some lower value (the OS and other processes also need memory)
     and can continue paging until the node runs out of swap.
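
To make (a) and (b) concrete, here is a rough sketch of the arithmetic in
Python.  It is only a toy model, not how TORQUE or the kernel actually
accounts for memory: it assumes all processes grow in lockstep and share
physical memory equally, and the swap size, the memory reserved for the OS
(os_reserved_gb) and the growth step are made-up values that are not from
any real node.

```python
def simulate(n_procs, ram_gb, swap_gb, pmem_gb,
             os_reserved_gb=0.25, step_gb=0.125):
    """Grow n_procs identical processes in lockstep and report which
    happens first: a process exceeds pmem, or the node runs out of swap.

    Toy model with assumed values: physical memory left after the OS
    reservation is shared equally between the processes.
    """
    usable_gb = ram_gb - os_reserved_gb
    virt_gb = 0.0  # virtual size of each process
    while True:
        virt_gb += step_gb
        # Resident size per process is capped by its share of physical RAM.
        resident_gb = min(virt_gb, usable_gb / n_procs)
        # Whatever does not fit in physical RAM has been paged out to swap.
        swap_used_gb = max(0.0, n_procs * virt_gb - usable_gb)
        if resident_gb > pmem_gb:
            return ("pmem limit hit", virt_gb)
        if swap_used_gb >= swap_gb:
            return ("swap exhausted", virt_gb)


if __name__ == "__main__":
    # Case (b): 4 processes on a 4GB node, pmem=1GB, assumed 4GB of swap.
    print(simulate(n_procs=4, ram_gb=4.0, swap_gb=4.0, pmem_gb=1.0))
    # A single process on the same node does trip the limit.
    print(simulate(n_procs=1, ram_gb=4.0, swap_gb=4.0, pmem_gb=1.0))
    # Case (a): many small processes, each far below pmem.
    print(simulate(n_procs=64, ram_gb=4.0, swap_gb=4.0, pmem_gb=1.0))
```

In this model the 4-process job never has more than about 0.94GB resident
per process, so the 1GB check never fires and the loop only ends when swap
is gone.  A single process hits the limit as expected, and 64 small
processes exhaust swap while each is still tiny.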

My other problem with pmem (and mem) limits is that they are unpredictable.
The same job running on the same node may stay well under the limit on
one run and hit the limit on another.  Process physical memory use
depends not only on the job/process but also on the system state.

Sorry for not being helpful.

David

Roger Moye wrote:
> 
> We have Torque/Moab running on one cluster and Torque/Maui on another.  
> We encourage our users to use the pmem option to specify their memory 
> requirements in their PBS batch scripts.  Is there a way to get the 
> scheduler to enforce these limits?  That is, if a job attempts to exceed 
> the pmem value we want the scheduler to kill the job just like it would 
> if it exceeded its walltime.  Currently we have a few users who have 
> their jobs exceed their pmem value and the result is trashed nodes 
> because the jobs have consumed too much memory.
> 
> Thanks in advance for any help or advice!
> -Roger
> 

