[torqueusers] vmem and pvmem
David.Singleton at anu.edu.au
Fri Feb 24 16:50:00 MST 2012
On 02/25/2012 09:00 AM, Martin Siegert wrote:
> On Fri, Feb 24, 2012 at 11:19:37AM +0100, "Mgr. Šimon Tóth" wrote:
>>> Core_req       vmem  pvmem  ulimit-v  RPT
>>> nodes=1:ppn=2  1gb   256mb  256mb     512mb
>>> procs=2        1gb   256mb  256mb     1gb
>>> nodes=1:ppn=2  1gb   4gb    1gb       4gb
>>> procs=2        1gb   4gb    1gb       4gb
>>> nodes=1:ppn=2  1gb   -      1gb       512mb
>>> procs=2        1gb   -      1gb       1gb
>>> So the ulimit value that influences whether a task can allocate
>>> memory, is set as the lower of the vmem and pvmem values. That
>>> makes some sense - at least more sense than taking the larger
>>> value. What doesn't make sense is allowing pvmem to be higher
>>> than vmem in the first place - in that case torque should probably
>>> reject the job or 'fix' one of the settings but leaving it as is
>>> might not be so bad, except for moab's behaviour (keep reading).
>> No. The logic is as follows:
>> * if pvmem (or pmem) is set,
>>   then set the corresponding ulimit to the pvmem (pmem) value
>> * if pvmem (or pmem) isn't set,
>>   then set the corresponding ulimit to the vmem (mem) value
>> Note that using pvmem is mostly pointless. On Linux this represents
>> address space, not virtual memory.
>> You can use vmem as virtual memory, but even that is extremely confusing.
> I do not understand this comment. Both pvmem and vmem requests will
> result in RLIMIT_AS getting set.
I disagree with vmem setting RLIMIT_AS if that is what is happening.
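For reference, the two rules quoted above (the lower of vmem and pvmem, versus
pvmem whenever it is set) can be checked against the ulimit values in the table
at the top of this message. What follows is only a minimal Python sketch: the
function names are illustrative and are not anything in torque, sizes are
written in MB, and None stands for "pvmem not requested".

```python
# Observed rows from the quoted table: (vmem, pvmem, ulimit -v), sizes in MB.
# None means pvmem was not requested. The RPT column is not used here.
OBSERVED = [
    (1024, 256, 256),     # nodes=1:ppn=2
    (1024, 256, 256),     # procs=2
    (1024, 4096, 1024),   # nodes=1:ppn=2
    (1024, 4096, 1024),   # procs=2
    (1024, None, 1024),   # nodes=1:ppn=2
    (1024, None, 1024),   # procs=2
]


def ulimit_lower_of(vmem, pvmem):
    """Rule inferred from the table: the lower of vmem and pvmem."""
    return vmem if pvmem is None else min(vmem, pvmem)


def ulimit_pvmem_first(vmem, pvmem):
    """Rule stated in the reply: pvmem if it is set, otherwise vmem."""
    return vmem if pvmem is None else pvmem


def mismatches(rule):
    """Return the observed rows that the given rule fails to reproduce."""
    return [row for row in OBSERVED if rule(row[0], row[1]) != row[2]]


print(mismatches(ulimit_lower_of))     # []
print(mismatches(ulimit_pvmem_first))  # [(1024, 4096, 1024), (1024, 4096, 1024)]
```

Only the rows with pvmem larger than vmem separate the two rules. On the
values as quoted, those rows match the "lower of" rule.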
> When I submit an MPI job using, e.g., procs=N, why is requesting
> pvmem=X mostly pointless? Shouldn't it be totally equivalent to
> requesting vmem=X*N ?
I think we have had the discussion of what procs means on a number of
occasions (look for the thread "processes vs processors"). I believe "procs"
(now) means (virtual) processORs (most commonly, they are cores). They are not
processes. [In OpenPBS they were processes and only the UNICOS MOM supported
that limit. At least in torque-3.0.2 procs is still not properly documented
in pbs_resources* man pages.]
pvmem sets some sort of memory limit per *process* so vmem should have nothing
to do with procs and pvmem. pvmem and vmem are pretty much orthogonal. One is
a voluntary limit the user places on their job processes (useless for actual
resource scheduling) and the other is something any well-configured system
should require a user to specify so that the resources of the system can be
managed. In particular a job with only a pvmem limit can OOM any size node
simply by spawning enough processes.
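The arithmetic behind that last point is simple enough to sketch. The Python
below is illustrative only: the 64 GB node size and the function names are
made-up assumptions, and it takes the worst case in which every process
actually uses all the memory its per-process limit allows.

```python
def worst_case_total_mb(nprocs, pvmem_mb):
    """Total memory if each of nprocs processes uses its full per-process limit."""
    return nprocs * pvmem_mb


def processes_to_exceed(node_mem_mb, pvmem_mb):
    """Smallest process count whose combined per-process limits exceed the node."""
    return node_mem_mb // pvmem_mb + 1


# Hypothetical node with 64 GB of memory and a job requesting pvmem=1gb.
node_mem_mb = 64 * 1024
pvmem_mb = 1024

n = processes_to_exceed(node_mem_mb, pvmem_mb)
print(n)                                 # 65
print(worst_case_total_mb(n, pvmem_mb))  # 66560, which is more than 65536
```

Every one of those processes stays within its own limit, so a per-process
limit alone puts no bound on what the job as a whole can consume.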
Setting both independently (should a user choose to do so) seems perfectly
sensible. But I agree with Gareth that it only makes sense to request
vmem. Now what vmem actually is and how it should be evaluated and limited is
a whole other discussion ...