[torquedev] [Bug 93] Resource management semantics of Torque need to be well defined

bugzilla-daemon at supercluster.org bugzilla-daemon at supercluster.org
Thu Oct 28 14:40:54 MDT 2010


dbeer at adaptivecomputing.com changed:

           What    |Removed                     |Added
                 CC|                            |dbeer at adaptivecomputing.com

--- Comment #5 from dbeer at adaptivecomputing.com 2010-10-28 14:40:54 MDT ---
(In reply to comment #4)
> (In reply to comment #3)
> > > 
> > > > PPN = processors per node (according to manual page), really virtual processors
> > > > as you can overcommit if you are not using cpusets.  I've seen plenty of
> > > > commercial software out there that uses them, so I don't think it can go away. 
> > > > The pvmem limits which you mention are vital to us.
> > > 
> > > Well, that's the problem: the manual page says processors per node, but that's
> > > not how Torque works (this is exactly the reason why I created this bug). They
> > > are processes per node. I'm not saying to get rid of ppn, but to get rid of the
> > > processes semantics, therefore ppn will be actually processors not processes.
> > > pvmem can actually stay, although I think pmem and pvmem can be easily
> > > superseded by mem and vmem.
> > 
> > I understand the frustration with ppn not really meaning processors per node.
> > However, the current behavior of ppn is widely used and expected. We need to
> > live with this. Changing this behavior will break too many people.
> In what way are they using it as processes?  Are they requesting the MOM call
> setrlimit(RLIMIT_NPROC)?  Are they killing jobs if jobs are detected as having
> more than that many processes running on a node?  None of these make any sense
> whatsoever (unless some large forkbomb limit is applied - but that should be a
> system limit, not a user resource request).  
> Is the ppn value being used to impose pvmem or pmem limits somehow? I don't see
> that in the Torque code.  By external schedulers?  How?
> I suspect "processes per node" only really appears in flawed and misleading
> documentation, not in real code.

ppn is often explained as processes per node, although you are right that
nothing actually limits the number of processes that can be run. It may
originally have been intended to mean processors per node, but almost all
processors intended for computing now have multiple cores, which makes
"processors per node" ambiguous and therefore not very useful.

However, it is in the code in a few ways:

ppn is the number of times that a node's name will appear in $PBS_NODEFILE.
The program's MPI scripts are intended to read this file and start that many
processes on the node. Nothing in TORQUE stops the scripts from spawning more
processes, though.
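
As a rough illustration of the point above, here is a minimal Python sketch of
what a launch script might do with the node file. It assumes the file lists one
hostname per line, with a host repeated once per ppn slot. The function name
slots_per_host is hypothetical and not part of TORQUE.

```python
import os
from collections import Counter


def slots_per_host(nodefile_path):
    """Count how many times each hostname appears in a node file.

    Assumes one hostname per line; blank lines are ignored.
    """
    with open(nodefile_path) as nodefile:
        hosts = [line.strip() for line in nodefile if line.strip()]
    return Counter(hosts)


if __name__ == "__main__":
    # Inside a job, $PBS_NODEFILE holds the path to the node file.
    path = os.environ.get("PBS_NODEFILE")
    if path:
        for host, count in slots_per_host(path).items():
            print(f"{host}: {count} slot(s)")
```

The count is only a hint to the launcher: a script is free to ignore it and
start more processes than the number of entries for a host.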

ppn is left completely configurable per node, so the notion that it is tied
to the actual hardware is false. In production systems, ppn often becomes cores
per node, because that is how many the system admin wants for optimal use.

The fact of the matter is that ppn has never been clearly defined, and what it
has become in practice is probably best described as processes per node. At any
rate, changing this behavior would be disruptive for *very* many TORQUE users.

