[torquedev] nodes, procs, tpn and ncpus

Martin Siegert siegert at sfu.ca
Thu Jun 10 18:32:11 MDT 2010


On Thu, Jun 10, 2010 at 01:36:56PM -0600, Ken Nielson wrote:
> On 06/10/2010 12:27 PM, Martin Siegert wrote:
> >
> > That is not a solution. If we do not set EXACTNODE, then users who need
> > nodes=N:ppn=1 (in its very meaning, namely exactly one processor per
> > node) cannot be satisfied. And if we do set EXACTNODE, there is no way
> > (other than procs) to request N processors anywhere. This is the reason
> > why procs was introduced in the first place: so that we can set EXACTNODE
> > and satisfy both types of requests.
> >
> > Cheers,
> > Martin
> >
> >
> You may have seen in this discussion where Simon Toth and Glen Beane 
> were indicating that nodes=x:ppn=y allocates y processors on x separate 
> nodes and I was saying that it only allocates y processors on a single 
> node.
> 
> It turns out we were both right. It depends on what you have in your 
> serverdb configuration. I have the server parameter 
> resources_available.nodect set and Simon and Glen did not. Simon and 
> Glen were running TORQUE's default behavior and TORQUE by default 
> allocates nodes the same as if EXACTNODE were set in Moab.
> 
> Moab muddies the waters by giving users the option to treat processors 
> like nodes (vnodes in the case of PBS Pro). This is certainly one source 
> of the confusion that exists on the meaning of different resources. 
> While Moab is consistent in how it interprets the procs resource it has 
> ambiguity with the nodes resource. If the JOBNODEMATCHPOLICY is not set 
> (default) Moab treats processors as nodes. So -l nodes=x, where x is 
> greater than the number of physical nodes, will be treated like 
> -l procs=x provided TORQUE has set the resources_available.nodect 
> parameter. By "set" I mean that nodect is greater than the number of 
> physical nodes.
> 
> After all this I just want to confirm what Martin has just written: 
> procs exists so users can allocate a job with as many processors as 
> needed, independent of the number of available nodes. We now just need 
> TORQUE to recognize procs as well.
> 
> Ken Nielson
> Adaptive Computing

Just a comment: nodect used to be a parameter that was absolutely
essential in the pre-procs days when we did not set EXACTNODE.
In that configuration a nodes file with, e.g.,

n1 np=4
n2 np=4
...
n200 np=4

would only allow you to run a job with a maximum of 200 processors
(using a -l nodes=N request). You needed to set nodect=800 to allow jobs
with -l nodes=400 or so. I always regarded nodect as an ugly workaround.
If it turns out that unsetting nodect (or eliminating nodect) plus
introducing procs basically implements the EXACTNODE + procs policies
in torque, then I believe that that is an excellent solution.
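
To make the arithmetic above concrete, here is a rough sketch in Python.
It is only a toy model of the behaviour described in this message, not
TORQUE or Moab code: the function names are made up for illustration,
and the rule that an entry without np= counts as one processor is an
assumption of the sketch.

```python
# Toy model of the pre-procs limit on "-l nodes=N" (EXACTNODE not set).
# Not TORQUE or Moab code; all names below are hypothetical.

def parse_nodes_file(text):
    """Return {hostname: processor count} from lines like 'n1 np=4'."""
    nodes = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        count = 1  # assumption: one processor when np= is absent
        for field in parts[1:]:
            if field.startswith("np="):
                count = int(field[3:])
        nodes[parts[0]] = count
    return nodes


def max_nodes_request(nodes, nodect=None):
    """Largest N accepted for '-l nodes=N' in this simplified model.

    Without nodect the limit is the number of entries in the nodes file;
    with resources_available.nodect set, that value is the limit instead.
    """
    return len(nodes) if nodect is None else nodect


# The example from this message: n1 ... n200, each with np=4.
nodes_file = "\n".join(f"n{i} np=4" for i in range(1, 201))
nodes = parse_nodes_file(nodes_file)
total_procs = sum(nodes.values())

print(len(nodes), total_procs)                       # 200 800
print(max_nodes_request(nodes))                      # 200
print(max_nodes_request(nodes, nodect=total_procs))  # 800
```

In this model a request for -l nodes=400 is rejected until nodect is
raised to at least 400; setting it to 800 exposes every processor.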

Cheers,
Martin

