[torquedev] nodes, procs, tpn and ncpus

Ken Nielson knielson at adaptivecomputing.com
Fri Jun 11 07:24:55 MDT 2010



----- Original Message -----
From: "Martin Siegert" <siegert at sfu.ca>
To: "Torque Developers mailing list" <torquedev at supercluster.org>
Sent: Thursday, June 10, 2010 6:32:11 PM
Subject: Re: [torquedev] nodes, procs, tpn and ncpus

On Thu, Jun 10, 2010 at 01:36:56PM -0600, Ken Nielson wrote:
> On 06/10/2010 12:27 PM, Martin Siegert wrote:
> >
> > That is not a solution. If we do not set EXACTNODE, then users who need
> > nodes=N:ppn=1 (in its literal meaning, namely exactly one processor per
> > node) cannot be satisfied. And if we do set EXACTNODE, there is no
> > way (other than procs) to request N processors anywhere. This is the
> > reason why procs was introduced in the first place: so that we can
> > set EXACTNODE and satisfy both types of requests.
> >
> > Cheers,
> > Martin
> >
> >
> You may have seen in this discussion where Simon Toth and Glen Beane
> were indicating that nodes=x:ppn=y allocates y processors on x
> separate nodes and I was saying that it only allocates y processors on
> a single node.
>
> It turns out we were both right. It depends on what you have in your
> serverdb configuration. I have the server parameter
> resources_available.nodect set and Simon and Glen did not. Simon and
> Glen were running TORQUE's default behavior, and TORQUE by default
> allocates nodes the same as if EXACTNODE were set in Moab.
>
> Moab muddies the waters by giving users the option to treat processors
> like nodes (vnodes in the case of PBS Pro). This is certainly one
> source of the confusion that exists about the meaning of the different
> resources. While Moab is consistent in how it interprets the procs
> resource, it is ambiguous about the nodes resource. If
> JOBNODEMATCHPOLICY is not set (the default), Moab treats processors as
> nodes. So -l nodes=x, where x is greater than the number of physical
> nodes, will be treated like -l procs=x provided TORQUE has set the
> resources_available.nodect parameter. By "set" I mean that nodect is
> greater than the number of physical nodes.
>
> After all this I just want to confirm what Martin has just written,
> that is, procs exists so users can allocate a job with as many
> processors as needed, independent of the number of available nodes.
> We now just need TORQUE to recognize procs as well.
>
> Ken Nielson
> Adaptive Computing

just a comment: nodect used to be a parameter that was absolutely
essential in the pre-procs days when we did not set EXACTNODE:
in that configuration a nodes file with, e.g.,

n1 np=4
n2 np=4
...
n200 np=4

would only allow you to run a job with a maximum of 200 processors
(using a -l nodes=N request). You needed to set nodect=800 to allow jobs
with -l nodes=400 or so. I always regarded nodect as an ugly workaround.
If it turns out that unsetting nodect (or eliminating nodect) plus
introducing procs basically implements the EXACTNODE + procs policies
in torque, then I believe that that is an excellent solution.
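
The arithmetic behind the nodes file example above can be sketched as
follows. This is only a simplified model of the behavior described in
this thread, not TORQUE code; the function name max_nodes_request and
its parameters are made up for illustration.

```python
# Simplified model of the pre-procs limit described above (not TORQUE code).
# The function name and parameters are hypothetical.

def max_nodes_request(node_np, nodect=None):
    """Largest N accepted for '-l nodes=N' when EXACTNODE is not set.

    node_np: list of np values, one per entry in the nodes file.
    nodect:  value of resources_available.nodect, or None if unset.
    """
    if nodect is None:
        # Without nodect, the request is capped by the number of node entries.
        return len(node_np)
    # With nodect set, it replaces the node count as the cap. The job still
    # needs enough free processors to actually run.
    return nodect


nodes_file = [4] * 200  # n1 np=4 ... n200 np=4
print(max_nodes_request(nodes_file))              # 200
print(max_nodes_request(nodes_file, nodect=800))  # 800
```

With nodect=800 a request such as -l nodes=400 passes the cap, which
matches the workaround described above.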

Martin,

Thank you very much for revealing that piece of history. 

If we could eliminate the dual meaning of nodes (one is a host and the other is a processor) by using procs, life would be good. PBS Pro eliminated nodect in their implementation, probably for the same reason. It would affect Moab, but Moab already uses procs, so users could be re-educated.
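
To make the two meanings concrete, here is a toy placement model. It is a sketch under assumptions, not Moab or TORQUE code: the functions place_exact_nodes and place_procs and the free-processor table are hypothetical, and real schedulers apply many more policies.

```python
# Toy placement model (not Moab or TORQUE code); all names are hypothetical.

def place_exact_nodes(free, n, ppn):
    """nodes=n:ppn=ppn with 'node' meaning host: ppn processors on each of
    n distinct hosts. Returns a host -> processor count map, or None."""
    hosts = [h for h, avail in free.items() if avail >= ppn][:n]
    if len(hosts) < n:
        return None  # not enough distinct hosts with ppn free processors
    return {h: ppn for h in hosts}


def place_procs(free, n):
    """procs=n: n processors anywhere, regardless of which hosts they are on."""
    placement = {}
    remaining = n
    for host, avail in free.items():
        if remaining == 0:
            break
        take = min(avail, remaining)
        if take > 0:
            placement[host] = take
            remaining -= take
    return placement if remaining == 0 else None


free = {"n1": 4, "n2": 4}             # free processors per host
print(place_exact_nodes(free, 2, 1))  # {'n1': 1, 'n2': 1}
print(place_procs(free, 6))           # {'n1': 4, 'n2': 2}
print(place_exact_nodes(free, 6, 1))  # None: only two hosts exist
```

In this model a request for six of something only succeeds when it is expressed as procs, which is the separation of meanings discussed above.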

Ken

