Bug 78 - fix routing of jobs based on nodes spec
Status: RESOLVED WONTFIX
Product: TORQUE
Component: pbs_server
Version: 2.5.x
Platform: PC Linux
Importance: P5 major
Assigned To: Glen

Reported: 2010-08-20 18:12 MDT by Martin Siegert
Modified: 2010-08-31 16:23 MDT
CC: 3 users

Attachments
patch for comparing nodes specifications (3.84 KB, patch)
2010-08-20 18:12 MDT, Martin Siegert




Description Martin Siegert 2010-08-20 18:12:23 MDT
Created an attachment (id=52): patch for comparing nodes specifications

As explained in
http://www.supercluster.org/pipermail/torquedev/2010-August/002724.html
routing of jobs based on a nodes specification currently does not work.
The attached patch changes this behaviour such that routing for a nodes
specification
-l nodes=x1:ppn=y1+x2:ppn=y2+...
is based on the total processor count x1*y1+x2*y2+...
Comment 1 Simon Toth 2010-08-20 23:19:15 MDT
This should probably be harmonized with
http://www.clusterresources.com/bugzilla/show_bug.cgi?id=67 so that it works
with any resource (that patch already counts all resources in both the
nodespec and the resource requests).
Comment 2 Martin Siegert 2010-08-23 18:57:34 MDT
I am fully aware that this patch is incomplete as far as future plans are
concerned, e.g., bug 67 or combined requests like nodes=...+procs=...
Thus, the patch repairs only the routing code in the current
version (and probably almost all previous versions), which is clearly
broken with respect to nodes specifications. As such it is needed - we are
running into problems with "misrouted" jobs right now.

With respect to future plans: I would like to see a "standard" for the
grammar first. E.g., the problem I ran into when writing the patch is
that a nodes specification like

-l nodes=4:ppn=8+n15+procs=15

is actually quite ugly: currently the code uses isdigit to decide whether
the entry after the + sign is a node name (like n15) or specifies a number
of nodes (like 4).
This logic obviously fails when we add something like "+procs=15" to the
mix. Currently "procs" would actually be interpreted as a node name. If we
do want to allow this we need to parse the nodes specification for procs first
and then split the string appropriately. However, the string above is supposed
to be equivalent to

-l nodes=4:ppn=8+n15
-l procs=15

which actually works with the current code. Should this be the only way of
specifying a combined nodes+procs resource?

With respect to bug 67: despite the explanations given I am still not clear
what the proposed syntax actually means. E.g.,
-l nodes=1:ppn=2:ncpus=3
Currently, nodes=1:ppn=2 means 1 node, 2 processors per node; ncpus=3 is
basically equivalent to nodes=1:ppn=3 (even though this is not handled by
torque, schedulers treat it that way and users are used to it). Thus, to me
bug 67 changes the meaning of these resources requests. Is that what we want?
And if yes, what does nodes=1:ppn=2:ncpus=3 mean? How does the procs resource
get incorporated into this scheme?

Frankly, I mostly care about nodes=x:ppn=y and procs=z and any combination
of those when it comes to requesting processors. I need that to work. I am
open to other changes, but currently I do not understand the meaning of the
type of requests introduced in bug 67 and therefore have a hard time dealing
with them.

- Martin
Comment 3 Simon Toth 2010-08-23 23:59:13 MDT
> I am fully aware that this patch is incomplete as far as future plans are
> concerned, e.g., bug 67 or combined requests like nodes=...+procs=...
> Thus, the patch repairs only the routing code in the current
> version (and probably almost all previous versions), which is clearly
> broken with respect to nodes specifications. As such it is needed - we are
> running into problems with "misrouted" jobs right now.

My comment wasn't about the "incompleteness" of this patch, but rather about
the fact that the bug 67 patch already calculates all resources (into the
ResourcesTotal job attribute), which can then be checked when routing. That is
much better than hacking existing functions to "sort of" support the nodespec,
IMHO.

> With respect to future plans: I would like to see a "standard" for the
> grammar first. E.g., the problem I ran into when writing the patch is
> that a nodes specification like
> 
> -l nodes=4:ppn=8+n15+procs=15
> 
> is actually quite ugly: currently the code uses isdigit to decide whether
> the entry after the + sign is a node name (like n15) or specifies a number
> of nodes (like 4).
> This logic obviously fails when we add something like "+procs=15" to the
> mix. Currently "procs" would actually be interpreted as a node name. If we
> do want to allow this we need to parse the nodes specification for procs first
> and then split the string appropriately. However, the string above is supposed
> to be equivalent to
> 
> -l nodes=4:ppn=8+n15
> -l procs=15
> 
> which actually works with the current code. Should this be the only way of
> specifying a combined nodes+procs resource?

Well, yes, this is one of the issues that bug 67 solves.


> With respect to bug 67: despite the explanations given I am still not clear
> what the proposed syntax actually means. E.g.,
> -l nodes=1:ppn=2:ncpus=3

That is a decision to be made. It is set using resource flags in
resc_def_all. The current meaning is 1 node, 2 processes, 3 cpus per process.
You can easily set it to mean 1 node, 2 processes, 3 cpus total.

There are two approaches I have seen when it comes to ppn vs. ncpus. One is to
keep the logic of processes (which is odd and hard to grasp); the second is to
make ppn and ncpus synonymous (which is a bit hard to enforce).

> Currently, nodes=1:ppn=2 means 1 node, 2 processors per node; ncpus=3 is
> basically equivalent to nodes=1:ppn=3 (even though this is not handled by
> torque, schedulers treat it that way and users are used to it). Thus, to me
> bug 67 changes the meaning of these resources requests. Is that what we want?
> And if yes, what does nodes=1:ppn=2:ncpus=3 mean? How does the procs resource
> get incorporated into this scheme?

TORQUE's features are a bit behind in this area. Simply not using ppn and
ncpus together would be the good approach right now: either set ppn=1 on each
node and set the cpu count using ncpus, or set ppn to the number of cpus and
don't set ncpus on the node.

> Frankly, I mostly care about nodes=x:ppn=y and procs=z and any combination
> of those when it comes to requesting processors. I need that to work. I am
> open to other changes, but currently I do not understand the meaning of the
> type of requests introduced in bug 67 and therefore have a hard time dealing
> with them.

I just realized that nodect and procs are actually not propagated into the
resources total attribute (but are still counted). I will look into this.

The correct result for -l nodes=x:ppn=y+procs=z would be resources total:
nodect=x, procs=x*y+z
Comment 4 Ken Nielson 2010-08-31 16:23:16 MDT
Martin,

We are not going to accept this patch for a couple of reasons.

The first reason is that we are planning to address the syntax issue soon.
Instead of adding an additional syntax that will likely be overridden shortly,
we will be better off addressing the global issue.

The second reason is that allowing a nodes=x:ppn=y on the resources_max.nodes
and resources_min.nodes creates an ambiguity. The solution you have proposed
changes the meaning of nodes to processors. While schedulers like Moab will
allow nodes and processors to mean the same thing, Moab handles that
interpretation itself. TORQUE still looks at everything as nodes first, with
processors as a resource on a node.

It might be useful to have a way to set queue limits with the nodes=x:ppn=y
concept. But we will address that later.

Thanks for the patch. The help from the community is vital to the success of
the TORQUE project.