[torquedev] nodes+procs support in 2.5.0

Martin Siegert siegert at sfu.ca
Wed Jul 21 16:17:28 MDT 2010


On Wed, Jul 21, 2010 at 03:22:02PM -0600, Ken Nielson wrote:
> On 07/21/2010 02:34 PM, Martin Siegert wrote:
> > Hi,
> >
> > I am testing resource requests of the form
> >
> > 1) #PBS -l nodes=1:ppn=2+procs=8
> > and
> > 2) #PBS -l nodes=1:ppn=2
> >     #PBS -l procs=8
> >
> > I have verified that jobs submitted with either -l nodes=1:ppn=2 or
> > -l procs=10 run correctly with 2.5.0.
> > However, neither 1) nor 2) works:
> >
> > 1) # qsub is.pbs
> > qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
> >
> > The problem appears to be in the proplist routine (node_manager.c)
> > when it is called with
> >
> > (gdb) p str
> > $2 = 0xe643868 "procs=8"
> >
> > from line 4516 in node_manager.c. But proplist only handles "ppn", see
> > lines 2853ff:
> >
> >        if (strcmp(pname, "ppn") == 0)
> >          {
> >          pequal++;
> >
> >          if ((number(&pequal, node_req) != 0) || (*pequal != '\0'))
> >            {
> >            return(1);
> >            }
> >          }
> >        else
> >          {
> >          return(1); /* not recognized - error */
> >          }
> >    
> If a procs keyword is detected, it is not handled in this part of 
> the code. If you look further down, you will see that procs is handled 
> in the function set_nodes.

But I never get there as far as I can tell:
proplist basically does

if (strcmp(pname, "ppn") == 0) {
   ...
} else {
   return(1); /* not recognized - error */
}

Thus, whenever pname is anything other than "ppn" (for example "procs"),
the routine returns an error, which causes qsub to exit.
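
To make the failure mode concrete, here is a minimal, self-contained sketch
of that logic. It is not the TORQUE source: check_property and check_spec
are made-up names, and the splitting on '+' and ':' is my simplified
assumption of how the nodes spec is broken up before proplist sees it.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the quoted branch (NOT the TORQUE code):
 * validate one "name=value" property. Returns 0 if accepted, 1 on error. */
static int check_property(const char *prop)
{
  const char *pequal = strchr(prop, '=');
  size_t      name_len;

  if (pequal == NULL)
    return 0;                /* bare property name: accepted in this sketch */

  name_len = (size_t)(pequal - prop);

  if (name_len == 3 && strncmp(prop, "ppn", 3) == 0)
    {
    char *end;

    pequal++;

    if (!isdigit((unsigned char)*pequal))
      return 1;              /* "ppn=" must be followed by a number */

    (void)strtol(pequal, &end, 10);

    return (*end != '\0');   /* trailing junk after the number is an error */
    }

  return 1;                  /* not recognized - error; "procs=8" ends here */
}

/* Hypothetical helper: split a nodes spec on '+' and ':' (an assumption),
 * skip all-digit counts, and check every remaining token.
 * Returns 0 if the whole spec is accepted, 1 otherwise. */
static int check_spec(const char *spec)
{
  char  buf[256];
  char *p = buf;

  if (strlen(spec) >= sizeof(buf))
    return 1;                /* too long for this sketch */

  strcpy(buf, spec);

  for (;;)
    {
    size_t len = strcspn(p, "+:");
    char   sep = p[len];

    p[len] = '\0';

    if (len == 0)
      return 1;              /* empty token */

    /* a token that is not a plain count must pass the property check */
    if (p[strspn(p, "0123456789")] != '\0' && check_property(p) != 0)
      return 1;

    if (sep == '\0')
      break;

    p += len + 1;
    }

  return 0;
}
```

With this sketch, check_spec("1:ppn=2") is accepted, while
check_spec("1:ppn=2+procs=8") fails as soon as the "procs=8" token reaches
the else branch, which matches what I see in gdb.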

> > Thus, proplist exits with "return(1)" from line 2864, which then causes
> > qsub to abort with the error listed above.
> >
> > 2) This case "works" differently as the error does not come from torque
> > itself, but the job is handed over to moab which then rejects the job:
> >
> > Message[0] job cancelled - MOAB_INFO:  job was rejected - job has invalid task layout
> >
> > I guess this is not a torque problem; this just indicates that moab does not
> > support a combined request for nodes and procs (yet). It is disappointing
> > nevertheless.
> >
> > - Martin
> >
> >    
> This is disappointing. Moab should override TORQUE.
> 
> Ken

I believe that in moab, procs used to override nodes. But now even that
no longer happens.

- Martin

