[torquedev] nodes+procs support in 2.5.0

Martin Siegert siegert at sfu.ca
Wed Jul 21 14:34:59 MDT 2010


Hi,

I am testing resource requests of the form

1) #PBS -l nodes=1:ppn=2+procs=8
and
2) #PBS -l nodes=1:ppn=2
   #PBS -l procs=8

I have verified that jobs submitted with either -l nodes=1:ppn=2 or
-l procs=10 run correctly with 2.5.0.
However, neither 1) nor 2) works:

1) # qsub is.pbs
qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes

The problem appears to be in the proplist routine (node_manager.c)
when it is called with

(gdb) p str
$2 = 0xe643868 "procs=8"

from line 4516 in node_manager.c. But proplist only handles "ppn"; see
lines 2853ff:

      if (strcmp(pname, "ppn") == 0)
        {
        pequal++;

        if ((number(&pequal, node_req) != 0) || (*pequal != '\0'))
          {
          return(1);
          }
        }
      else
        {
        return(1); /* not recognized - error */
        }

Thus, proplist exits with "return(1)" from line 2864, which then causes
qsub to abort with the error listed above.
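
To make the failure mode concrete, here is a minimal standalone sketch of the same check. It is not the torque code: parse_prop and parse_number are hypothetical names, parse_number only stands in for torque's number() routine, and the sketch models "key=value" properties only. It just shows that any key other than "ppn" falls through to the error return.

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for torque's number(): parse a positive decimal
 * integer at *pp and advance *pp past it.
 * Returns 0 on success, 1 on failure. */
int parse_number(const char **pp, int *out)
{
  const char *p = *pp;
  char *end;
  long v;

  if (!isdigit((unsigned char)*p))
    return 1;

  v = strtol(p, &end, 10);
  if (v <= 0 || v > INT_MAX)
    return 1;

  *out = (int)v;
  *pp = end;
  return 0;
}

/* Simplified model of the check quoted above (assumption: only
 * "key=value" properties are modelled here).
 * Returns 0 if prop is "ppn=<n>", 1 for anything else. */
int parse_prop(const char *prop, int *ppn)
{
  const char *pequal = strchr(prop, '=');

  if (pequal == NULL)
    return 1; /* plain property names are outside this sketch */

  if ((pequal - prop) == 3 && strncmp(prop, "ppn", 3) == 0)
    {
    pequal++;

    if ((parse_number(&pequal, ppn) != 0) || (*pequal != '\0'))
      return 1;

    return 0;
    }

  return 1; /* not recognized - error; "procs=8" ends up here */
}
```

With this sketch, parse_prop("ppn=2", &n) returns 0 and sets n to 2, while parse_prop("procs=8", &n) returns 1, which mirrors the return(1) seen in gdb.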

2) This case fails differently: the error does not come from torque
itself. Instead, the job is handed over to moab, which then rejects it:

Message[0] job cancelled - MOAB_INFO:  job was rejected - job has invalid task layout

I guess this is not a torque problem; it just indicates that moab does not
(yet) support a combined request for nodes and procs. It is disappointing
nevertheless.

- Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid/ComputeCanada Site Lead
IT Services                                phone: 778 782-4691
Simon Fraser University                    fax:   778 782-4242
Burnaby, British Columbia                  email: siegert at sfu.ca
Canada  V5A 1S6

