[torquedev] nodes+procs support in 2.5.0
knielson at adaptivecomputing.com
Wed Jul 21 15:22:02 MDT 2010
On 07/21/2010 02:34 PM, Martin Siegert wrote:
> I am testing resource requests of the form
> 1) #PBS -l nodes=1:ppn=2+procs=8
> 2) #PBS -l nodes=1:ppn=2
> #PBS -l procs=8
> I have verified that jobs submitted with either -l nodes=1:ppn=2 or
> -l procs=10 run correctly with 2.5.0.
> However, neither 1) nor 2) works:
> 1) # qsub is.pbs
> qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
> The problem appears to be in the proplist routine (node_manager.c)
> when it is called with
> (gdb) p str
> $2 = 0xe643868 "procs=8"
> from line 4516 in node_manager.c. But proplist only handles "ppn", see
> lines 2853ff
> if (strcmp(pname, "ppn") == 0)
> if ((number(&pequal, node_req) != 0) || (*pequal != '\0'))
> return(1); /* not recognized - error */
A procs keyword is not handled in this part of the code. If you look
further on, you will see that procs is taken care of in the function
set_nodes.
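
For illustration only, here is a minimal sketch of a key=value property
parser that accepts both keywords instead of returning 1 for anything
other than ppn. This is not the TORQUE source and not how 2.5.0 handles
procs; parse_prop and struct req_counts are hypothetical names made up
for this example, and the validation rules are assumptions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the counts parsed from a request. */
struct req_counts {
    int ppn;    /* processors per node, from "ppn=N" */
    int procs;  /* processors placed anywhere, from "procs=N" */
};

/* Parse one "key=value" property such as "ppn=2" or "procs=8".
 * Returns 0 on success, 1 if the key or value is not recognized. */
static int parse_prop(const char *str, struct req_counts *out)
{
    const char *eq = strchr(str, '=');
    char *end;
    long val;
    size_t keylen;

    if (eq == NULL || eq[1] == '\0')
        return 1;   /* no '=' or empty value */

    val = strtol(eq + 1, &end, 10);
    if (*end != '\0' || val <= 0)
        return 1;   /* trailing garbage or non-positive count */

    keylen = (size_t)(eq - str);
    if (keylen == 3 && strncmp(str, "ppn", 3) == 0)
        out->ppn = (int)val;
    else if (keylen == 5 && strncmp(str, "procs", 5) == 0)
        out->procs = (int)val;
    else
        return 1;   /* not recognized - error */

    return 0;
}
```

With this sketch, "procs=8" would be stored rather than rejected, while
an unknown key such as "foo=1" would still return 1.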
> Thus, proplist exits with "return(1)" from line 2864, which then causes
> qsub to abort with the error listed above.
> 2) This case "works" differently as the error does not come from torque
> itself, but the job is handed over to moab which then rejects the job:
> Message job cancelled - MOAB_INFO: job was rejected - job has invalid task layout
> I guess this is not a torque problem; this just indicates that moab does not
> support a combined request for nodes and procs (yet). It is disappointing
> - Martin
This is disappointing. Moab should override TORQUE.