[torquedev] nodes+procs support in 2.5.0
Martin Siegert
siegert at sfu.ca
Wed Jul 21 14:34:59 MDT 2010
Hi,
I am testing resource requests of the form
1) #PBS -l nodes=1:ppn=2+procs=8
and
2) #PBS -l nodes=1:ppn=2
#PBS -l procs=8
I have verified that jobs submitted with either -l nodes=1:ppn=2 or
-l procs=10 run correctly with 2.5.0.
However, neither 1) nor 2) works:
1) # qsub is.pbs
qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
The problem appears to be in the proplist routine (node_manager.c)
when it is called with
(gdb) p str
$2 = 0xe643868 "procs=8"
from line 4516 in node_manager.c. But proplist only handles "ppn", see
lines 2853ff
    if (strcmp(pname, "ppn") == 0)
      {
      pequal++;
      if ((number(&pequal, node_req) != 0) || (*pequal != '\0'))
        {
        return(1);
        }
      }
    else
      {
      return(1); /* not recognized - error */
      }
Thus, proplist exits with "return(1)" from line 2864, which then causes
qsub to abort with the error listed above.
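To make the failure mode concrete, here is a minimal standalone sketch of the same logic. It is not the torque code: parse_property and parse_number are hypothetical names, parse_number is only my guess at what number() does based on how it is called above, and the 64-byte buffer is an arbitrary choice for the example. It only illustrates that a "name=value" parser which recognizes nothing but "ppn" has to reject "procs=8".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for number() in node_manager.c (an assumption,
 * not the real routine): parse a decimal integer at *ptr, advance *ptr
 * past the digits, and store the value in *num.
 * Returns 0 on success, 1 if no digit is found. */
static int parse_number(char **ptr, int *num)
{
    char *p = *ptr;
    int value = 0;

    if (!isdigit((unsigned char)*p))
        return 1;

    while (isdigit((unsigned char)*p))
        value = value * 10 + (*p++ - '0');

    *ptr = p;
    *num = value;
    return 0;
}

/* Simplified model of the quoted branch: accepts only "ppn=<n>".
 * Returns 0 on success, 1 on any error (same convention as above). */
static int parse_property(const char *spec, int *node_req)
{
    char buf[64];   /* arbitrary size, for illustration only */
    char *pequal;

    assert(spec != NULL && node_req != NULL);

    if (strlen(spec) >= sizeof(buf))
        return 1;
    strcpy(buf, spec);

    pequal = strchr(buf, '=');
    if (pequal == NULL)
        return 1;               /* no "=value" part at all */
    *pequal = '\0';             /* buf now holds just the name */

    if (strcmp(buf, "ppn") == 0) {
        pequal++;
        if (parse_number(&pequal, node_req) != 0 || *pequal != '\0')
            return 1;           /* missing number or trailing garbage */
        return 0;
    }

    return 1;                   /* not recognized - error ("procs" ends up here) */
}
```

With this sketch, parse_property("ppn=2", &n) returns 0 and sets n to 2, while parse_property("procs=8", &n) returns 1, which mirrors the behaviour I see in gdb.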
2) This case "works" differently as the error does not come from torque
itself, but the job is handed over to moab which then rejects the job:
Message[0] job cancelled - MOAB_INFO: job was rejected - job has invalid task layout
I guess this is not a torque problem; this just indicates that moab does not
support a combined request for nodes and procs (yet). It is disappointing
nevertheless.
- Martin
--
Martin Siegert
Head, Research Computing
WestGrid/ComputeCanada Site Lead
IT Services phone: 778 782-4691
Simon Fraser University fax: 778 782-4242
Burnaby, British Columbia email: siegert at sfu.ca
Canada V5A 1S6