[torqueusers] Problem with ppn and routing : Possible way to get the routing you want continued.
Glen Beane
glen.beane at gmail.com
Fri Dec 3 05:47:17 MST 2010
On Thu, Dec 2, 2010 at 4:33 PM, Ken Nielson
<knielson at adaptivecomputing.com> wrote:
> The TORQUE "resource manager" knows nothing of ncpus. When a job is submitted and the ncpus keyword is used, the string is passed through to the scheduler. In the case of PBS, this would be pbs_sched. If you run TORQUE without a scheduler and call qrun for a job, you will get a single node and a single processor to run the job.
Yup, you're right. I just did a quick test. I have 32-core nodes, but
if I disable Moab, request ncpus > 32, and run the job with qrun,
TORQUE will happily run it. So all the logic for ncpus is in the
scheduler, whereas if I queue a job with nodes=1:ppn=33, TORQUE knows
that no node can meet this resource requirement.
[gbeane@rockhopper ~]$ mschedctl -p
scheduling will be disabled, cluster information will continue to be updated
[gbeane@rockhopper ~]$ echo "hostname" | qsub -l ncpus=33,walltime=00:01:00
1158.scyld.localdomain
[gbeane@rockhopper ~]$ qrun 1158
[gbeane@rockhopper ~]$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
1158.scyld                STDIN            gbeane          00:00:00 C batch
[gbeane@rockhopper ~]$ echo "hostname" | qsub -l nodes=1:ppn=33,walltime=00:01:00
qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
[gbeane@rockhopper ~]$
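
To make the difference concrete, here is a rough shell sketch of the behaviour seen in the session above. It is only a toy model under my own assumptions, not TORQUE code: the function name check_request and its two arguments (a -l resource list and the cores per node) are made up for illustration. It compares a ppn value with the node size and ignores ncpus, which is left for the scheduler to interpret.

```shell
#!/usr/bin/env bash
# Hypothetical model only; this is NOT how TORQUE is implemented.
# Usage: check_request RESOURCE_LIST CORES_PER_NODE
check_request() {
    local request=$1 cores_per_node=$2
    local item ppn
    local IFS=','                      # split the resource list on commas
    for item in $request; do
        case $item in
            nodes=*ppn=*)
                ppn=${item##*ppn=}     # text after "ppn="
                ppn=${ppn%%[!0-9]*}    # keep the leading digits only
                if [ -n "$ppn" ] && [ "$ppn" -gt "$cores_per_node" ]; then
                    echo "rejected: cannot locate feasible nodes"
                    return 1
                fi
                ;;
            ncpus=*)
                :                      # not checked here; passed through to the scheduler
                ;;
        esac
    done
    echo "accepted"
}

# Same two requests as in the session, against 32-core nodes.
check_request "ncpus=33,walltime=00:01:00" 32
check_request "nodes=1:ppn=33,walltime=00:01:00" 32 || true
```

In this model the ncpus=33 request is accepted without any check, while nodes=1:ppn=33 is refused at submission time, which matches what qsub did.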