[torqueusers] Problem with ppn and routing : Possible way to get the routing you want continued.

Glen Beane glen.beane at gmail.com
Fri Dec 3 05:47:17 MST 2010


On Thu, Dec 2, 2010 at 4:33 PM, Ken Nielson
<knielson at adaptivecomputing.com> wrote:
> The TORQUE resource manager knows nothing of ncpus. When a job is submitted with the ncpus keyword, the string is passed through to the scheduler. In the case of PBS this would be pbs_sched. If you run TORQUE without a scheduler and call qrun for a job, you will get a single node and a single processor to run the job.

Yup, you're right. I just did a quick test. I have 32-core nodes, but
if I disable Moab scheduling, request ncpus > 32, and run the job with
qrun, TORQUE will happily run it. So all the logic for ncpus is in the
scheduler, whereas if I queue a job with nodes=1:ppn=33, TORQUE itself
knows that no node can meet this resource requirement and rejects the
job at submission.



[gbeane at rockhopper ~]$ mschedctl -p

scheduling will be disabled, cluster information will continue to be updated

[gbeane at rockhopper ~]$ echo "hostname" | qsub -l ncpus=33,walltime=00:01:00
1158.scyld.localdomain
[gbeane at rockhopper ~]$ qrun 1158
[gbeane at rockhopper ~]$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
1158.scyld                STDIN            gbeane          00:00:00 C batch

[gbeane at rockhopper ~]$ echo "hostname" | qsub -l nodes=1:ppn=33,walltime=00:01:00
qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
[gbeane at rockhopper ~]$
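
To make the difference concrete, here is a toy shell sketch of the behaviour seen above. It is only a model of what the transcript shows, not TORQUE source code: the function name check_request and its second argument (the largest core count on any node) are made up for illustration. It treats ncpus as an opaque string that is left for the scheduler, and only checks ppn against the node size.

```shell
# Toy model only, NOT TORQUE code.
# check_request is a hypothetical helper:
#   $1 = resource list as given to "qsub -l"
#   $2 = largest number of cores on any single node (assumed known)
check_request() {
    local resources="$1" max_cores="$2" item ppn
    local IFS=','
    for item in $resources; do        # split the resource list on commas
        case "$item" in
            nodes=*ppn=*)
                ppn="${item##*ppn=}"   # text after the last "ppn="
                ppn="${ppn%%[!0-9]*}"  # keep only the leading digits
                if [ "$ppn" -gt "$max_cores" ]; then
                    echo "rejected: cannot locate feasible nodes"
                    return 1
                fi
                ;;
            ncpus=*)
                # Not interpreted here: in this model the value is simply
                # passed along for the scheduler to deal with.
                ;;
        esac
    done
    echo "accepted"
}

check_request "ncpus=33,walltime=00:01:00" 32
check_request "nodes=1:ppn=33,walltime=00:01:00" 32 || true
```

With a node size of 32, the first call is accepted even though no node has 33 cores, and the second is rejected, which mirrors the two qsub attempts in the session above.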

