[torqueusers] Job with high proc count will not schedule
Ken Nielson
knielson at adaptivecomputing.com
Thu Mar 4 15:07:50 MST 2010
Jonathan,
What torque calls a node and what Moab or Maui call a node are not the
same thing. When jobs are started with qrun, the largest value you can
request with the -l nodes= option is five, because torque sees only 5
nodes in the cluster. However, you can get the other cores scheduled
by also requesting processors per node (ppn) on qsub:
qsub -l nodes=5:ppn=4 <script>
This tells torque to find 5 nodes with at least 4 free processors each
(20 cores in total) and to allocate those resources when qrun is
executed.
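As a rough sketch of the arithmetic, the total core count of a request
is nodes multiplied by ppn. The helper functions node_request and
total_cores below are hypothetical names used only for illustration;
they are not part of torque. The 4 x 8 split for 32 cores is likewise
an assumption: it only works if four of your nodes each have at least
8 free cores.

```shell
#!/bin/sh
# Hypothetical helpers (not part of torque): build a "-l" resource
# string and compute how many cores it asks for in total.

node_request() {
    # $1 = number of nodes, $2 = processors per node (ppn)
    echo "nodes=$1:ppn=$2"
}

total_cores() {
    # Total cores requested = nodes * ppn
    echo $(( $1 * $2 ))
}

# The request from this reply: 5 nodes with 4 cores each.
echo "qsub -l $(node_request 5 4) <script>   # $(total_cores 5 4) cores"

# One possible way to ask for 32 cores, ASSUMING four nodes each have
# at least 8 free cores (adjust the split to match your hardware).
echo "qsub -l $(node_request 4 8) <script>   # $(total_cores 4 8) cores"
```

The script only prints the qsub command lines; it does not submit
anything, so it is safe to run on a machine without torque installed.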
I hope this helps.
Ken
Jonathan K Shelley wrote:
> I have a 5 node cluster with 112 cores. I just installed torque 2.4.6.
> It seems to be working but when I submit the following.
>
> qsub -I -l nodes=32
> qsub: waiting for job 551.eos.inel.gov to start
>
> I try a qrun and I get the following:
>
> eos:/opt/torque/sbin # qrun 551
> qrun: Resource temporarily unavailable MSG=job allocation request
> exceeds currently available cluster nodes, 32 requested, 5 available
> 551.eos.inel.gov
>
> but it never schedules. I saw in the documentation that I needed to
> set resources_available.nodect to a high number, so I did.
>
> when I run printserverdb I get:
>
>