[torqueusers] Job with high proc count will not schedule

Ken Nielson knielson at adaptivecomputing.com
Thu Mar 4 15:07:50 MST 2010


Jonathan,

What torque calls a node and what Moab or Maui call a node are not the 
same thing. With qrun, the largest number you can request with the -l 
nodes= option is five, because torque sees only 5 nodes in the cluster. 
However, you can get the rest of the cores scheduled with the following 
syntax on qsub:

qsub -l nodes=5:ppn=4 <script>

This tells torque to find 5 nodes with at least 4 free processors and 
schedule those resources when qrun is executed.
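
For example, if what you actually want is 32 cores, and assuming at 
least 4 of your nodes each have 8 or more free cores, a request along 
these lines should also run:

qsub -l nodes=4:ppn=8 <script>

You can check how many cores torque has recorded for each node with 
pbsnodes -a (the np value in its output) or in the server_priv/nodes 
file.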

I hope this helps.

Ken
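
P.S. The resources_available.nodect override you mention below is 
normally set through qmgr; a minimal sketch, assuming you want it at 
the server level (it can also be set on a queue):

qmgr -c "set server resources_available.nodect = 112"

Keep in mind that this only changes what the server will accept at 
submission time; when the job runs, torque still has to map the request 
onto your 5 physical nodes, which is why the nodes:ppn form above is 
the way to reach the other cores.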


Jonathan K Shelley wrote:
> I have a 5-node cluster with 112 cores. I just installed torque 2.4.6. 
> It seems to be working, but when I submit the following:
>
> qsub -I -l nodes=32
> qsub: waiting for job 551.eos.inel.gov to start
>
> I try a qrun and I get the following:
>
> eos:/opt/torque/sbin # qrun 551
> qrun: Resource temporarily unavailable MSG=job allocation request 
> exceeds currently available cluster nodes, 32 requested, 5 available 
> 551.eos.inel.gov
>
> but it never schedules. I saw in the documentation that I needed to 
> set resources_available.nodect to a high number, so I did.
>
> When I run printserverdb I get:
>
>   


