[torqueusers] torque/moab holding back nodes?
Lippert, Kenneth B.
Kenneth.Lippert at alcoa.com
Thu Apr 6 07:03:41 MDT 2006
I had a similar problem on my (much smaller) cluster of 6 or so
machines with a total of 16 nodes.
Check the NODEALLOCATIONPOLICY parameter in your Moab config. I am
using Maui, but there is probably a similar parameter in Moab. The
default was something like "MINRESOURCE"; I changed mine to
"CPULOAD" and now all nodes fill up as they should.
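As a rough sketch, the change amounts to one line in the scheduler
config file. The file name below is the Maui one; that Moab reads the
same parameter from moab.cfg is an assumption, so check the Moab
documentation before copying it.

    # maui.cfg (for Moab, presumably moab.cfg; unverified assumption)
    # Allocate nodes by current CPU load rather than the shipped default.
    NODEALLOCATIONPOLICY CPULOAD

The scheduler reads this file at startup, so restart it after editing.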
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Steven A. DuChene
Sent: Wednesday, April 05, 2006 6:17 PM
To: torqueusers at supercluster.org
Subject: [torqueusers] torque/moab holding back nodes?
On our new 128 node cluster we are attempting to do some acceptance
benchmarking and system burn-in. I am having issues trying to get torque
to let me use all the available nodes. I have tried submitting a 128 node
job and I have also tried submitting a BUNCH of smaller jobs. The 128 node
job (hpl job in this case) tells me "job exceeds queue resource limits"
when I try to submit the job, and submitting a large group of mixed smaller
jobs (40 or so mixed with 40 2-way jobs, all hpl runs) always leaves 5 - 8
idle nodes with idle jobs sitting in the queue.
Our queue and scheduler configuration is a VERY simple default config so
I don't understand why this is taking place. We have torque-2.0.0p7 and
installed on this cluster.
Any ideas or suggestions as to what I should be looking at to see why
this is happening?
Steven A. DuChene