[torqueusers] torque/moab holding back nodes?

Lippert, Kenneth B. Kenneth.Lippert at alcoa.com
Thu Apr 6 07:03:41 MDT 2006


 I had a similar problem on my (much smaller) cluster of 6 or so
machines with a total of 16 nodes.

Check the NODEALLOCATIONPOLICY parameter in your Moab config.  I am
using Maui, but Moab has the same parameter.  The default was
MINRESOURCE ("minimize resources used"), which packs jobs onto as few
resources as possible; I changed mine to CPULOAD and now all nodes fill
up as they should.
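For reference, in Maui the setting lives in maui.cfg (Moab's moab.cfg
uses the same parameter name). A minimal sketch of the fragment I mean;
everything else in the file stays at its defaults, and the scheduler
needs a restart to pick up the change:

```
# maui.cfg -- fragment only
# CPULOAD allocates jobs to the nodes with the lowest current CPU
# load, spreading work across the cluster instead of packing it
# onto the fewest resources (MINRESOURCE behavior).
NODEALLOCATIONPOLICY  CPULOAD
```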

-k

-----Original Message-----
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Steven A.
DuChene
Sent: Wednesday, April 05, 2006 6:17 PM
To: torqueusers at supercluster.org
Subject: [torqueusers] torque/moab holding back nodes?

On our new 128-node cluster we are attempting to do some acceptance
testing, benchmarking, and system burn-in. I am having issues getting
torque to let me use all the available nodes. I have tried submitting a
single 128-node job and I have also tried submitting a BUNCH of smaller
jobs. The 128-node job (an hpl benchmarking job in this case) is
rejected with "job exceeds queue resource limits" when I try to submit
it, and submitting a large group of mixed smaller jobs (40 or so 8-way
jobs mixed with 40 2-way jobs, all hpl runs) always leaves 5-8 nodes
sitting idle with idle jobs still waiting in the queue.

Our queue and scheduler configuration is a VERY simple default config,
so I don't understand why this is taking place. We have torque-2.0.0p7
and moab-4.5.0p0 installed on this cluster.
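Since the 128-node job is rejected at submit time, the queue's
resource limits are worth dumping directly with Torque's qmgr. A
sketch, assuming a queue named "batch" (substitute your own queue name,
e.g. from qstat -q):

```shell
# Show server-wide settings and every queue's attributes, including
# any resources_max limits that would reject a 128-node submission
qmgr -c 'print server'
qmgr -c 'print queue batch'   # 'batch' is a placeholder queue name

# If resources_max.nodect (or resources_max.nodes) is below 128,
# raising it as the admin user would look like:
# qmgr -c 'set queue batch resources_max.nodect = 128'
```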

Any ideas or suggestions as to what I should be looking at to see why
this is happening?
--
Steven A. DuChene
_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers

