[Mauiusers] Help configuring NODESETPOLICY on Maui 3.2.6p19.

Brad Viviano viviano at renci.org
Thu Mar 29 12:13:55 MDT 2007


Hello,
	I have searched the archive and Googled but cannot seem to find the 
answer to my particular problem.  I have a 66-node blade cluster with 
InfiniBand.  Each blade is dual socket, dual core, and there are 10 
blades to a blade center.  I am connecting 20 blades to each 24-port 
InfiniBand switch, so I have 4 IB switches: switches A, B, and C have 20 
ports used on them, and switch D has 6 ports used.  The remaining 4 
ports connect back to a 5th IB switch as the concentrator.  I am trying 
to define a policy such that jobs of 20 nodes or fewer will 
stick to a single IB switch, but partially used switches will still be 
available to be scheduled.  So I have put the following into my maui.cfg:

NODESETPOLICY ONEOF
NODESETPRIORITYTYPE BESTRESOURCE
NODESETATTRIBUTE FEATURE
NODESETDELAY 0
NODESETLIST switchA switchB switchC switchD

I am using Torque as my resource manager, and switchA, switchB, 
switchC, and switchD are set up in Torque's "nodes" file as features. 
If I submit three 16-node jobs, the jobs go to nodes 0-15 (switchA), 
nodes 20-35 (switchB), and nodes 40-55 (switchC), which is exactly what 
I want to happen.  However, when I submit a 4th 16-node job, the 
job sits in the queue until one of the first 3 finishes, even though there 
are still 18 nodes free.
	Looking through the Maui log file I see the following:

03/29 13:40:30 MJobSelectResourceSet(35201,1,1,SetList,NodeList,66)
03/29 13:40:30 INFO:     set[0] 0 0
03/29 13:40:30 INFO:     set[1] 1 80
03/29 13:40:30 INFO:     set[2] 2 80
03/29 13:40:30 INFO:     set[3] 3 80
03/29 13:40:30 INFO:     set[4] 4 24
03/29 13:40:30 INFO:     240 feasible tasks found for job 35201:0 in partition DEFAULT (64 Needed)
03/29 13:40:30 MJobGetINL(35201,FNL,INL,DEFAULT,NodeCount,TaskCount)

This tells me it is seeing the 4 entries from my node set list (4 
processors per blade * 20 blades per switch = 80).  However, it is only 
finding "240 feasible tasks" instead of 264 (4 processors * 66 
nodes), which means it isn't picking up the 24 processors available on 
switchD.  Farther down in the log file I see:

03/29 13:40:30 INFO:     idle resources (48 tasks/12 nodes) found with feasible list specified
03/29 13:40:30 INFO:     insufficient idle tasks in partition DEFAULT for 35201:0: (48 of 64 available)
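
As a sanity check on the numbers in the two log excerpts above, here is a minimal Python sketch of the arithmetic. It uses only the cluster layout described in this post (4 processors per blade; 20, 20, 20, and 6 blades per switch) and does not query Maui or Torque; the variable names are made up for illustration. Note that 4 processors * 66 nodes comes to 264.

```python
# Cluster layout as described in this post (not read from Maui or Torque).
PROCS_PER_BLADE = 4  # dual socket, dual core
BLADES_PER_SWITCH = {"switchA": 20, "switchB": 20, "switchC": 20, "switchD": 6}

# Processors (tasks) per node set; compare with the set[1]..set[4] log lines.
tasks_per_set = {sw: n * PROCS_PER_BLADE for sw, n in BLADES_PER_SWITCH.items()}
total_tasks = sum(tasks_per_set.values())
tasks_without_d = total_tasks - tasks_per_set["switchD"]

# Three 16-node jobs are running, one on each of the full switches.
NODES_PER_JOB = 16
free_nodes = dict(BLADES_PER_SWITCH)
for sw in ("switchA", "switchB", "switchC"):
    free_nodes[sw] -= NODES_PER_JOB

idle_nodes_all = sum(free_nodes.values())
idle_nodes_without_d = idle_nodes_all - free_nodes["switchD"]

print("tasks per set:", tasks_per_set)
print("total tasks:", total_tasks)
print("total tasks without switchD:", tasks_without_d)
print("idle nodes, all switches:", idle_nodes_all)
print("idle nodes, A-C only:", idle_nodes_without_d)
print("idle tasks, A-C only:", idle_nodes_without_d * PROCS_PER_BLADE)
```

The totals without switchD (240 tasks overall, 12 idle nodes / 48 idle tasks) are the ones that match the log, which is consistent with switchD's 6 nodes being left out of the count.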

The other thing is that if I submit jobs of 16, 16, 16, and 12 nodes, 
they will all run immediately.  If I submit 16, 16, 16, 12, and 6, they 
will also all run immediately.  For some reason "switchD" isn't getting 
lumped in with the other switches as available nodes for the job.  I am 
guessing it's a weighting issue because of the difference between "80" 
processors on A, B, C and "24" on D:

03/29 13:40:30 INFO:     set[1] 1 80
03/29 13:40:30 INFO:     set[2] 2 80
03/29 13:40:30 INFO:     set[3] 3 80
03/29 13:40:30 INFO:     set[4] 4 24
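
To make the observed behaviour easier to reason about, here is a small Python sketch that checks which job sizes fit where once the three 16-node jobs are running. It is plain arithmetic on the free-node counts implied by this post (4 free on each of A, B, C and 6 on D); it is not a model of Maui's actual node set algorithm, and `fits_single_switch` is a hypothetical helper.

```python
# Free nodes per switch after three 16-node jobs, per the layout in this post.
free = {"switchA": 4, "switchB": 4, "switchC": 4, "switchD": 6}


def fits_single_switch(nodes, free_nodes):
    """Return the switches that alone have enough free nodes for the job."""
    return [sw for sw, n in free_nodes.items() if n >= nodes]


# Free nodes if only switches A-C are pooled (what the log appears to count).
pooled_abc = free["switchA"] + free["switchB"] + free["switchC"]
pooled_all = sum(free.values())

for job_nodes in (16, 12, 6):
    print(
        job_nodes, "nodes:",
        "single switch =", fits_single_switch(job_nodes, free),
        "| fits A-C pool =", job_nodes <= pooled_abc,
        "| fits all free nodes =", job_nodes <= pooled_all,
    )
```

With these numbers, a 16-node job fits in no single switch and exceeds the 12 nodes pooled from A to C, a 12-node job matches that pool exactly, and a 6-node job fits on switchD alone. That lines up with the 16-node job waiting while the 12- and 6-node jobs start.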

Does anyone have any suggestions on how I would correctly compensate for 
this in maui.cfg?

	Thanks,
		-Brad Viviano

