[Mauiusers] Error: Standing Reservation cannot be created

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Tue Sep 4 07:07:24 MDT 2007


We're trying to set up a new Standing Reservation (SR) in Maui's maui.cfg file
so that a set of newly installed nodes is reserved for a small group of
test users.

We have an old SR that works perfectly, but the new SR named "switch5" cannot
be created as shown in the maui.log:

...
09/04 14:30:43 INFO:     MNode[083] 'q083' added to regex list
09/04 14:30:43 INFO:     MNode[084] 'q084' added to regex list
09/04 14:30:43 MSRSetRes(switch5,1,0)
09/04 14:30:43 MJobSetCreds(switch5.0,[ALL],[ALL],[ALL])
09/04 14:30:43 MSRGetAttributes(switch5,0,Start,Duration)
09/04 14:30:43 INFO:     attempting standing reservation of 336 procs in -INFINITY for   INFINITY
09/04 14:30:43 MSRSelectNodeList(switch5.0,switch5,DstNL,NodeCount,00:00:00,ReqNL,12)
09/04 14:30:43 INFO:     0 feasible tasks found for job switch5.0:0 in partition DEFAULT (1 Needed)
09/04 14:30:43 ALERT:    cannot select 336 procs in partition '[ALL]' for SR 'switch5'
09/04 14:30:43 MSRSetRes(switch5,1,1)
09/04 14:30:43 MJobSetCreds(switch5.1,[ALL],[ALL],[ALL])
09/04 14:30:43 MSRGetAttributes(switch5,1,Start,Duration)
09/04 14:30:43 INFO:     reservation not required for specified period
09/04 14:30:43 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
...

Apparently the 84 nodes (4 CPUs each, hence the 336 procs) are matched
correctly, but I cannot see the reason for the ALERT message above.  The net
result is that the configured SR isn't working, and the new nodes run
production jobs that shouldn't land on them.  This is a big problem for us :-(

I looked into the code in src/moab/MJob.c without gaining any understanding
of the problem (my fault, of course :-).

Question: Can anyone point out what's wrong with our SRs or with Maui itself?

FYI, we run Torque 2.1.8 and Maui 3.2.6p20.  This is an excerpt from our maui.cfg:

NODESETPOLICY           ONEOF
NODESETATTRIBUTE        FEATURE
NODESETDELAY            1
NODESETLIST             switch1 switch2 switch3 switch4 switch5 infiniband
NODESETPRIORITYTYPE     BESTFIT
# Reservation of the nodes p0XX with Infiniband
SRCFG[infiniband]       HOSTLIST=p0[012][0-9]
SRCFG[infiniband]       USERLIST=jensj,bligaard,ohnielse,moses,efernand,studt,ibensig,dc
SRCFG[infiniband]       PERIOD=INFINITY
SRCFG[infiniband]       NODEFEATURES=infiniband
# Testing of new nodes q0XX
SRCFG[switch5]       HOSTLIST=q0[0-9][0-9]
SRCFG[switch5]       USERLIST=jensj,dulak,ohnielse
SRCFG[switch5]       PERIOD=INFINITY
SRCFG[switch5]       NODEFEATURES=switch5
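
As a sanity check on the switch5 HOSTLIST pattern, here is a minimal Python
sketch.  It assumes the new nodes are named q001 through q084 (only q083 and
q084 actually appear in the log excerpt) and that the pattern behaves like an
ordinary regular expression applied to the whole hostname; Maui's own matching
may differ in detail.

```python
import re

# Assumption: node names run q001 ... q084; only q083 and q084 are
# confirmed by the maui.log excerpt.
node_names = [f"q{i:03d}" for i in range(1, 85)]

# HOSTLIST pattern copied from SRCFG[switch5] in maui.cfg.
pattern = re.compile(r"q0[0-9][0-9]")

# Keep the nodes whose full hostname matches the pattern.
matched = [name for name in node_names if pattern.fullmatch(name)]

cpus_per_node = 4  # from the post: 4 CPUs per node
total_procs = len(matched) * cpus_per_node

print(len(matched), "nodes matched,", total_procs, "procs")
```

Under these assumptions the pattern selects all 84 nodes, which agrees with
the 336 procs that Maui reports, so the HOSTLIST itself does not look like
the culprit.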


Thanks a lot,
Ole

-- 
Ole Holm Nielsen
Department of Physics, Technical University of Denmark

