[Mauiusers] Error: Standing Reservation cannot be created
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Tue Sep 4 07:07:24 MDT 2007
We're trying to set up a new Standing Reservation (SR) in the Maui maui.cfg
file so that a set of newly installed nodes is reserved for a small group of
test users.
We have an old SR that works perfectly, but the new SR named "switch5" cannot
be created, as shown in the maui.log:
...
09/04 14:30:43 INFO: MNode[083] 'q083' added to regex list
09/04 14:30:43 INFO: MNode[084] 'q084' added to regex list
09/04 14:30:43 MSRSetRes(switch5,1,0)
09/04 14:30:43 MJobSetCreds(switch5.0,[ALL],[ALL],[ALL])
09/04 14:30:43 MSRGetAttributes(switch5,0,Start,Duration)
09/04 14:30:43 INFO: attempting standing reservation of 336 procs in -INFINITY for INFINITY
09/04 14:30:43 MSRSelectNodeList(switch5.0,switch5,DstNL,NodeCount,00:00:00,ReqNL,12)
09/04 14:30:43 INFO: 0 feasible tasks found for job switch5.0:0 in partition DEFAULT (1 Needed)
09/04 14:30:43 ALERT: cannot select 336 procs in partition '[ALL]' for SR 'switch5'
09/04 14:30:43 MSRSetRes(switch5,1,1)
09/04 14:30:43 MJobSetCreds(switch5.1,[ALL],[ALL],[ALL])
09/04 14:30:43 MSRGetAttributes(switch5,1,Start,Duration)
09/04 14:30:43 INFO: reservation not required for specified period
09/04 14:30:43 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
...
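When the full maui.log is long, it can help to pull out only the severity-tagged messages. The following is a minimal Python sketch, not part of Maui; the line layout (date, time, severity, colon, message) is assumed from the excerpt above, and the helper name `messages_by_severity` is hypothetical.

```python
import re

# Severity-tagged lines copied from the maui.log excerpt (wrapped lines rejoined).
LOG = """\
09/04 14:30:43 MSRSetRes(switch5,1,0)
09/04 14:30:43 INFO: attempting standing reservation of 336 procs in -INFINITY for INFINITY
09/04 14:30:43 INFO: 0 feasible tasks found for job switch5.0:0 in partition DEFAULT (1 Needed)
09/04 14:30:43 ALERT: cannot select 336 procs in partition '[ALL]' for SR 'switch5'
09/04 14:30:43 INFO: reservation not required for specified period
"""

# Assumed layout: "MM/DD HH:MM:SS SEVERITY: message".
# Function-trace lines such as "MSRSetRes(...)" have no severity tag and are skipped.
ENTRY = re.compile(r"^\d{2}/\d{2} \d{2}:\d{2}:\d{2} ([A-Z]+): (.*)$")


def messages_by_severity(text, severity):
    """Return the message part of every log line tagged with `severity`."""
    found = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match and match.group(1) == severity:
            found.append(match.group(2))
    return found


alerts = messages_by_severity(LOG, "ALERT")
for message in alerts:
    print(message)
```

Run against the real log file, the same function could be called with `open("maui.log").read()`; the file path is an assumption and depends on the local Maui installation.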
Apparently the 84 nodes (4 CPUs each, 336 procs in total) are located
correctly, but the reason for the above ALERT message is incomprehensible!
The net result is that the configured SR isn't working, and the new nodes run
production jobs that shouldn't land on them. This is a big problem for us :-(
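As a sanity check on the numbers in the log, the sketch below recomputes the processor count from the HOSTLIST pattern. It assumes the new nodes are named q001 through q084 (only q083 and q084 are visible in the excerpt), and it uses Python's regex engine, which may not behave exactly like Maui's own HOSTLIST matching.

```python
import re

# Assumption: the new nodes are named q001 .. q084; only q083 and q084
# actually appear in the log excerpt. Each node has 4 CPUs, as stated above.
node_names = [f"q{i:03d}" for i in range(1, 85)]
cpus_per_node = 4

# HOSTLIST pattern copied from the switch5 reservation in maui.cfg.
hostlist = re.compile(r"q0[0-9][0-9]")

# Keep only the node names that the pattern matches in full.
matched = [name for name in node_names if hostlist.fullmatch(name)]
total_procs = len(matched) * cpus_per_node

print(len(matched), total_procs)  # prints "84 336"
```

The result agrees with the "336 procs" that Maui reports, which suggests the host pattern itself selects the intended nodes and the failure happens at a later step.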
I looked into the code in src/moab/MJob.c without gaining any understanding
of the problem (my fault, of course :-).
Question: Can anyone point to what's wrong with our SRs or with Maui itself?
FYI, we run Torque 2.1.8 and Maui 3.2.6p20. This is an excerpt from our maui.cfg:
NODESETPOLICY ONEOF
NODESETATTRIBUTE FEATURE
NODESETDELAY 1
NODESETLIST switch1 switch2 switch3 switch4 switch5 infiniband
NODESETPRIORITYTYPE BESTFIT
# Reservation of the nodes p0XX with Infiniband
SRCFG[infiniband] HOSTLIST=p0[012][0-9]
SRCFG[infiniband] USERLIST=jensj,bligaard,ohnielse,moses,efernand,studt,ibensig,dc
SRCFG[infiniband] PERIOD=INFINITY
SRCFG[infiniband] NODEFEATURES=infiniband
# Testing of new nodes q0XX
SRCFG[switch5] HOSTLIST=q0[0-9][0-9]
SRCFG[switch5] USERLIST=jensj,dulak,ohnielse
SRCFG[switch5] PERIOD=INFINITY
SRCFG[switch5] NODEFEATURES=switch5
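To compare the working SR with the failing one attribute by attribute, the SRCFG lines can be collected into one dictionary per reservation. This is an illustrative Python sketch only; `parse_srcfg` is a hypothetical helper, and it handles just the single `KEY=VALUE` per line form used in the excerpt above.

```python
import re
from collections import defaultdict

# The two reservations from the maui.cfg excerpt (long USERLIST line rejoined).
MAUI_CFG = """\
SRCFG[infiniband] HOSTLIST=p0[012][0-9]
SRCFG[infiniband] USERLIST=jensj,bligaard,ohnielse,moses,efernand,studt,ibensig,dc
SRCFG[infiniband] PERIOD=INFINITY
SRCFG[infiniband] NODEFEATURES=infiniband
SRCFG[switch5] HOSTLIST=q0[0-9][0-9]
SRCFG[switch5] USERLIST=jensj,dulak,ohnielse
SRCFG[switch5] PERIOD=INFINITY
SRCFG[switch5] NODEFEATURES=switch5
"""

# Matches "SRCFG[name] KEY=VALUE" with exactly one attribute per line.
LINE = re.compile(r"^SRCFG\[([^\]]+)\]\s+(\w+)=(\S+)$")


def parse_srcfg(text):
    """Group SRCFG attributes by reservation name."""
    reservations = defaultdict(dict)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            name, key, value = match.groups()
            reservations[name][key] = value
    return dict(reservations)


srs = parse_srcfg(MAUI_CFG)

# Print both reservations side by side, one attribute per row.
for key in sorted(set(srs["infiniband"]) | set(srs["switch5"])):
    print(key, "|", srs["infiniband"].get(key), "|", srs["switch5"].get(key))
```

Both reservations set the same four attributes, so any difference in behaviour has to come from the values (host pattern, user list, node feature) or from how the nodes themselves are configured, not from a missing SRCFG attribute.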
Thanks a lot,
Ole
--
Ole Holm Nielsen
Department of Physics, Technical University of Denmark