[Mauiusers] Error: Standing Reservation cannot be created

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Tue Sep 4 07:07:24 MDT 2007

We're trying to set up a new Standing Reservation (SR) in the Maui maui.cfg
file so that a set of newly installed nodes is reserved for a small group of
test users.

We have an old SR that works perfectly, but the new SR named "switch5" cannot
be created, as shown in maui.log:

09/04 14:30:43 INFO:     MNode[083] 'q083' added to regex list
09/04 14:30:43 INFO:     MNode[084] 'q084' added to regex list
09/04 14:30:43 MSRSetRes(switch5,1,0)
09/04 14:30:43 MJobSetCreds(switch5.0,[ALL],[ALL],[ALL])
09/04 14:30:43 MSRGetAttributes(switch5,0,Start,Duration)
09/04 14:30:43 INFO:     attempting standing reservation of 336 procs in 
09/04 14:30:43 
09/04 14:30:43 INFO:     0 feasible tasks found for job switch5.0:0 in partition DEFAULT (1 Needed)
09/04 14:30:43 ALERT:    cannot select 336 procs in partition '[ALL]' for SR 
09/04 14:30:43 MSRSetRes(switch5,1,1)
09/04 14:30:43 MJobSetCreds(switch5.1,[ALL],[ALL],[ALL])
09/04 14:30:43 MSRGetAttributes(switch5,1,Start,Duration)
09/04 14:30:43 INFO:     reservation not required for specified period
09/04 14:30:43 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
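
To pick the INFO/ALERT messages out of the function-trace lines in a long
maui.log, something like the following minimal Python sketch might help. The
line layout ("MM/DD HH:MM:SS LEVEL: message") is inferred only from the excerpt
above, the wrapped "feasible tasks" line has been joined by hand, and the names
LOG_EXCERPT, MESSAGE_RE and messages() are made up for this illustration.

```python
import re

# A few lines copied from the maui.log excerpt above
# (the wrapped "feasible tasks" line is joined into one line).
LOG_EXCERPT = """\
09/04 14:30:43 MSRSetRes(switch5,1,0)
09/04 14:30:43 MJobSetCreds(switch5.0,[ALL],[ALL],[ALL])
09/04 14:30:43 INFO:     0 feasible tasks found for job switch5.0:0 in partition DEFAULT (1 Needed)
09/04 14:30:43 ALERT:    cannot select 336 procs in partition '[ALL]' for SR
09/04 14:30:43 INFO:     reservation not required for specified period
"""

# Assumed layout: timestamp, upper-case level followed by a colon, message.
# Function-trace lines such as "MSRSetRes(...)" have no "LEVEL:" part
# and therefore do not match.
MESSAGE_RE = re.compile(r"^(\d\d/\d\d \d\d:\d\d:\d\d) ([A-Z]+):\s+(.*)$")


def messages(text, level=None):
    """Return (timestamp, level, message) tuples, optionally for one level."""
    found = []
    for line in text.splitlines():
        match = MESSAGE_RE.match(line)
        if match and (level is None or match.group(2) == level):
            found.append(match.groups())
    return found


alerts = messages(LOG_EXCERPT, "ALERT")
for timestamp, level, message in alerts:
    print(timestamp, level, message)
```

On the excerpt above this leaves only the single ALERT line about the 336 procs.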

Apparently the 84 nodes (4 CPUs each, 336 procs in total) are located
correctly, but I cannot work out the reason for the above ALERT message.
The net result is that the configured SR isn't working, and the new nodes
run production jobs that shouldn't land on them.  This is a big problem
for us :-(
I looked into the code in src/moab/MJob.c without gaining any understanding
of the problem (my fault, of course :-).
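
As a sanity check on the numbers, here is a minimal Python sketch of the
arithmetic behind the "336 procs" in the log. It assumes that the HOSTLIST
value behaves like an ordinary regular expression matched against the whole
node name (Maui's own matching may differ), and that the new nodes are named
q001 through q084; only q083 and q084 actually appear in the log excerpt.
The helper matching_hosts() is hypothetical.

```python
import re

# HOSTLIST pattern copied from the SRCFG[switch5] line in maui.cfg.
HOSTLIST_PATTERN = r"q0[0-9][0-9]"
CPUS_PER_NODE = 4  # 4 CPUs per node, as stated in the post


def matching_hosts(pattern, hostnames):
    """Return the hostnames that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [host for host in hostnames if compiled.fullmatch(host)]


# Assumed naming of the new nodes: q001 .. q084.
new_nodes = [f"q{i:03d}" for i in range(1, 85)]
# Assumed naming of the old Infiniband nodes, derived from p0[012][0-9].
old_nodes = [f"p{i:03d}" for i in range(0, 30)]

matched = matching_hosts(HOSTLIST_PATTERN, new_nodes + old_nodes)
total_procs = len(matched) * CPUS_PER_NODE
print(len(matched), "nodes,", total_procs, "procs")  # 84 nodes, 336 procs
```

Under these assumptions the pattern selects exactly the 84 new nodes and none
of the old ones, which agrees with the 336 procs that Maui tries to reserve.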

Question: Can anyone point out what's wrong with our SRs or with Maui itself?

FYI, we run Torque 2.1.8 and Maui 3.2.6p20.  This is an excerpt from our maui.cfg:

NODESETDELAY            1
NODESETLIST             switch1 switch2 switch3 switch4 switch5 infiniband
# Reservation of the nodes p0XX with Infiniband
SRCFG[infiniband]       HOSTLIST=p0[012][0-9]
SRCFG[infiniband]       PERIOD=INFINITY
SRCFG[infiniband]       NODEFEATURES=infiniband
# Testing of new nodes q0XX
SRCFG[switch5]       HOSTLIST=q0[0-9][0-9]
SRCFG[switch5]       USERLIST=jensj,dulak,ohnielse
SRCFG[switch5]       NODEFEATURES=switch5
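
To compare the working SR with the failing one, a minimal Python sketch like
the following groups the SRCFG lines by reservation name and lists which
attributes each one sets. The parsing is an assumption for illustration only
(one ATTR=VALUE per line, as in the excerpt above); it is not Maui's own config
parser, and parse_srcfg() is a hypothetical helper.

```python
import re
from collections import defaultdict

# SRCFG lines copied from the maui.cfg excerpt above.
MAUI_CFG_EXCERPT = """\
# Reservation of the nodes p0XX with Infiniband
SRCFG[infiniband]       HOSTLIST=p0[012][0-9]
SRCFG[infiniband]       PERIOD=INFINITY
SRCFG[infiniband]       NODEFEATURES=infiniband
# Testing of new nodes q0XX
SRCFG[switch5]       HOSTLIST=q0[0-9][0-9]
SRCFG[switch5]       USERLIST=jensj,dulak,ohnielse
SRCFG[switch5]       NODEFEATURES=switch5
"""

# Assumed line shape: SRCFG[name] <whitespace> ATTR=VALUE
SRCFG_RE = re.compile(r"^SRCFG\[(\w+)\]\s+(\w+)=(\S+)$")


def parse_srcfg(text):
    """Map each reservation name to a dict of its attributes."""
    reservations = defaultdict(dict)
    for line in text.splitlines():
        match = SRCFG_RE.match(line.strip())
        if match:  # comment lines and other parameters are skipped
            name, attr, value = match.groups()
            reservations[name][attr] = value
    return dict(reservations)


srs = parse_srcfg(MAUI_CFG_EXCERPT)
only_old = sorted(set(srs["infiniband"]) - set(srs["switch5"]))
only_new = sorted(set(srs["switch5"]) - set(srs["infiniband"]))
print("only in infiniband:", only_old)  # only in infiniband: ['PERIOD']
print("only in switch5:", only_new)     # only in switch5: ['USERLIST']
```

This only shows where the two definitions differ (PERIOD versus USERLIST);
whether either difference has anything to do with the ALERT is not known.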

Thanks a lot,

Ole Holm Nielsen
Department of Physics, Technical University of Denmark

