[Mauiusers] Conflict between standing reservations and suspended jobs

Nate Crawford nathan.crawford at chemie.uni-karlsruhe.de
Mon Mar 6 12:48:16 MST 2006


I've been having serious difficulties with Maui
(3.2.6p14-snap.1138394201) and Torque (2.0.0p8) on a small Opteron
cluster (SuSE 9.3).  In general, when a low-priority job is suspended
through preemption by a high-priority job, Maui ignores the suspended
job when it recalculates SPACEFLEX standing reservations in later
scheduling cycles.

  As an example, suppose there are two SPACEFLEX reservations, each
taking up an entire dual-processor node, and three types of jobs:

A) preemptee, can run in reservation 1
B) neither preemptor nor preemptee, can run in reservation 1 and 2
C) preemptor, can run in reservation 1 and 2

At time 0, a node is running job A and job B, with reservation 1
assigned to it.

At time 1, job C has gained sufficient priority points to suspend job A.
At that point, or shortly thereafter, the node is assigned to both
reservation 1 and 2, which should not be possible.

At time 2, the node has job A suspended, with B and C running, but now
only has reservation 2 assigned.  Job A is now locked into a node that
is not reserved for it.

At time 3, another job of type B is queued and is scheduled to start
when job C ends.  That job would take the processor C frees, so job A
stays suspended.  This effectively continues the preemption of job A,
even though B is not normally able to suspend A.
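
The timeline above can be written down as a small consistency check. This is only an illustrative Python sketch, not Maui code: the names NodeState, violations and can_resume are hypothetical, the access table restates the three job types from the example, and the time 3 entry is a projection that assumes reservation 2 is still the only reservation on the node once job C has ended.

```python
from dataclasses import dataclass, field

PROCS_PER_NODE = 2  # dual-processor nodes, as in the example

# Reservations each job type may run in (job types A, B, C from the example).
ACCESS = {"A": {1}, "B": {1, 2}, "C": {1, 2}}

@dataclass
class NodeState:
    reservations: set                      # reservation numbers placed on the node
    running: list                          # job types currently running
    suspended: list = field(default_factory=list)

def violations(state):
    """List the ways a node state contradicts the intended reservation setup."""
    problems = []
    # Each reservation is meant to take up an entire node.
    if len(state.reservations) > 1:
        problems.append("node is in more than one whole-node reservation")
    # Every job on the node, suspended or not, needs a reservation it may use.
    for job in state.running + state.suspended:
        if not (ACCESS[job] & state.reservations):
            problems.append(f"job {job} is on a node with no reservation it can use")
    return problems

def can_resume(state):
    """A suspended job can only resume if a processor is free."""
    return len(state.running) < PROCS_PER_NODE

timeline = {
    0: NodeState({1}, ["A", "B"]),
    1: NodeState({1, 2}, ["B", "C"], ["A"]),
    2: NodeState({2}, ["B", "C"], ["A"]),
    # Projection (assumption): C has ended and the queued B job took its slot.
    3: NodeState({2}, ["B", "B"], ["A"]),
}

for t, state in timeline.items():
    print(t, violations(state), "A can resume:", can_resume(state))
```

Only time 0 passes both checks. Time 1 fails the whole-node check, and times 2 and 3 leave job A on a node with no reservation it can use and no free processor to resume on.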


  How do I fix this?  

Here's the relevant part of my maui.cfg.
  Job A is group ck with class prefinity.
  Job B is group cb with class long.
  Job C is group ck with class short.

QOSCFG[high]  QFLAGS=PREEMPTOR
QOSCFG[med]
QOSCFG[low] QFLAGS=PREEMPTEE

CLASSCFG[infinity]      QDEF=med
CLASSCFG[verylong]      QDEF=med
CLASSCFG[long]          QDEF=med
CLASSCFG[medium]        WCOVERRUN=00:30:00  QDEF=high QLIST=high^
CLASSCFG[short]         WCOVERRUN=00:05:00  QDEF=high QLIST=high^
CLASSCFG[prefinity]     MAX.PROC=1 QDEF=low QLIST=low^

# Reservation 1
SRCFG[dayjobs] STARTTIME=8:00:00 ENDTIME=18:00:00
SRCFG[dayjobs] PERIOD=DAY DAYS=MON,TUE,WED,THU,FRI DEPTH=3
SRCFG[dayjobs] FLAGS=SPACEFLEX
SRCFG[dayjobs] CLASSLIST=short-,medium-,prefinity+
SRCFG[dayjobs] GROUPLIST=cb-
SRCFG[dayjobs] TASKCOUNT=4 RESOURCES=PROCS:1;MEM:3750
SRCFG[dayjobs] TPN=2
SRCFG[dayjobs] ACCESS=DEDICATED

# Reservation 2
SRCFG[anorgjobs] STARTTIME=00:00:00 ENDTIME=00:00:00
SRCFG[anorgjobs] PERIOD=DAY DAYS=ALL DEPTH=2
SRCFG[anorgjobs] OWNER=GROUP:cb
SRCFG[anorgjobs] FLAGS=SPACEFLEX
SRCFG[anorgjobs] CLASSLIST=short-,medium-
SRCFG[anorgjobs] GROUPLIST=cb+
SRCFG[anorgjobs] TASKCOUNT=4 RESOURCES=PROCS:1;MEM:3750
SRCFG[anorgjobs] TPN=2
SRCFG[anorgjobs] ACCESS=DEDICATED
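
For reference, the preemption role of each example job can be read off the QOSCFG and CLASSCFG lines above. The following Python sketch only restates that lookup. It assumes each job simply receives its class's default QoS (QDEF) and does not try to model QLIST or any other Maui behaviour; the function name preemption_role is hypothetical.

```python
# QoS flags and class defaults copied from the maui.cfg excerpt.
QOS_FLAGS = {"high": "PREEMPTOR", "med": None, "low": "PREEMPTEE"}
CLASS_QDEF = {
    "infinity": "med",
    "verylong": "med",
    "long": "med",
    "medium": "high",
    "short": "high",
    "prefinity": "low",
}

# Example jobs as described in the post: label -> (group, class).
JOBS = {"A": ("ck", "prefinity"), "B": ("cb", "long"), "C": ("ck", "short")}

def preemption_role(job_class):
    """Return the QFLAGS value of the class's default QoS, or 'neither'."""
    qos = CLASS_QDEF[job_class]  # assumption: the job gets the class QDEF
    return QOS_FLAGS[qos] or "neither"

for label, (group, job_class) in JOBS.items():
    print(label, group, job_class, preemption_role(job_class))
```

Under that assumption, job A comes out as a preemptee, job B as neither, and job C as a preemptor, which matches the three job types in the example.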


Thanks for any help,
Nate

__________________________________
Dr. Nathan Crawford
Theoretische Chemie
Universität Karlsruhe

nathan.crawford at chemie.uni-karlsruhe.de