[Mauiusers] Unable to requeue jobs with Maui/SLURM

Bjørn-Helge Mevik b.h.mevik at usit.uio.no
Mon Jun 29 05:11:00 MDT 2009


We are running Maui 3.2.6p21 on top of SLURM 2.0.2-0.pre1, and want a
setup in which jobs with QOS lowpri are requeued when a job with a
higher-priority QOS preempts them.  However, we seem unable to requeue
jobs from Maui, even manually.  For instance:

2163 (0) # checkjob 333149


checking job 333149

State: Running
Creds:  user:bhm  group:users  account:staff  qos:lowpri
WallTime: 00:00:21 of 00:30:00
SubmitTime: Mon Jun 29 13:02:45
  (Time Queued  Total: 00:00:04  Eligible: 00:00:04)

StartTime: Mon Jun 29 13:02:49
StartDate: -00:00:25  Mon Jun 29 13:02:45
Total Tasks: 1

Req[0]  TaskCount: 1  Partition: normal
Network: [NONE]  Memory >= -2147483448  Disk >= 2048M  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
Dedicated Resources Per Task: PROCS: 1  MEM: 200M
NodeCount: 1
Allocated Nodes:
[compute-1-26:1]


IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [normal]
Flags:       PREEMPTEE
Attr:        PREEMPTEE

Reservation '333149' (-00:00:21 -> 00:29:39  Duration: 00:30:00)
PE:  1.00  StartPriority:  100

2164 (0) # mjobctl -R 333149

job 333149 cannot be preempted

SLURM's scontrol indicates that SLURM considers the job requeuable (see
the line marked with *** below), and we are able to requeue it with
scontrol requeue 333149:

2165 (0) # scontrol show job 333149
JobId=333149 UserId=bhm(10231) GroupId=users(100)
   Name=libero
   Priority=100000000 Partition=normal BatchFlag=1 Reservation=(null)
   AllocNode:Sid=login-0-2:32293 TimeLimit=00:30:00 ExitCode=0:0
   JobState=RUNNING StartTime=2009-06-29T13:02:49 EndTime=2009-06-29T13:32:49
   NodeList=compute-1-26 NodeListIndices=29-29
   AllocCPUs=1
   ReqProcs=1 ReqNodes=1 ReqS:C:T=1-64.00K:1-64.00K:1-64.00K
   Shared=OK Contiguous=0 CPUs/task=1 Licenses=(null)
   MinProcs=1 MinSockets=1 MinCores=1 MinThreads=1
   MinMemoryCPU=200 MinTmpDisk=2K Features=(null)
***   Dependency=(null) Account=staff Requeue=1 Restarts=0  ***
   Reason=None Network=(null)
   ReqNodeList=(null) ReqNodeListIndices=
   ExcNodeList=(null) ExcNodeListIndices=
   SubmitTime=2009-06-29T13:02:45 SuspendTime=None PreSusTime=0
   Command=/xanadu/home/bhm/slurm/libero.sm
   WorkDir=/xanadu/home/bhm/slurm
   Comment=qos:lowpri 
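
The Requeue flag can also be checked from a script rather than by eye.
The following is only a minimal sketch, assuming the whitespace-separated
key=value layout of the scontrol output shown above; the function name
requeue_flag is made up for illustration and is not part of SLURM or Maui.

```shell
# Hypothetical helper: read `scontrol show job` output on stdin and
# print the value of the Requeue field (nothing if the field is absent).
requeue_flag() {
    awk '{
        # Walk every whitespace-separated field on the line.
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^Requeue=/) {
                sub(/^Requeue=/, "", $i)   # strip the key, keep the value
                print $i
            }
        }
    }'
}
```

With that function defined, scontrol show job 333149 | requeue_flag
should print 1 for the job above, matching the marked line.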


The (hopefully) relevant part of maui.cfg is:

------
PREEMPTPOLICY REQUEUE

# low priority things
CLASSCFG[lowpri] QDEF=lowpri QLIST=lowpri PLIST=normal
QOSCFG[lowpri] PRIORITY=0 QFLAGS=RESTARTABLE,PREEMPTEE PLIST=normal PDEF=normal

# normal priority
CLASSCFG[normal] QDEF=normal PLIST=normal
QOSCFG[normal] QFLAGS=PREEMPTOR PRIORITY=1000000
------

Is there something we are missing?

-- 
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo

