[Mauiusers] Unable to requeue jobs with Maui/SLURM
Bjørn-Helge Mevik
b.h.mevik at usit.uio.no
Mon Jun 29 05:11:00 MDT 2009
We are running Maui 3.2.6p21 on top of SLURM v. 2.0.2-0.pre1, and want a
setup where jobs with qos lowpri should be requeued when a job with a
higher priority qos preempts them. However, we seem unable to requeue
jobs, even manually. For instance:
2163 (0) # checkjob 333149
checking job 333149
State: Running
Creds: user:bhm group:users account:staff qos:lowpri
WallTime: 00:00:21 of 00:30:00
SubmitTime: Mon Jun 29 13:02:45
(Time Queued Total: 00:00:04 Eligible: 00:00:04)
StartTime: Mon Jun 29 13:02:49
StartDate: -00:00:25 Mon Jun 29 13:02:45
Total Tasks: 1
Req[0] TaskCount: 1 Partition: normal
Network: [NONE] Memory >= -2147483448 Disk >= 2048M Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [NONE]
Dedicated Resources Per Task: PROCS: 1 MEM: 200M
NodeCount: 1
Allocated Nodes:
[compute-1-26:1]
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 1
PartitionMask: [normal]
Flags: PREEMPTEE
Attr: PREEMPTEE
Reservation '333149' (-00:00:21 -> 00:29:39 Duration: 00:30:00)
PE: 1.00 StartPriority: 100
2164 (0) # mjobctl -R 333149
job 333149 cannot be preempted
SLURM itself, however, considers the job requeuable: scontrol shows
Requeue=1 (marked with *** below), and scontrol requeue 333149 works:
2165 (0) # scontrol show job 333149
JobId=333149 UserId=bhm(10231) GroupId=users(100)
Name=libero
Priority=100000000 Partition=normal BatchFlag=1 Reservation=(null)
AllocNode:Sid=login-0-2:32293 TimeLimit=00:30:00 ExitCode=0:0
JobState=RUNNING StartTime=2009-06-29T13:02:49 EndTime=2009-06-29T13:32:49
NodeList=compute-1-26 NodeListIndices=29-29
AllocCPUs=1
ReqProcs=1 ReqNodes=1 ReqS:C:T=1-64.00K:1-64.00K:1-64.00K
Shared=OK Contiguous=0 CPUs/task=1 Licenses=(null)
MinProcs=1 MinSockets=1 MinCores=1 MinThreads=1
MinMemoryCPU=200 MinTmpDisk=2K Features=(null)
*** Dependency=(null) Account=staff Requeue=1 Restarts=0 ***
Reason=None Network=(null)
ReqNodeList=(null) ReqNodeListIndices=
ExcNodeList=(null) ExcNodeListIndices=
SubmitTime=2009-06-29T13:02:45 SuspendTime=None PreSusTime=0
Command=/xanadu/home/bhm/slurm/libero.sm
WorkDir=/xanadu/home/bhm/slurm
Comment=qos:lowpri
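When comparing what SLURM and Maui each report for a job, it can help to pull individual fields out of the scontrol output with a script. The following is a minimal Python sketch, assuming the output consists of whitespace-separated Key=Value tokens as in the listing above. The function name parse_scontrol_job and the abbreviated sample string are hypothetical and only for illustration.

```python
def parse_scontrol_job(text):
    """Parse `scontrol show job` style output into a dict of fields.

    Assumes whitespace-separated Key=Value tokens. Tokens without '='
    (such as the '***' emphasis markers) are skipped.
    """
    fields = {}
    for token in text.split():
        if "=" not in token:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


# Abbreviated sample copied from the listing above.
sample = """JobId=333149 UserId=bhm(10231) GroupId=users(100)
JobState=RUNNING StartTime=2009-06-29T13:02:49 EndTime=2009-06-29T13:32:49
*** Dependency=(null) Account=staff Requeue=1 Restarts=0 ***
Comment=qos:lowpri"""

job = parse_scontrol_job(sample)
print(job["JobState"], job["Requeue"], job["Comment"])
```

This only reads what SLURM reports. It says nothing about why Maui refuses the requeue.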
The (hopefully) relevant part of maui.cfg is:
------
PREEMPTPOLICY REQUEUE
# low priority things
CLASSCFG[lowpri] QDEF=lowpri QLIST=lowpri PLIST=normal
QOSCFG[lowpri] PRIORITY=0 QFLAGS=RESTARTABLE,PREEMPTEE PLIST=normal PDEF=normal
# normal priority
CLASSCFG[normal] QDEF=normal PLIST=normal
QOSCFG[normal] QFLAGS=PREEMPTOR PRIORITY=1000000
------
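To double-check which flags each QOS actually carries in the file, the QOSCFG lines can be listed with a small script. This is a rough Python sketch, assuming each entry sits on one line in the form QOSCFG[name] KEY=VALUE ... with # starting a comment, as in the excerpt above. The function name qos_flags is hypothetical, and the sketch does not model how Maui itself interprets the file.

```python
import re


def qos_flags(cfg_text):
    """Return {qos_name: [flag, ...]} from the QOSCFG lines in cfg_text."""
    flags = {}
    for line in cfg_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        match = re.match(r"QOSCFG\[(\w+)\]\s+(.*)", line)
        if not match:
            continue
        name, rest = match.groups()
        for token in rest.split():
            key, _, value = token.partition("=")
            if key == "QFLAGS":
                flags[name] = value.split(",")
    return flags


# Excerpt copied from the configuration above.
cfg = """PREEMPTPOLICY REQUEUE
# low priority things
CLASSCFG[lowpri] QDEF=lowpri QLIST=lowpri PLIST=normal
QOSCFG[lowpri] PRIORITY=0 QFLAGS=RESTARTABLE,PREEMPTEE PLIST=normal PDEF=normal
# normal priority
CLASSCFG[normal] QDEF=normal PLIST=normal
QOSCFG[normal] QFLAGS=PREEMPTOR PRIORITY=1000000"""

result = qos_flags(cfg)
for name, qflags in result.items():
    print(name, qflags)
```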
Is there something we are missing?
--
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo