[Mauiusers] Odd Slurm/Maui interaction problem
Tim Carlson
tim.s.carlson at gmail.com
Tue Aug 5 14:46:04 MDT 2008
Cross-posting to mauiusers and slurm-dev because I'm not sure where
the problem actually lies.
I am running Slurm 1.3.4 and Maui 3.2.6p21 on a Rocks 5 (x86_64)
cluster. The setup is very simple, with one partition.
The problem appears when I try to submit a job with sbatch and specify
a begin time in the future.
For example, I submit the following job at 13:35:
#!/bin/bash
#SBATCH -t 10
#SBATCH -N 1
#SBATCH --begin=13:40
date
Maui initially shows this job in a deferred state:
BLOCKED JOBS----------------
JOBNAME  USERNAME  STATE     PROC  WCLIMIT   QUEUETIME
854      tim       Deferred  1     00:10:00  Tue Aug 5 13:35:14
But the reason for being in a deferred state seems strange. Here is
the output of "checkjob 854". You'll see an RMFailure().
--------------------------------------------------------
State: Idle EState: Deferred
Creds: user:tim group:users account:mscfcans qos:DEFAULT
WallTime: 00:00:00 of 00:10:00
SubmitTime: Tue Aug 5 13:35:14
(Time Queued Total: 00:01:36 Eligible: 00:00:12)
StartDate: -00:01:23 Tue Aug 5 13:35:27
Total Tasks: 1
Req[0] TaskCount: 1 Partition: ALL
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [NONE]
NodeCount: 1
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 1
PartitionMask: [maui]
job is deferred. Reason: RMFailure ()
Holds: Defer (hold reason: RMFailure)
PE: 1.00 StartPriority: 199797
cannot select job 854 for partition maui (job hold active)
--------------------------------------------------------
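A quick sanity check on the timestamps in that output, as a rough sketch. It assumes the StartDate line records when Maui attempted to start the job, which is my reading of the output and not something I have confirmed. The to_secs helper is just a name made up for this example.

```shell
#!/bin/sh
# Convert HH:MM:SS to seconds since midnight.
# Leading zeros are stripped so that values like "08" are not read as octal.
to_secs() {
    h=${1%%:*}
    rest=${1#*:}
    m=${rest%%:*}
    s=${rest#*:}
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

submit=$(to_secs 13:35:14)   # SubmitTime from checkjob
attempt=$(to_secs 13:35:27)  # StartDate from checkjob (assumed to be the start attempt)
begin=$(to_secs 13:40:00)    # --begin value from the job script

echo "start attempt after submit: $(( attempt - submit ))s"   # prints 13s
echo "start attempt before begin: $(( begin - attempt ))s"    # prints 273s
```

If that reading is right, Maui tried to start the job about 13 seconds after submission and roughly four and a half minutes before the requested begin time, which would fit with "StartCount: 1" and the "Eligible: 00:00:12" figure.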
If I wait until 13:45, after the requested begin time, and run checkjob
again, I get the same output, including the RMFailure message.
Has anyone seen something like this before? Are there known problems
with using "--begin" in sbatch while using Maui as the scheduler?
Thanks!