[Mauiusers] Odd Slurm/Maui interaction problem

Tim Carlson tim.s.carlson at gmail.com
Tue Aug 5 14:46:04 MDT 2008


Cross posting to mauiusers and slurm-dev because I'm not sure where
the problem actually lies.

I am running Slurm 1.3.4 and Maui 3.2.6p21 on a Rocks 5 (x86_64)
cluster. The setup is very simple, with a single partition.

The problem appears when I try to submit a job with sbatch and specify
a begin time in the future.

For example, suppose I submit the following job at 13:35:

#!/bin/sh
#SBATCH -t 10
#SBATCH -N 1
#SBATCH --begin=13:40
date
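
To reproduce this without editing the begin time by hand, the job
script can be generated with a begin time a few minutes ahead. This is
only a minimal sketch, assuming bash and GNU date; the file name
delayed_job.sh and the five-minute offset are arbitrary choices for
illustration, not part of the original setup.

```shell
#!/bin/bash
# Sketch: generate the test job with a begin time five minutes from now.
# Assumes GNU date (for -d); delayed_job.sh is an arbitrary file name.
begin_time=$(date -d '+5 minutes' +%H:%M)
script=delayed_job.sh

# Write the same job as above, substituting the computed begin time.
cat > "$script" <<EOF
#!/bin/sh
#SBATCH -t 10
#SBATCH -N 1
#SBATCH --begin=$begin_time
date
EOF

echo "Wrote $script with --begin=$begin_time"

# Submit only where sbatch is actually installed.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$script"
fi
```

After submitting, "checkjob <jobid>" can be run before and after the
begin time to compare the reported state.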

Maui initially shows this job in a deferred state:

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

854                     tim   Deferred     1    00:10:00  Tue Aug  5 13:35:14

But the reason for being in a deferred state seems strange. Here is
the output of "checkjob 854". You'll see an RMFailure().

--------------------------------------------------------
State: Idle  EState: Deferred
Creds:  user:tim  group:users  account:mscfcans  qos:DEFAULT
WallTime: 00:00:00 of 00:10:00
SubmitTime: Tue Aug  5 13:35:14
  (Time Queued  Total: 00:01:36  Eligible: 00:00:12)

StartDate: -00:01:23  Tue Aug  5 13:35:27
Total Tasks: 1

Req[0]  TaskCount: 1  Partition: ALL
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
NodeCount: 1


IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [maui]
job is deferred.  Reason:  RMFailure  ()
Holds:    Defer  (hold reason:  RMFailure)
PE:  1.00  StartPriority:  199797
cannot select job 854 for partition maui (job hold active)
--------------------------------------------------------
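
The timestamps in this output can be cross-checked with a little
arithmetic. The sketch below, assuming bash, adds the "Time Queued"
total to the SubmitTime and the StartDate offset to the StartDate
timestamp. The helper names to_secs and to_hms are made up for this
illustration.

```shell
#!/bin/bash
# Convert HH:MM:SS to seconds since midnight.
# The 10# prefix forces base 10 so values such as 08 are not read as octal.
to_secs() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Convert seconds since midnight back to HH:MM:SS.
to_hms() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

submit=$(to_secs 13:35:14)   # SubmitTime
queued=$(to_secs 00:01:36)   # Time Queued Total
start=$(to_secs 13:35:27)    # StartDate timestamp
offset=$(to_secs 00:01:23)   # StartDate offset (shown as -00:01:23)
begin=$(to_secs 13:40:00)    # requested --begin time

check_from_submit=$(to_hms $(( submit + queued )))
check_from_start=$(to_hms $(( start + offset )))

echo "checkjob time from SubmitTime: $check_from_submit"
echo "checkjob time from StartDate:  $check_from_start"

if (( start < begin )); then
    echo "StartDate precedes the requested --begin time"
fi
```

Both calculations give 13:36:50 as the time checkjob was run, and the
StartDate of 13:35:27 is earlier than the 13:40 begin time. Together
with "StartCount: 1", one possible reading is that Maui tried to start
the job before Slurm would allow it to begin and deferred it when that
attempt failed. This is a guess from the output, not something
confirmed.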

If I wait until 13:45, after the requested begin time, and run
checkjob again, I get the same output, including the RMFailure message.

Has anyone seen something like this before? Are there any known
problems with using "--begin" in sbatch while using Maui as the
scheduler?

Thanks!

