[Mauiusers] Dependency jobs get system hold

Ronny T. Lampert telecaadmin at uni.de
Thu Aug 17 04:04:53 MDT 2006


Using torque 2.1.2 + maui-3.2.6p16, jobs that have dependencies suddenly get
a system hold, which can be confusing for the administrator.
Please consider the following two outputs, from checkjob and qstat.

#> checkjob 350236
checking job 350236

State: Hold
Creds:  user:USER  group:GROUP  class:default  qos:low
WallTime: 00:00:00 of 99:23:59:59
SubmitTime: Thu Aug 17 11:33:07
  (Time Queued  Total: 00:16:36  Eligible: 00:00:00)

Total Tasks: 1

Req[0]  TaskCount: 1  Partition: ALL
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]

IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 0
PartitionMask: [ALL]
Attr:        PREEMPTEE

PE:  1.00  StartPriority:  1
cannot select job 350236 for partition DEFAULT (non-idle state 'Hold')

#> qstat -f 350236

Job Id: 350236.SERVER
    Job_Name = e00018
    job_state = H
    queue = default
    server = SERVER
    Checkpoint = u
    ctime = Thu Aug 17 11:33:07 2006
    depend = afterany:350235.SERVER@SERVER
    Hold_Types = s

Looking at checkjob, I see that the job is in the Hold state and assume
that something is wrong with it.
Then I look at Hold_Types in qstat, see "s" (SYSTEM HOLD), and conclude that
something has gone wrong, at least if I overlook the "depend =" line...

Now some questions:

1) Do these jobs follow the usual DEFER routines, with retries and DEFERTIME
checking? Or does maui magically know that this is NOT a deferred job?
*I* would think it is treated as one.

2) I think a USER hold would be much more to the point. Or a new type,

3) Could this somehow be made clearer to the administrator? It would be
great if checkjob just said

"cannot select job 350236 for partition DEFAULT (non-idle state 'Hold') -
5 of 10 job-dependencies not fulfilled" or something.

That would save me (and others?) from wondering, and also from having to
run both qstat AND checkjob by hand.
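
Until checkjob reports this itself, the manual cross-check described above
could be scripted. The following is only a rough sketch in Python, not anything
maui or torque ships: the function names parse_qstat_full and explain_hold are
made up for illustration, it assumes the "    key = value" layout seen in the
qstat -f output above, it assumes that an "s" in Hold_Types means a system
hold, and it ignores attribute values that qstat wraps onto continuation lines.

```python
import re


def parse_qstat_full(text):
    """Collect the indented 'key = value' lines of `qstat -f` output in a dict.

    Simplification: values wrapped onto continuation lines are not rejoined.
    """
    attrs = {}
    for line in text.splitlines():
        m = re.match(r"^\s+([\w.]+)\s=\s(.*)$", line)
        if m:
            attrs[m.group(1)] = m.group(2).strip()
    return attrs


def explain_hold(text):
    """Return a one-line guess at why a job is held (hypothetical helper)."""
    attrs = parse_qstat_full(text)
    if attrs.get("job_state") != "H":
        return "not held"
    holds = attrs.get("Hold_Types", "n")
    depend = attrs.get("depend")
    # Assumption: 's' in Hold_Types marks a system hold; a 'depend' attribute
    # on such a job suggests the hold is only an unmet dependency.
    if "s" in holds and depend:
        return "system hold, probably waiting on dependency: " + depend
    return "held (Hold_Types=%s), no dependency listed" % holds
```

The text passed in would be whatever qstat -f prints for the job, for example
captured with subprocess.run(["qstat", "-f", jobid], capture_output=True,
text=True).stdout. For the job shown above, explain_hold would point at the
afterany dependency instead of leaving only the bare "Hold" state.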

