checkjob
checkjob [ARGS] <JOBID>
Purpose
Display detailed job state information and diagnostic
output for specified job.
Permissions
This command can be run by any Maui administrator.
Additionally, valid users may use this command to obtain information about
their own jobs.
| Args | Details |
| -A | provide output in the form of parsable Attribute-Value pairs |
| -h | display command usage help |
| -l <POLICYLEVEL> | check job start eligibility subject to specified throttling policy level. <POLICYLEVEL> can be one of HARD, SOFT, or OFF |
| -r <RESID> | check job access to specified reservation |
| -v | display verbose job state and eligibility information |
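As an illustration of the synopsis `checkjob [ARGS] <JOBID>`, the following Python sketch assembles an argument list from the flags in the table above. The helper name `build_checkjob_argv` and its keyword parameters are hypothetical, and whether the flags can be freely combined is an assumption not confirmed here. The sketch only builds the command line; it does not run checkjob.

```python
import shlex

# Policy levels accepted by -l, as listed in the table above.
VALID_POLICY_LEVELS = {"HARD", "SOFT", "OFF"}


def build_checkjob_argv(job_id, verbose=False, policy_level=None,
                        reservation=None, parsable=False):
    """Return an argument list of the form: checkjob [ARGS] <JOBID>.

    Hypothetical helper: combining several flags in one call is an assumption.
    """
    if not job_id:
        raise ValueError("a job ID is required")
    argv = ["checkjob"]
    if parsable:
        argv.append("-A")
    if policy_level is not None:
        level = policy_level.upper()
        if level not in VALID_POLICY_LEVELS:
            raise ValueError(f"policy level must be one of HARD, SOFT, OFF: {policy_level!r}")
        argv += ["-l", level]
    if reservation is not None:
        argv += ["-r", reservation]
    if verbose:
        argv.append("-v")
    argv.append(job_id)  # the job ID always comes last
    return argv


if __name__ == "__main__":
    # Prints: checkjob -l SOFT -v job05
    print(shlex.join(build_checkjob_argv("job05", verbose=True, policy_level="soft")))
```

On a host where checkjob is installed, the resulting list could be passed to `subprocess.run`; building the list separately keeps the flag validation testable without a scheduler.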
Description
This command allows any Maui administrator to check
the detailed status and resource requirements of a job. Additionally,
this command performs numerous diagnostic checks and determines if and
where the job could potentially run. Diagnostic checks include policy
violations (see the Throttling Policy Overview
for details), reservation constraints, and job-to-resource mapping.
If a job cannot run, a text reason is provided along with a summary of
how many nodes are and are not available. If the -v flag is
specified, a node-by-node summary of resource availability is displayed
for idle jobs.
If a job cannot run, one of the following reasons
will be given:
| Reason | Description |
| job has hold in place | one or more job holds are currently in place |
| insufficient idle procs | |
| idle procs do not meet requirements | adequate idle processors are available but these do not meet job requirements |
| start date not reached | job has specified a minimum 'start date' which is still in the future |
| expected state is not idle | job is in an unexpected state |
| state is not idle | job is not in the idle state |
| dependency is not met | job depends on another job reaching a certain state |
| rejected by policy | job start is prevented by a throttling policy |
If a job cannot run on a particular node, one of
the following 'per node' reasons will be given:
| Class | Node does not allow required job class/queue |
| CPU | Node does not possess required processors |
| Disk | Node does not possess required local disk |
| Features | Node does not possess required node features |
| Memory | Node does not possess required real memory |
| Network | Node does not possess required network interface |
| State | Node is not Idle or Running |
The checkjob command displays the following job attributes:
| Attribute | Value | Description |
| Account | <STRING> | Name of account associated with job |
| Actual Run Time | [[[DD:]HH:]MM:]SS | Length of time job actually ran. NOTE: This information is only displayed in simulation mode. |
| Arch | <STRING> | Node architecture required by job |
| Class | [<CLASS NAME> <CLASS COUNT>] | Name of class/queue required by job and number of class initiators required per task |
| Dedicated Resources Per Task | <XXX> | |
| Disk | <INTEGER> | Amount of local disk required by job (in MB) |
| Exec Size | <INTEGER> | Size of job executable (in MB) |
| Executable | <STRING> | Name of job executable |
| Features | Square bracket delimited list of <STRING>s | Node features required by job |
| Group | <STRING> | Name of UNIX group associated with job |
| Holds | Zero or more of User, System, and Batch | Types of job holds currently applied to job |
| Image Size | <INTEGER> | Size of job data (in MB) |
| Memory | <INTEGER> | Amount of real memory required per node (in MB) |
| Network | <STRING> | Type of network adapter required by job |
| Nodecount | <INTEGER> | Number of nodes required by job |
| Opsys | <STRING> | Node operating system required by job |
| Partition Mask | ALL or colon delimited list of partitions | List of partitions the job has access to |
| PE | <FLOAT> | Number of processor-equivalents requested by job |
| QOS | <STRING> | Quality of Service associated with job |
| QueueTime | <TIME> | Time job was submitted to resource management system |
| StartCount | <INTEGER> | Number of times job has been started by Maui |
| StartPriority | <INTEGER> | Start priority of job |
| State | One of Idle, Starting, Running, etc. | Current job state |
| Total Tasks | <INTEGER> | Number of tasks requested by job |
| User | <STRING> | Name of user submitting job |
| WallTime | [[[DD:]HH:]MM:]SS | Length of time job has been running |
| WallTime Limit | [[[DD:]HH:]MM:]SS | Maximum walltime allowed to job |
In the above table, fields marked with an asterisk (*) are
only displayed when set or when the -v flag is specified.
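Several attributes above use the `[[[DD:]HH:]MM:]SS` format. As a minimal sketch, assuming every component is a plain non-negative integer, such a value can be converted to seconds as follows. The function name `parse_duration` is hypothetical and is not part of Maui.

```python
def parse_duration(text):
    """Convert a [[[DD:]HH:]MM:]SS string to a number of seconds.

    Assumes each colon-separated component is a non-negative integer.
    """
    parts = text.strip().split(":")
    if not 1 <= len(parts) <= 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a [[[DD:]HH:]MM:]SS value: {text!r}")
    # Reading right to left, the units are seconds, minutes, hours, days.
    multipliers = [1, 60, 3600, 86400]
    return sum(int(p) * m for p, m in zip(reversed(parts), multipliers))


if __name__ == "__main__":
    print(parse_duration("6:00:00"))  # 21600 (a six-hour walltime limit)
```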
Examples
Example 1
----
> checkjob -v job05
checking job job05
State: Idle (User: john Group: staff Account: [NONE])
WallTime: 0:00:00 (Limit: 6:00:00)
Submission Time: Mon Mar 2 06:34:04
Total Tasks: 2
Req[0] TaskCount: 2 Partition: ALL
Network: hps_user Memory >= 0 Disk >= 0 Features: [NONE]
Opsys: AIX43 Arch: R6000 Class: [batch 1]
ExecSize: 0 ImageSize: 0
Dedicated Resources Per Task: Procs: 1
NodeCount: 0
IWD: [NONE] Executable: cmd
QOS: [DEFAULT] Bypass: 0 StartCount: 0
Partition Mask: ALL
Holds: Batch
batch hold reason: Admin
PE: 2.00 StartPriority: 1
job cannot run (job has hold in place)
job cannot run (insufficient idle procs: 0 available)
----
Note that the example job cannot be started for two different reasons:
- It has a batch hold in place.
- There are no idle resources currently available.
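When checkjob output has been saved to a file or captured by a script, the blocking reasons can be pulled out mechanically. The sketch below is illustrative only: it assumes each reason appears on a single line of the form `job cannot run (<reason>)`, as in Example 1, and the name `blocking_reasons` is hypothetical. Other Maui versions may format these lines differently.

```python
import re

# Tail of the Example 1 output, with each reason on one line.
SAMPLE_OUTPUT = """\
PE: 2.00 StartPriority: 1
job cannot run (job has hold in place)
job cannot run (insufficient idle procs: 0 available)
"""

# Assumed line format: "job cannot run (<reason>)".
REASON_PATTERN = re.compile(r"^job cannot run[ \t]+\((.*)\)[ \t]*$", re.MULTILINE)


def blocking_reasons(output):
    """Return the reasons reported on 'job cannot run (...)' lines, in order."""
    # Collapse runs of whitespace so column padding does not leak into the result.
    return [" ".join(reason.split()) for reason in REASON_PATTERN.findall(output)]


if __name__ == "__main__":
    for reason in blocking_reasons(SAMPLE_OUTPUT):
        print(reason)
```

For the sample above this yields the two reasons listed in the note: the hold, and the lack of idle processors.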
See also:
diagnose -j - display additional detailed information regarding jobs