
checkjob


checkjob  [ARGS]  <JOBID>
 

Purpose

    Display detailed job state information and diagnostic output for the specified job.
 

Permissions

    This command can be run by any Maui administrator.  Additionally, valid users may use this command to obtain information about their own jobs.
 
Args              Details
-A                provide output in the form of parsable Attribute-Value pairs
-h                display command usage help
-l <POLICYLEVEL>  check job start eligibility subject to the specified throttling policy level.  <POLICYLEVEL> can be one of HARD, SOFT, or OFF
-r <RESID>        check job access to the specified reservation
-v                display verbose job state and eligibility information
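
The flags above can be combined on one command line. The following is a minimal sketch, assuming a hypothetical helper named build_checkjob_args (not part of Maui) that assembles an argument list from the documented flags and rejects a <POLICYLEVEL> other than HARD, SOFT, or OFF. Upper-casing the policy level and the order in which flags are emitted are choices made for this sketch only.

```python
# Hypothetical helper: builds a checkjob argument list from the flags
# documented above. Nothing here is provided by Maui itself.
POLICY_LEVELS = ("HARD", "SOFT", "OFF")


def build_checkjob_args(job_id, verbose=False, policy_level=None,
                        reservation=None, attr_value=False):
    """Return a list such as ['checkjob', '-v', 'job05']."""
    if not job_id:
        raise ValueError("a job ID is required")
    args = ["checkjob"]
    if attr_value:
        args.append("-A")          # parsable Attribute-Value output
    if policy_level is not None:
        level = policy_level.upper()
        if level not in POLICY_LEVELS:
            raise ValueError("policy level must be one of %s" % (POLICY_LEVELS,))
        args += ["-l", level]      # throttling policy level
    if reservation is not None:
        args += ["-r", reservation]  # reservation to check access against
    if verbose:
        args.append("-v")          # verbose state and eligibility output
    args.append(job_id)
    return args


print(" ".join(build_checkjob_args("job05", verbose=True)))
```

The resulting list can be passed to a process launcher such as subprocess.run on a host where checkjob is installed.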

Description

    This command allows any Maui administrator to check the detailed status and resource requirements of a job.  Additionally, this command performs numerous diagnostic checks and determines if and where the job could potentially run.  Diagnostic checks include policy violations (see the Throttling Policy Overview for details), reservation constraints, and job-to-resource mapping.  If a job cannot run, a text reason is provided along with a summary of how many nodes are and are not available.  If the -v flag is specified, a node-by-node summary of resource availability is displayed for idle jobs.

    If a job cannot run, one of the following reasons will be given:
 
Reason                               Description
job has hold in place                one or more job holds are currently in place
insufficient idle procs              not enough idle processors are currently available to start the job
idle procs do not meet requirements  adequate idle processors are available but these do not meet job requirements
start date not reached               job has specified a minimum 'start date' which is still in the future
expected state is not idle           job is in an unexpected state
state is not idle                    job is not in the idle state
dependency is not met                job depends on another job reaching a certain state
rejected by policy                   job start is prevented by a throttling policy
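
A reported reason may carry trailing detail, as in the "insufficient idle procs:  0 available" line of Example 1. The sketch below assumes a hypothetical helper named classify_reason (not part of Maui). It maps a reported reason back to one of the reasons listed above by prefix match, and it assumes the reason text always begins with the listed wording.

```python
# Reasons copied from the table above; the helper itself is hypothetical.
KNOWN_REASONS = (
    "job has hold in place",
    "insufficient idle procs",
    "idle procs do not meet requirements",
    "start date not reached",
    "expected state is not idle",
    "state is not idle",
    "dependency is not met",
    "rejected by policy",
)


def classify_reason(text):
    """Return the listed reason that `text` starts with, or None."""
    # Collapse runs of whitespace so "procs:  0" and "procs: 0" compare alike.
    cleaned = " ".join(text.split()).lower()
    for reason in KNOWN_REASONS:
        if cleaned.startswith(reason):
            return reason
    return None


print(classify_reason("insufficient idle procs:  0 available"))
```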

    If a job cannot run on a particular node, one of the following 'per node' reasons will be given:
 
Class     Node does not allow required job class/queue
CPU       Node does not possess required processors
Disk      Node does not possess required local disk
Features  Node does not possess required node features
Memory    Node does not possess required real memory
Network   Node does not possess required network interface
State     Node is not Idle or Running

The checkjob command displays the following job attributes:
 
Attribute                     Value                                       Description
Account                       <STRING>                                    Name of account associated with job
Actual Run Time               [[[DD:]HH:]MM:]SS                           Length of time job actually ran.  NOTE:  This information is only displayed in simulation mode.
Arch                          <STRING>                                    Node architecture required by job
Class                         [<CLASS NAME> <CLASS COUNT>]                Name of class/queue required by job and number of class initiators required per task
Dedicated Resources Per Task  <XXX>
Disk                          <INTEGER>                                   Amount of local disk required by job (in MB)
Exec Size                     <INTEGER>                                   Size of job executable (in MB)
Executable                    <STRING>                                    Name of job executable
Features                      Square bracket delimited list of <STRING>s  Node features required by job
Group                         <STRING>                                    Name of UNIX group associated with job
Holds                         Zero or more of User, System, and Batch     Types of job holds currently applied to job
Image Size                    <INTEGER>                                   Size of job data (in MB)
Memory                        <INTEGER>                                   Amount of real memory required per node (in MB)
Network                       <STRING>                                    Type of network adapter required by job
Nodecount                     <INTEGER>                                   Number of nodes required by job
Opsys                         <STRING>                                    Node operating system required by job
Partition Mask                ALL or colon delimited list of partitions   List of partitions the job has access to
PE                            <FLOAT>                                     Number of processor-equivalents requested by job
QOS                           <STRING>                                    Quality of Service associated with job
QueueTime                     <TIME>                                      Time job was submitted to resource management system
StartCount                    <INTEGER>                                   Number of times job has been started by Maui
StartPriority                 <INTEGER>                                   Start priority of job
State                         One of Idle, Starting, Running, etc.        Current job state
Total Tasks                   <INTEGER>                                   Number of tasks requested by job
User                          <STRING>                                    Name of user submitting job
WallTime                      [[[DD:]HH:]MM:]SS                           Length of time job has been running
WallTime Limit                [[[DD:]HH:]MM:]SS                           Maximum walltime allowed to job
In the above table, fields marked with an asterisk (*) are only displayed when set or when the -v flag is specified.
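
Several attributes use the [[[DD:]HH:]MM:]SS format, in which the leading fields are optional. The sketch below assumes a hypothetical helper named parse_duration (not part of Maui) and shows one way to convert such a value to seconds by aligning the fields from the right.

```python
def parse_duration(text):
    """Convert a [[[DD:]HH:]MM:]SS string to a number of seconds."""
    parts = text.strip().split(":")
    if not 1 <= len(parts) <= 4 or not all(p.isdigit() for p in parts):
        raise ValueError("not a [[[DD:]HH:]MM:]SS duration: %r" % text)
    # Seconds per day, hour, minute, second; keep only as many trailing
    # multipliers as there are fields, so "MM:SS" uses (60, 1).
    multipliers = (86400, 3600, 60, 1)[-len(parts):]
    return sum(int(p) * m for p, m in zip(parts, multipliers))


print(parse_duration("6:00:00"))  # the WallTime limit from Example 1 -> 21600
```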
 

Examples

Example 1

----
> checkjob -v job05

checking job job05

State: Idle  (User: john  Group: staff  Account: [NONE])
WallTime: 0:00:00  (Limit: 6:00:00)

Submission Time: Mon Mar  2 06:34:04

Total Tasks: 2

Req[0]  TaskCount: 2  Partition: ALL
Network: hps_user  Memory >= 0  Disk >= 0  Features: [NONE]
Opsys: AIX43  Arch: R6000  Class: [batch 1]
ExecSize: 0  ImageSize: 0
Dedicated Resources Per Task: Procs: 1
NodeCount: 0

IWD: [NONE]     Executable:  cmd
QOS: [DEFAULT]  Bypass: 0  StartCount: 0
Partition Mask: ALL
Holds:    Batch
batch hold reason:  Admin
PE:  2.00  StartPriority:  1
job cannot run  (job has hold in place)
job cannot run  (insufficient idle procs:  0 available)
----

Note that the example job cannot be started for two different reasons: a batch hold is in place, and no idle processors are available.
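
When checkjob output is post-processed by a script, the state and the blocking reasons can be pulled out of the text. The sketch below assumes the line layout shown in Example 1 ("State: ..." and "job cannot run  (...)") and uses hypothetical helper names, job_state and blocking_reasons. Other Maui versions may format these lines differently.

```python
import re

# A few lines copied from the Example 1 output above.
SAMPLE = """\
State: Idle  (User: john  Group: staff  Account: [NONE])
WallTime: 0:00:00  (Limit: 6:00:00)
Holds:    Batch
job cannot run  (job has hold in place)
job cannot run  (insufficient idle procs:  0 available)
"""


def job_state(output):
    """Return the word following 'State:', or None if no such line exists."""
    match = re.search(r"^State:\s+(\S+)", output, flags=re.MULTILINE)
    return match.group(1) if match else None


def blocking_reasons(output):
    """Return the text inside the parentheses of each 'job cannot run' line."""
    pattern = r"^job cannot run\s+\((.*)\)[ \t]*$"
    return re.findall(pattern, output, flags=re.MULTILINE)


print(job_state(SAMPLE))
for reason in blocking_reasons(SAMPLE):
    print(reason)
```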

See also:

    diagnose -j - display additional detailed information regarding jobs