Moab Workload Manager® for Grids

17.18 Grid Diagnostics and Validation

17.18.1 Peer Management Overview

  • Use mdiag -R to view interface health and performance/usage statistics.
  • Use mrmctl to enable/disable peer interfaces.
  • Use mrmctl -m to dynamically modify/configure peer interfaces.
  • Use mdiag -x to view general grid configuration and system diagnostics.

17.18.2 Peer Diagnostic Overview

  • Use mdiag -R to diagnose general RM interfaces.
  • Use mdiag -S to diagnose general scheduler health.
  • Use mdiag -R -V job <RMID> to diagnose peer-to-peer job migration. Example:
    > mdiag -R -V job peer1
  • Use mdiag -R -V data <RMID> to diagnose peer-to-peer data staging.
  • Use mdiag -R -V cred <RMID> to diagnose peer-to-peer credential mapping.
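
The three peer-to-peer checks above can be run back to back for a single peer. The following is a minimal sketch, not part of Moab: peer_diag_cmds is a hypothetical helper name, and peer1 is the RMID from the example above. It only prints the commands so they can be reviewed before being run on a host where Moab is installed.

```shell
#!/bin/sh
# Hypothetical helper (not a Moab command): print the three peer-to-peer
# diagnostic commands listed above for one resource manager ID (RMID).
peer_diag_cmds() {
    rmid="$1"
    for facet in job data cred; do
        printf 'mdiag -R -V %s %s\n' "$facet" "$rmid"
    done
}

# Print the commands for review; pipe the output to sh to run them.
peer_diag_cmds peer1
```

Printing rather than executing keeps the sketch safe to try on a machine without Moab; on a grid head node the output can be piped to sh once it looks right.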

17.18.3 Peer-to-Peer Communication Issues

If communication between peers is failing, the first step is generally to run mdiag -R. This command reports general resource manager state, along with any detected configuration issues and failures of specific commands. The reported failures can include the following:


Message:
    WARNING: Client not yet initialized.
Description:
    Before peer communication can be established, an initialize message must be exchanged. If this message is displayed, a peer has attempted to communicate without first sending this initialization message.
Recommended Actions:
    Restart the peer.

Message:
    WARNING: client id 'XXX' is unknown'
    ERROR: command 'XXX' cannot be executed from host 'XXX'
    ERROR: user 'XXX' is not authorized to run command 'XXX'
Description:
    A peer whose identity cannot be authenticated has attempted to communicate. Either the peer is unknown, or the peer is known and the secret key is invalid.
Recommended Actions:
    Verify that the requesting client should be attempting to communicate and, if so, verify that a correct corresponding entry exists in the moab-private.cfg file.
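
When scanning a long mdiag -R report, it can help to map each failure line to the recommended action listed above. The following is a minimal sketch under stated assumptions: peer_failure_action is a hypothetical helper name, the patterns match only the fixed portions of the messages shown above (the 'XXX' fields vary), and the sample input line is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not a Moab command): map one line of diagnostic
# output to the recommended action from the messages listed above.
peer_failure_action() {
    case "$1" in
        *"Client not yet initialized"*)
            echo "restart the peer" ;;
        *"client id '"*"' is unknown"* | \
        *"cannot be executed from host"* | \
        *"is not authorized to run command"*)
            echo "verify the peer's entry in moab-private.cfg" ;;
        *)
            echo "no known action" ;;
    esac
}

# Made-up sample line in the shape of the third authentication message.
peer_failure_action "ERROR: user 'bob' is not authorized to run command 'showq'"
```

Lines that match none of the listed messages fall through to the default branch, so unrecognized failures are flagged for manual review instead of being silently ignored.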


If these commands do not provide adequate detail, additional communication diagnostics can be enabled by setting the DROPBADREQUEST client attribute to FALSE as in the following example:

moab-private.cfg
CLIENTCFG[RM:orion] DROPBADREQUEST=FALSE
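
Before relying on the extra diagnostics, it may be worth confirming that the setting is actually present for the peer in question. The following is a minimal sketch: has_dropbadrequest_false is a hypothetical helper name, the match assumes the exact spelling used in the example above, and the demo runs against a temporary file rather than a real moab-private.cfg, whose location depends on the installation.

```shell
#!/bin/sh
# Hypothetical helper (not a Moab command): succeed if the given file has a
# CLIENTCFG[RM:<peer>] line that sets DROPBADREQUEST=FALSE.
# Assumes the exact spelling shown in the example above.
has_dropbadrequest_false() {
    cfg="$1"
    peer="$2"
    grep -Eq "^[[:space:]]*CLIENTCFG\[RM:${peer}][[:space:]].*DROPBADREQUEST=FALSE" "$cfg"
}

# Demo against a temporary file holding the example line from above.
tmp=$(mktemp)
printf 'CLIENTCFG[RM:orion] DROPBADREQUEST=FALSE\n' > "$tmp"
if has_dropbadrequest_false "$tmp" orion; then
    echo "orion: extra communication diagnostics enabled"
else
    echo "orion: DROPBADREQUEST=FALSE not set"
fi
rm -f "$tmp"
```

To check a live system, pass the path of the installation's moab-private.cfg as the first argument in place of the temporary file.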