[torqueusers] RE: Health check script failure and offlining

Smith, Jerry Don II jdsmit at sandia.gov
Sat Dec 3 12:53:55 MST 2005


We wrote a cron script that takes care of this.  But yes, MOAB handles this all on its own, and even allows "triggers" to adjust many things (node state, reservations, etc.).
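
A cron-driven script of this kind could look roughly like the sketch below. It is not the script mentioned above, only a minimal illustration of the idea. It assumes that `pbsnodes -a` prints one record per node (an unindented node name followed by indented `attribute = value` lines, with records separated by blank lines) and that a failed health check shows up as `message=ERROR ...` either inside the `status` line or as its own `message` line. The function names `nodes_with_errors` and `main` are made up for this example.

```python
#!/usr/bin/env python3
"""Sketch: mark nodes offline when their health check reported an ERROR."""
import re
import subprocess

# Assumption: the health check message appears as "message=ERROR ..." inside
# the status line, or as a separate "message = ERROR ..." attribute line.
ERROR_PATTERN = re.compile(r"\bmessage\s*=\s*ERROR")


def nodes_with_errors(pbsnodes_output):
    """Return the names of nodes whose record carries an ERROR message.

    Assumes each record starts with an unindented node name, followed by
    indented attribute lines, and that records are separated by blank lines.
    """
    failed = []
    current = None
    for line in pbsnodes_output.splitlines():
        if not line.strip():
            current = None          # a blank line ends the current record
            continue
        if not line[0].isspace():
            current = line.strip()  # an unindented line names the next node
            continue
        if current and ERROR_PATTERN.search(line) and current not in failed:
            failed.append(current)
    return failed


def main():
    # List all nodes, then offline each one that reported a failure.
    listing = subprocess.run(
        ["pbsnodes", "-a"], capture_output=True, text=True, check=True
    ).stdout
    for node in nodes_with_errors(listing):
        subprocess.run(["pbsnodes", "-o", node], check=True)


if __name__ == "__main__":
    main()
```

Run from cron every few minutes on the server host, something like this would offline a node shortly after its health check starts reporting ERROR. It does not bring nodes back online once the message clears; that step is left to the administrator.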



I have set up a health check script in $PBS/mom_priv/config.  It works
fine in that it sets the 'message' attribute for the problem mom/node when
there is a failure, but how can I get the node's state set to offline
(pbsnodes -o nodeXXX) when the failure occurs?  The manual says that:

  "Cluster schedulers can be configured to adjust a given node's state
   based on this [ERROR message] information."

Perhaps this is only a MOAB feature.
