[torqueusers] RE: Health check script failure and offlining
Smith, Jerry Don II
jdsmit at sandia.gov
Sat Dec 3 12:53:55 MST 2005
We wrote a cron script that takes care of this. But yes, MOAB handles this on its own, and it also allows "triggers" to adjust many things (node state, reservations, etc.).
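For anyone who wants to try the cron approach, here is a rough sketch of what such a script could look like. It is not the script we run, just an illustration in Python. It assumes `pbsnodes -a` prints each node name on an unindented line followed by indented `key = value` lines, and that the health check text shows up either as a `message` attribute or as `message=ERROR...` inside the `status` attribute; check the output on your own TORQUE version before relying on it. The function names `parse_pbsnodes` and `nodes_to_offline` are made up for this example.

```python
import shutil
import subprocess
import sys


def parse_pbsnodes(text):
    """Parse `pbsnodes -a` style output into {node_name: {attr: value}}.

    Assumed layout: node name on an unindented line, attributes on
    indented `key = value` lines, blank line between nodes.
    """
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None
            continue
        if not line[0].isspace():
            current = line.strip()
            nodes[current] = {}
        elif current is not None and "=" in line:
            # Split on the first "=" only; values may contain "=" themselves.
            key, _, value = line.partition("=")
            nodes[current][key.strip()] = value.strip()
    return nodes


def nodes_to_offline(nodes):
    """Return nodes that report a health check ERROR and are not yet offline."""
    result = []
    for name, attrs in nodes.items():
        states = attrs.get("state", "").split(",")
        if "offline" in states:
            continue
        # Assumption: the error text is in `message` or embedded in `status`.
        has_error = (
            attrs.get("message", "").startswith("ERROR")
            or "message=ERROR" in attrs.get("status", "")
        )
        if has_error:
            result.append(name)
    return result


def main():
    if shutil.which("pbsnodes") is None:
        print("pbsnodes not found in PATH; nothing to do", file=sys.stderr)
        return
    listing = subprocess.run(
        ["pbsnodes", "-a"], capture_output=True, text=True, check=True
    ).stdout
    for name in nodes_to_offline(parse_pbsnodes(listing)):
        # Same effect as running `pbsnodes -o nodeXXX` by hand.
        subprocess.run(["pbsnodes", "-o", name], check=True)


if __name__ == "__main__":
    main()
```

Run it from cron on the server host every few minutes, as a user who is allowed to change node state. Note that it only offlines nodes; bringing a node back with `pbsnodes -c` after the problem is fixed is left as a manual step.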
I have set up a health check script in $PBS/mom_priv/config. It works
fine in that it sets the 'message' attribute for the problem mom/node when
there is a failure, but how can I get the node's state set to offline
(pbsnodes -o nodeXXX) when the failure occurs? The manual says that:
"Cluster schedulers can be configured to adjust a given node's state
based on this [ERROR message] information."
Perhaps this is only a MOAB feature.