[torqueusers] Re: [Mauiusers] node health check

Alexander Saydakov saydakov at yahoo-inc.com
Tue Nov 14 16:53:11 MST 2006


> -----Original Message-----
> From: torqueusers-bounces at supercluster.org [mailto:torqueusers-
> bounces at supercluster.org] On Behalf Of 'Garrick Staples'
> Sent: Tuesday, November 14, 2006 11:52 AM
> To: torqueusers at supercluster.org
> Subject: [torqueusers] Re: [Mauiusers] node health check
> 
> In MOM's config, $down_on_error can be used to have the MOM set itself
> as "down" if there is an ERROR message from the health check script.

I would suggest taking advantage of the exit code instead of relying on the
message beginning with ERROR. What if things are so out of hand that the
health script cannot even execute? I understand that the server can only read
the message from the MOM, but the MOM is in a better position because it has
the exit code of the script. Why not take a non-zero exit code as an
indication of a problem?
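
To make the idea concrete, here is a rough sketch of the kind of check I
have in mind. It is not how pbs_mom is implemented; the function name
node_is_healthy, the 30-second timeout, and the example path are all
made up for illustration. It treats three things as failure: the script
cannot be run at all, the script exits non-zero, or the script prints a
message beginning with ERROR.

```python
import subprocess
import sys


def node_is_healthy(cmd, timeout=30):
    """Run a health check command and return (healthy, reason).

    Hypothetical helper: the name and the timeout are illustrative only.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Script missing, not executable, or hung: there is no message
        # to parse, so message matching alone would miss this case.
        return False, f"could not run health check: {exc}"

    if result.returncode != 0:
        # A non-zero exit code counts as a problem regardless of output.
        return False, f"exit code {result.returncode}"

    if result.stdout.startswith("ERROR"):
        # Keep the existing convention as an additional signal.
        return False, result.stdout.strip()

    return True, "ok"


if __name__ == "__main__":
    # A check that cannot be executed at all still marks the node unhealthy.
    print(node_is_healthy(["/nonexistent/health_check"]))
    # So does a check that exits non-zero without printing anything.
    print(node_is_healthy([sys.executable, "-c", "raise SystemExit(2)"]))
```

The ERROR prefix still works as before; the exit code and the "could not
run" case are just extra signals that are only visible on the MOM side.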



