[torqueusers] Adding more system stats.
Roy Dragseth
roy.dragseth at cc.uit.no
Wed Oct 20 03:07:18 MDT 2010
I'm trying to create a simple (or as simple as possible) tool that adds more
system stats to the node report, and I'm wondering what the best way to do
this would be.
1. Add a new parameter in the mom config:
systat=/path/to/script.sh
and let pbsnodes report whatever script.sh prints as a string,
something like
systat=iboutkbs=XXX,ibinkbs=YYY,lustrereadkbs=ZZZ,lustrewritekbs=....
or
2. Use the health_check script to report the status as a string, and just
parse that output.
The latter option is by far the simplest to implement, but are there any
pitfalls to this approach? Is there any limit on the length of the string?
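To make the string format concrete, here is a rough sketch in Python of how such a string could be built on the node and parsed again on the receiving side. The function names and the sample numbers are made up for illustration; only the field names come from the example above. It is not tied to any actual Torque interface.

```python
def format_stats(stats):
    """Join counters into a single 'key=value,key=value' string."""
    return ",".join(f"{name}={value}" for name, value in stats.items())


def parse_stats(text):
    """Split a 'key=value,key=value' string back into a dict of ints."""
    stats = {}
    for field in text.strip().split(","):
        name, sep, value = field.partition("=")
        if not sep:
            continue  # skip empty or malformed fields that have no '='
        stats[name] = int(value)
    return stats


if __name__ == "__main__":
    # Placeholder numbers, not real measurements.
    sample = {
        "iboutkbs": 120,
        "ibinkbs": 95,
        "lustrereadkbs": 300,
        "lustrewritekbs": 40,
    }
    line = format_stats(sample)
    print("systat=" + line)
    assert parse_stats(line) == sample
```

Skipping fields without an "=" is a deliberate choice here, so that a stray or cut-off field does not make the whole report unusable.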
We use the health_check already to notify maui of any problems by printing a
string starting with "ERROR:", but one could use this for normal reporting
too.
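If the health_check output carried both kinds of message, whatever reads it would need to tell them apart. A minimal sketch of that check, assuming a line either starts with "ERROR:" or is a plain stats string (the function name and return format are hypothetical):

```python
ERROR_PREFIX = "ERROR:"


def classify_report(line):
    """Return ('error', message) for problem reports, else ('stats', text)."""
    line = line.strip()
    if line.startswith(ERROR_PREFIX):
        # Drop the prefix and keep only the human-readable message.
        return ("error", line[len(ERROR_PREFIX):].strip())
    return ("stats", line)


if __name__ == "__main__":
    # Example inputs are invented for illustration.
    print(classify_report("ERROR: lustre not mounted"))
    print(classify_report("iboutkbs=120,ibinkbs=95"))
```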
Any other ways of achieving the same thing that I'm not aware of?
Currently we're using ganglia to collect and report all the stats we need, but
it is kind of flaky, and a lot of stats go missing at irregular intervals.
Torque seems to have a much more solid communication layer.
Regards,
r.
--
The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
phone:+47 77 64 41 07, fax:+47 77 64 41 00
Roy Dragseth, Team Leader, High Performance Computing
Direct call: +47 77 64 62 56. email: roy.dragseth at uit.no