[torqueusers] Adding more system stats.

Roy Dragseth roy.dragseth at cc.uit.no
Wed Oct 20 03:07:18 MDT 2010


I'm trying to create a simple (or as simple as possible) tool that adds more 
system stats to the node report, and I'm wondering what would be the best way 
to do this.

1. Add a new parameter in the mom config:
     systat=/path/to/script.sh
   and let pbsnodes report whatever output comes from script.sh as a string,
   something like:
     systat=iboutkbs=XXX,ibinkbs=YYY,lustrereadkbs=ZZZ,lustrewritekbs=....

or

2. Use the health_check script to report the status as a string, and just   
   parse that output.
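
To make the string format concrete, here is a rough sketch of how such a
comma-separated key=value string could be built and parsed again. This is only
an illustration: format_stats and parse_stats are hypothetical helper names,
the numbers are placeholders, and nothing here is part of torque itself.

```python
def format_stats(stats):
    """Join a dict of counters into 'key=value,key=value' form."""
    return ",".join(f"{key}={value}" for key, value in stats.items())


def parse_stats(line):
    """Split a 'key=value,key=value' string back into a dict of strings."""
    result = {}
    for field in line.strip().split(","):
        if not field:
            continue  # tolerate empty fields such as a trailing comma
        key, sep, value = field.partition("=")
        if not sep:
            raise ValueError(f"field without '=': {field!r}")
        result[key.strip()] = value.strip()
    return result


if __name__ == "__main__":
    # Placeholder values, not real measurements.
    line = format_stats({"iboutkbs": 120, "ibinkbs": 340})
    print(line)
    print(parse_stats(line))
```

Values are kept as strings on purpose, so the consumer decides how to convert
them. A value that itself contains a comma or "=" would need some quoting
scheme, which this sketch does not attempt.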

The latter option is by far the simplest to implement, but are there any 
pitfalls to this approach?  Any limitations to the length of the string? 

We use the health_check already to notify maui of any problems by printing a 
string starting with "ERROR:", but one could use this for normal reporting 
too.
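
If the same channel carried both kinds of messages, the consumer would have to
tell them apart. A minimal sketch, assuming the only convention is the "ERROR:"
prefix mentioned above (classify is a hypothetical name, and how the string
gets from the node to the parser is left out):

```python
ERROR_PREFIX = "ERROR:"


def classify(output):
    """Return ('error', text) for a problem report, ('stats', text) otherwise."""
    text = output.strip()
    if text.startswith(ERROR_PREFIX):
        # Drop the prefix and surrounding whitespace, keep the description.
        return "error", text[len(ERROR_PREFIX):].strip()
    return "stats", text


if __name__ == "__main__":
    print(classify("ERROR: scratch disk full"))
    print(classify("iboutkbs=120,ibinkbs=340"))
```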

Any other ways of achieving the same thing that I'm not aware of?

Currently we're using Ganglia to collect and report all the stats we need, but 
it is rather flaky, and a lot of stats go missing at irregular intervals.  
Torque seems to have a much more solid communication layer.

Regards,
r.

-- 

  The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
	      phone:+47 77 64 41 07, fax:+47 77 64 41 00
        Roy Dragseth, Team Leader, High Performance Computing
	 Direct call: +47 77 64 62 56. email: roy.dragseth at uit.no

