[torqueusers] Warewulf NHC 1.2.2 Release

Michael Jennings mej at lbl.gov
Tue Jan 22 18:23:53 MST 2013


Due primarily to a couple of pretty important bug fixes, I've gone
ahead and released version 1.2.2 of Warewulf NHC.  I'm hoping to get at
least one more release after this one out the door before MoabCon
2013.  :-)

Warewulf Node Health Check, for anyone not familiar with it, is an
effort to create a framework and implementation for the node health
check scripts often used by resource managers and schedulers as well
as for periodic independent node sanity checks.  More information and
complete documentation may be found at:

http://warewulf.lbl.gov/trac/wiki/Node%20Health%20Check

If you want to skip right to the packages, here's the direct link to
the tarball.  RPMs for RHEL5 and RHEL6, as well as yum repositories
for each, are available as well.

http://warewulf.lbl.gov/downloads/releases/warewulf-nhc/warewulf-nhc-1.2.2.tar.gz
http://warewulf.lbl.gov/downloads/repo/rhel5/
http://warewulf.lbl.gov/downloads/repo/rhel6/


CHANGES:

 - The watchdog timer is more reliable now and should work with all
   supported versions of bash.
 - The signal handlers were cleaned up and a couple of minor issues
   were fixed.
 - NEW CHECK:  check_hw_mcelog() was implemented as requested and
   discussed on the torqueusers mailing list.
 - NEW FEATURE:  pdsh-style node range support has been added thanks
   to a patch from John Hanks <john.hanks at usu.edu>.  You can now
   specify one or more comma-separated node range expressions for
   matching nodes.  The whole thing must be surrounded by braces to
   avoid conflicts with globbing syntax.  A couple of examples:
     {node[00-99]}     || check_hw_cpuinfo 2 12 24
     {n0,n3,n[5-8]}    || check_hw_cpuinfo 2 8 8
     {n[0-123].cchem}  || check_ps_daemon sshd root
   Note that you can't have commas *inside* the brackets, nor can you
   have more than one bracketed range per subexpression.
 - NEW FEATURE:  LDAP/NIS/NIS+/etc. support.  If direct passwd file
   lookup for a userid or UID fails, NHC will now fall back to using
   "getent" to do the resolution.  Parsing sources other than
   /etc/passwd is also supported.  See the online docs for more
   details.
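
To make the node range rules above concrete, here is a minimal shell
sketch of how such an expression could be matched against a hostname.
It is NOT the code that ships in NHC; the function name
noderange_match is hypothetical, and the treatment of zero-padded
bounds (a padded lower bound such as 00 fixes the digit width) is an
assumption modeled on the usual pdsh convention.  It honors the two
restrictions noted above: no commas inside brackets, and at most one
bracketed range per subexpression.

```shell
# Hypothetical helper, not NHC's implementation.
# Usage: noderange_match HOST '{expr[,expr...]}'
# Returns 0 if HOST matches any comma-separated subexpression.
# The body runs in a subshell so IFS and "set -f" do not leak out.
noderange_match() (
    host=$1
    list=${2#"{"}            # strip the leading brace
    list=${list%"}"}         # strip the trailing brace
    IFS=,                    # split subexpressions on commas
    set -f                   # keep [..] from being glob-expanded
    for sub in $list; do
        case $sub in
            *"["*-*"]"*)
                # Split "prefix[lo-hi]suffix" into its parts.
                prefix=${sub%%"["*}
                rest=${sub#*"["}
                range=${rest%%"]"*}
                suffix=${rest#*"]"}
                lo=${range%%-*}
                hi=${range#*-}
                # The host must start with prefix and end with suffix.
                case $host in
                    "$prefix"*"$suffix") ;;
                    *) continue ;;
                esac
                num=${host#"$prefix"}
                num=${num%"$suffix"}
                # What remains in the middle must be all digits.
                case $num in
                    ''|*[!0-9]*) continue ;;
                esac
                # Assumption: a zero-padded lower bound fixes the width;
                # otherwise padded host numbers are rejected.
                case $lo in
                    0?*) [ "${#num}" -eq "${#lo}" ] || continue ;;
                    *)   case $num in 0?*) continue ;; esac ;;
                esac
                # "test" compares in decimal, so 07 is not read as octal.
                if [ "$num" -ge "$lo" ] && [ "$num" -le "$hi" ]; then
                    exit 0
                fi
                ;;
            *)
                # No brackets: plain string comparison.
                [ "$host" = "$sub" ] && exit 0
                ;;
        esac
    done
    exit 1
)
```

With this sketch, "noderange_match n7 '{n0,n3,n[5-8]}'" succeeds and
"noderange_match n4 '{n0,n3,n[5-8]}'" fails.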

Please note that the two new features listed above are EXPERIMENTAL and
have not been thoroughly tested.  Any issues should be reported via
the mailing list or warewulf.lbl.gov bug tracker.
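
For anyone curious what the passwd-then-getent fallback described in
the LDAP/NIS item amounts to, here is a rough shell sketch.  It is an
illustration only, not NHC's code: the function name lookup_uid and
its optional second argument (an alternate passwd-format file) are
hypothetical, and it only covers the name-to-UID direction.

```shell
# Hypothetical helper, not NHC's implementation.
# Usage: lookup_uid USERNAME [PASSWD_FILE]
# Prints the numeric UID on success; returns non-zero if not found.
lookup_uid() (
    user=$1
    passwd_file=${2:-/etc/passwd}
    # First try a direct scan of the passwd-format file.
    if [ -r "$passwd_file" ]; then
        while IFS=: read -r name _pw uid _rest; do
            if [ "$name" = "$user" ]; then
                printf '%s\n' "$uid"
                exit 0
            fi
        done < "$passwd_file"
    fi
    # Fall back to getent, which consults whatever sources nsswitch
    # is configured for (LDAP, NIS, and so on).
    command -v getent >/dev/null 2>&1 || exit 1
    entry=$(getent passwd "$user") || exit 1
    # The UID is the third colon-separated field of the entry.
    rest=${entry#*:}
    rest=${rest#*:}
    printf '%s\n' "${rest%%:*}"
)
```

Trying the local file first keeps the common case cheap and only
touches the name service when the file has no answer.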

Any/all feedback welcome!  Enjoy!  :-)

Michael

-- 
Michael Jennings <mej at lbl.gov>
Senior HPC Systems Engineer
High-Performance Computing Services
Lawrence Berkeley National Laboratory
Bldg 50B-3209E        W: 510-495-2687
MS 050B-3209          F: 510-486-8615

