[torqueusers] Warewulf NHC 1.2.2 Release
Matthew Britt
msbritt at umich.edu
Wed Jan 23 09:37:18 MST 2013
Thanks, Michael. I've upgraded from 1.2.1 to 1.2.2 and tested on three different nodes, and the average run time went from 0.45 seconds to 3.5-5 seconds with the same config. How do you enable debug mode, and does it report timing for each test so I can see where the time is going?
Thanks,
- Matt
--------------------------------------------
Matthew Britt
CAEN HPC Group - College of Engineering
msbritt at umich.edu
On Jan 22, 2013, at 8:23 PM, Michael Jennings <mej at lbl.gov> wrote:
> Due primarily to a couple of pretty important bug fixes, I've gone ahead
> and released version 1.2.2 of Warewulf NHC. I'm hoping to get at
> least one more release after this one out the door before MoabCon
> 2013. :-)
>
> Warewulf Node Health Check, for anyone not familiar with it, is an
> effort to create a framework and implementation for the node health
> check scripts often used by resource managers and schedulers as well
> as for periodic independent node sanity checks. More information and
> complete documentation may be found at:
>
> http://warewulf.lbl.gov/trac/wiki/Node%20Health%20Check
>
> If you want to skip right to the packages, here's the direct link to
> the tarball. RPMs for RHEL5 and RHEL6, as well as yum repositories
> for each, are available as well.
>
> http://warewulf.lbl.gov/downloads/releases/warewulf-nhc/warewulf-nhc-1.2.2.tar.gz
> http://warewulf.lbl.gov/downloads/repo/rhel5/
> http://warewulf.lbl.gov/downloads/repo/rhel6/
>
>
> CHANGES:
>
> - The watchdog timer is more reliable now and should work with all
>   supported versions of bash.
> - The signal handlers were cleaned up and a couple minor issues
>   fixed.
> - NEW CHECK: check_hw_mcelog() was implemented as requested and
>   discussed on the torqueusers mailing list.
> - NEW FEATURE: pdsh-style node range support has been added thanks
>   to a patch from John Hanks <john.hanks at usu.edu>. You can now
>   specify one or more comma-separated node range expressions for
>   matching nodes. The whole thing must be surrounded by braces to
>   avoid conflicts with globbing syntax. A couple examples:
>       {node[00-99]} || check_hw_cpuinfo 2 12 24
>       {n0,n3,n[5-8]} || check_hw_cpuinfo 2 8 8
>       {n[0-123].cchem} || check_ps_daemon sshd root
>   Note that you can't have commas *inside* the brackets, nor can you
>   have more than one bracketed range per subexpression.
> - NEW FEATURE: LDAP/NIS/NIS+/etc. support. If direct passwd file
>   lookup for a userid or UID fails, NHC will now fall back to using
>   "getent" to do the resolution. Parsing sources other than
>   /etc/passwd is also supported. See the online docs for more
>   details.
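To illustrate the node range rules quoted above (braces around the whole expression, comma-separated subexpressions, at most one bracketed range each, no commas inside brackets), here is a minimal Python sketch. It is not NHC's code: NHC is written in bash and may well test the hostname against the range instead of expanding it. The function names and the zero-padding rule (pad to the width of the lower bound when it has a leading zero) are assumptions made for this example.

```python
import re

# prefix[low-high]suffix with exactly one bracketed numeric range.
_RANGE_RE = re.compile(r"^([^\[\]]*)\[(\d+)-(\d+)\]([^\[\]]*)$")


def expand_subexpression(sub):
    """Expand one comma-free subexpression such as 'n[5-8]' or 'n0'."""
    if not sub:
        raise ValueError("empty subexpression")
    match = _RANGE_RE.match(sub)
    if match is None:
        # Brackets that do not form a single low-high range are rejected,
        # which also covers "more than one bracketed range".
        if "[" in sub or "]" in sub:
            raise ValueError("unsupported range syntax: %r" % sub)
        return [sub]  # plain hostname, nothing to expand
    prefix, low, high, suffix = match.groups()
    if int(high) < int(low):
        raise ValueError("descending range: %r" % sub)
    # Assumption: a leading zero on the lower bound requests zero padding.
    width = len(low) if len(low) > 1 and low.startswith("0") else 0
    return ["%s%0*d%s" % (prefix, width, n, suffix)
            for n in range(int(low), int(high) + 1)]


def expand_node_ranges(expr):
    """Expand a brace-wrapped, comma-separated node range expression."""
    if not (expr.startswith("{") and expr.endswith("}")):
        raise ValueError("expression must be surrounded by braces")
    names = []
    # Commas are not allowed inside brackets, so a plain split is enough.
    for sub in expr[1:-1].split(","):
        names.extend(expand_subexpression(sub.strip()))
    return names


def node_matches(expr, hostname):
    """Return True if hostname is one of the nodes named by expr."""
    return hostname in expand_node_ranges(expr)


if __name__ == "__main__":
    print(expand_node_ranges("{n0,n3,n[5-8]}"))
    # ['n0', 'n3', 'n5', 'n6', 'n7', 'n8']
    print(node_matches("{n[0-123].cchem}", "n42.cchem"))  # True
```

Under these assumptions, {node[00-99]} names node00 through node99, and an expression such as {n[1,3]} is rejected because the comma splits it into two malformed subexpressions.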
>
> Please note that the 2 new features listed above are EXPERIMENTAL and
> have not been thoroughly tested. Any issues should be reported via
> the mailing list or warewulf.lbl.gov bug tracker.
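The getent fallback described in the changes above can be pictured roughly as follows. This is a hedged Python sketch of the general idea only (look in the passwd file first, then ask getent), not NHC's bash implementation. The function names, the dictionary layout, and the injectable getent parameter are assumptions made for the example. "getent passwd KEY" is the standard glibc command and accepts either a user name or a numeric UID.

```python
import subprocess


def parse_passwd_line(line):
    """Parse one passwd(5)-style line into a dict, or return None."""
    fields = line.rstrip("\n").split(":")
    if len(fields) < 7:
        return None  # malformed or non-passwd line
    return {"name": fields[0], "uid": fields[2], "gid": fields[3],
            "home": fields[5], "shell": fields[6]}


def lookup_in_file(key, path):
    """Search a passwd-format file for a user name or UID."""
    key = str(key)
    try:
        with open(path) as handle:
            for line in handle:
                entry = parse_passwd_line(line)
                if entry and key in (entry["name"], entry["uid"]):
                    return entry
    except OSError:
        pass  # unreadable file: treat as "not found"
    return None


def run_getent(key):
    """Ask getent for a passwd entry; return the raw line or None."""
    try:
        result = subprocess.run(["getent", "passwd", str(key)],
                                capture_output=True, text=True)
    except FileNotFoundError:
        return None  # getent is not installed on this system
    if result.returncode != 0 or not result.stdout:
        return None
    return result.stdout.splitlines()[0]


def resolve_user(key, passwd_path="/etc/passwd", getent=run_getent):
    """Direct file lookup first, then fall back to getent."""
    entry = lookup_in_file(key, passwd_path)
    if entry is not None:
        return entry
    line = getent(key)
    return parse_passwd_line(line) if line else None
```

Calling resolve_user("root") on a typical Linux node would normally be answered from /etc/passwd, while an LDAP- or NIS-only account would only be found through the getent step. Passing a different passwd_path mirrors the "sources other than /etc/passwd" remark, though how NHC configures that is covered in its own docs, not here.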
>
> Any/all feedback welcome! Enjoy! :-)
>
> Michael
>
> --
> Michael Jennings <mej at lbl.gov>
> Senior HPC Systems Engineer
> High-Performance Computing Services
> Lawrence Berkeley National Laboratory
> Bldg 50B-3209E W: 510-495-2687
> MS 050B-3209 F: 510-486-8615
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers