[torqueusers] Re: pbs_mom caches last healthcheck script error ?
(Re: [Moabusers] Moab keeps on trying after pbs_mom rejects.)
csamuel at vpac.org
Mon Dec 4 16:38:46 MST 2006
On Tuesday 05 December 2006 10:28, Garrick Staples wrote:
> Then your health check script is returning the error.
But it isn't, that's why we're bemused - we can run the script until we're
blue in the face and it doesn't return anything at all!
The other nodes in the cluster are all fine, and they're running the same
Hmm, hang on a tic..
ARGH! %^!&#r^(*#%^ Fedora.
The two "special" nodes that have this problem are running FC6 (for hardware
reasons), the rest are running FC5.
If you run the script as root then you get the above (fine) response.
If you run the script as a normal user you get a message saying it can't
find lspci, so the script was generating the error because the grep for the
characteristic that said the card was in 64-bit mode wasn't matching.
For some reason this is only happening on the FC6 nodes, no idea why..
Brett's fixed his script to have full paths to the commands and they've come
back online quite happily!
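
Brett's actual script isn't shown in this thread, so what follows is only a
rough sketch of the same idea: resolve lspci from a fixed list of directories
instead of trusting whatever PATH the script inherits, and report a missing
command separately from a failed match. The directory list, the function names
and the "64bit+" marker are all assumptions made for illustration. The sketch
also assumes that pbs_mom treats health check output beginning with ERROR as a
failure.

```python
import os
import subprocess

# Hypothetical fixed search list; the inherited PATH is deliberately ignored.
SEARCH_DIRS = ("/sbin", "/usr/sbin", "/bin", "/usr/bin")


def resolve_command(name, search_dirs=SEARCH_DIRS):
    """Return the first absolute path to an executable `name`, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def health_message(lspci_path, marker, run=subprocess.run):
    """Return "" when healthy, otherwise a line starting with ERROR."""
    if lspci_path is None:
        # A missing binary is reported as such, not as a bad card.
        return "ERROR lspci not found in fixed search directories"
    result = run([lspci_path, "-vv"], capture_output=True, text=True)
    if result.returncode != 0:
        return "ERROR lspci exited with status %d" % result.returncode
    if marker not in result.stdout:
        return "ERROR marker %r not found in lspci output" % marker
    return ""


if __name__ == "__main__":
    # "64bit+" is a placeholder; use whatever string the real check greps for.
    message = health_message(resolve_command("lspci"), "64bit+")
    if message:
        print(message)
```

Keeping "command not found" apart from "marker not found" means a PATH
difference between root and an ordinary user shows up as exactly that, rather
than looking like a hardware fault.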
Open mouth, remove foot..
Sorry about this Garrick!
Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia