[Mauiusers] Re: maui & old gcc optimizer bug.

Garrick Staples garrick at usc.edu
Sat Nov 19 14:01:07 MST 2005


On Wed, Nov 16, 2005 at 10:10:02AM -0500, Chris Johnson alleged:
>      One other thing, probably related, maui keeps crashing and the
> last line in the log is
> 
> ERROR:    cannot get node info: Unknown Job Id

This points to a TORQUE bug that we believe is fixed in 2.0.0p0 (grab
the latest 2.0.0 snapshot from Nov 11).  We were never able to reliably
reproduce it, but I haven't seen it happen since the fix.  The problem
was that a timeout wasn't properly detected, which left stale data in a
buffer; that stale data was then incorrectly returned by the next stat.
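
To make the failure mode concrete, here is a toy sketch of that class of
bug.  It is not TORQUE code: ToyConnection, its pending queue, and the
discard_on_timeout flag are all hypothetical names invented for
illustration.  It only assumes that a reply which arrives after the
caller has given up stays queued unless the timeout path throws it away.

```python
from collections import deque


class ToyConnection:
    """Hypothetical stand-in for a client connection with a reply buffer."""

    def __init__(self, discard_on_timeout):
        self.discard_on_timeout = discard_on_timeout
        self.pending = deque()  # replies that arrived but were never read

    def stat(self, kind, timed_out=False):
        reply = f"{kind} reply"
        if timed_out:
            # The caller gives up, but the reply still arrives afterwards.
            # Unless the timeout is handled, it stays behind in the buffer.
            if not self.discard_on_timeout:
                self.pending.append(reply)
            return None
        # Normal case: the reply arrives and the caller reads the oldest
        # unread reply, which may be a leftover from an earlier request.
        self.pending.append(reply)
        return self.pending.popleft()


buggy = ToyConnection(discard_on_timeout=False)
buggy.stat("jobstat", timed_out=True)
print(buggy.stat("nodestat"))   # prints "jobstat reply" (stale data)

fixed = ToyConnection(discard_on_timeout=True)
fixed.stat("jobstat", timed_out=True)
print(fixed.stat("nodestat"))   # prints "nodestat reply"
```

In the buggy case the node query is answered with job data, which is the
kind of mismatch that could surface as an "Unknown Job Id" error on a
node stat.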

In the log lines just before your error message, is there a long gap
between timestamps?  A gap would indicate that a jobstat() timed out
and that the following nodestat() returned the previous jobstat
information.
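
If the log is large, a small script can look for such gaps.  The sketch
below assumes each timestamped log line starts with "MM/DD HH:MM:SS";
adjust the pattern if your log differs.  The 60-second threshold and the
placeholder year are arbitrary choices of mine, and find_gaps and
parse_stamp are hypothetical helper names, not part of maui or TORQUE.

```python
import re
from datetime import datetime

# Assumed line prefix: "MM/DD HH:MM:SS" (an assumption about the log format).
STAMP = re.compile(r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2})\b")


def parse_stamp(line, year=2004):
    """Return the line's timestamp, or None if the line has no stamp.

    The log carries no year, so a placeholder leap year is used.
    """
    m = STAMP.match(line)
    if not m:
        return None
    return datetime.strptime(f"{year} {m.group(1)}", "%Y %m/%d %H:%M:%S")


def find_gaps(lines, min_gap_seconds=60):
    """Return (seconds, line_before, line_after) for each long gap."""
    gaps = []
    prev_time = prev_line = None
    for line in lines:
        t = parse_stamp(line)
        if t is None:
            continue  # skip lines without a timestamp, e.g. bare ERROR lines
        if prev_time is not None:
            delta = (t - prev_time).total_seconds()
            if delta >= min_gap_seconds:
                gaps.append((delta, prev_line.rstrip(), line.rstrip()))
        prev_time, prev_line = t, line
    return gaps
```

A file object can be passed directly, since it iterates over lines:
open the maui log, call find_gaps() on it, and check whether any
reported gap ends just before the "cannot get node info" error.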

To pick up the fix, you'd need to install the 2.0.0 pbs_server and
client libs, and then rebuild maui against the new client libs.

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California