[Mauiusers] Re: maui & old gcc optimizer bug.
garrick at usc.edu
Sat Nov 19 14:01:07 MST 2005
On Wed, Nov 16, 2005 at 10:10:02AM -0500, Chris Johnson alleged:
> One other thing, probably related, maui keeps crashing and the
> last line in the log is
> ERROR: cannot get node info: Unknown Job Id
This points to a bug in TORQUE that we think we fixed in 2.0.0p0 (grab
the latest 2.0.0 snapshot, from Nov 11). We were never able to reliably
reproduce it, but I haven't seen it happen since. The problem was that
a timeout wasn't properly detected, which left stale data in a buffer;
the stale data was then incorrectly retrieved on the next stat.
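
To illustrate that class of bug, here is a toy simulation. It is not
TORQUE code: ToyConnection, stat() and the flush_stale flag are
hypothetical names made up for this sketch, and clearing the buffer
before each request is just one plausible remedy, not necessarily what
the 2.0.0 fix does.

```python
from collections import deque


class ToyConnection:
    """Hypothetical stand-in for a client/server connection (not TORQUE code)."""

    def __init__(self):
        # Replies that have arrived but have not been read yet.
        self.buffer = deque()

    def deliver(self, reply):
        # Simulates a reply from the server arriving, possibly late.
        self.buffer.append(reply)

    def read_reply(self):
        # Returns the oldest buffered reply, or None to signal a timeout.
        return self.buffer.popleft() if self.buffer else None


def stat(conn, kind, payload, late=False, flush_stale=False):
    """Issue one request of the given kind and read back one reply.

    late=True simulates a timeout: the caller gives up before the reply
    arrives, and the reply lands in the buffer afterwards.
    flush_stale=True drops leftover replies before sending the request.
    """
    if flush_stale:
        conn.buffer.clear()
    reply = (kind, payload)
    if late:
        result = conn.read_reply()  # nothing buffered yet, so this is None
        conn.deliver(reply)         # the reply shows up after we gave up
        return result
    conn.deliver(reply)
    return conn.read_reply()


if __name__ == "__main__":
    buggy = ToyConnection()
    print(stat(buggy, "job", "job 42", late=True))   # timed out
    print(stat(buggy, "node", "node n01"))           # gets the stale job reply

    fixed = ToyConnection()
    print(stat(fixed, "job", "job 42", late=True))
    print(stat(fixed, "node", "node n01", flush_stale=True))
```

In the first run the node request is answered with the leftover job
reply, which is the same shape as a nodestat() failing on a job id.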
In the lines prior to your error message, is there a long gap in time?
A long gap would indicate that a jobstat() timed out and that the
following nodestat() returned the previous jobstat information.
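
If the log is large, a small script can look for such gaps. This is
only a sketch: it assumes every log line starts with a "MM/DD HH:MM:SS"
timestamp, which you should check against your own log, and the
find_gaps() name, the year argument and the 30-second threshold are
arbitrary choices made for the example.

```python
import re
from datetime import datetime

# Assumed line prefix: "MM/DD HH:MM:SS". Adjust if your log differs.
STAMP = re.compile(r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2})\b")


def find_gaps(lines, year=2005, min_gap_seconds=30):
    """Return (gap_seconds, previous_line, line) for each long pause.

    The year is supplied by the caller because the assumed timestamp
    format does not include one. Lines without a timestamp are skipped.
    """
    gaps = []
    prev_time = prev_line = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue
        stamp = datetime.strptime(f"{year} {match.group(1)}",
                                  "%Y %m/%d %H:%M:%S")
        if prev_time is not None:
            delta = (stamp - prev_time).total_seconds()
            if delta >= min_gap_seconds:
                gaps.append((delta, prev_line, line))
        prev_time, prev_line = stamp, line
    return gaps


if __name__ == "__main__":
    # Made-up sample lines; only the error text comes from the report.
    sample = [
        "11/16 10:05:00 hypothetical log line A",
        "11/16 10:05:01 hypothetical log line B",
        "11/16 10:10:02 ERROR: cannot get node info: Unknown Job Id",
    ]
    for gap, before, after in find_gaps(sample):
        print(f"{gap:.0f}s gap between:\n  {before}\n  {after}")
```

A gap that ends right at the error line is the pattern to look for.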
You'd need to install the 2.0.0 pbs_server and client libs, and rebuild
maui with the new client libs.
Garrick Staples, Linux/HPCC Administrator
University of Southern California