[torquedev] Latest snapshot won't start - "unable to read tracksize from tracking file"
garrick at usc.edu
Fri Jan 13 00:03:23 MST 2006
On Thu, Jan 12, 2006 at 11:27:41PM -0700, Dave Jackson alleged:
> On one of our lab clusters, we checked, and while the tracking file
> exists, it has been empty for some time (perhaps always?). I think this
> file is optional, but a new check was exiting if it did not contain a set
> number of records.
> The new snap will check and bark a message to the log but will continue
> to execute even if there are tracking file failures.
Ah, I see, it is inflating the total tracksize up to PBS_TRACK_MINSIZE
(which is 100). So a short read is quite reasonable in that case. And
since the struct array was set up with calloc(), there is no
concern of uninitialized data.
But it is still silly. It should only attempt to read the proper amount
of data. I'll fix it up tomorrow.
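For illustration only, here is a rough sketch of the idea and not the actual TORQUE code: allocate at least the minimum number of zeroed slots, but ask fread() for only the records the file really holds, so an empty or short file is not an error. The names track_rec, load_tracking and TRACK_MINSIZE are made up for this example (TRACK_MINSIZE stands in for PBS_TRACK_MINSIZE), and the record layout is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for PBS_TRACK_MINSIZE, which the thread says is 100. */
#define TRACK_MINSIZE 100

/* Hypothetical record layout; the real tracking struct is different. */
struct track_rec {
    long mtime;
    char jobid[64];
    char location[64];
};

/*
 * Load a tracking file. Allocates at least TRACK_MINSIZE zeroed slots
 * but reads only the whole records actually present in the file.
 * Returns the array (caller frees) or NULL on error. On success,
 * *n_read is the number of records read and *n_slots the number of
 * slots allocated.
 */
struct track_rec *load_tracking(const char *path, size_t *n_read,
                                size_t *n_slots)
{
    assert(path != NULL && n_read != NULL && n_slots != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Find out how many whole records are on disk. */
    if (fseek(fp, 0L, SEEK_END) != 0) {
        fclose(fp);
        return NULL;
    }
    long bytes = ftell(fp);
    if (bytes < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    size_t on_disk = (size_t)bytes / sizeof(struct track_rec);
    size_t slots = on_disk < TRACK_MINSIZE ? TRACK_MINSIZE : on_disk;

    /* calloc() zeroes every slot, so unused ones are well defined. */
    struct track_rec *recs = calloc(slots, sizeof *recs);
    if (recs == NULL) {
        fclose(fp);
        return NULL;
    }

    /* Read only what exists; an empty file simply yields zero records. */
    size_t got = 0;
    if (on_disk > 0)
        got = fread(recs, sizeof *recs, on_disk, fp);
    fclose(fp);

    if (got != on_disk) {   /* a genuine short read is still an error */
        free(recs);
        return NULL;
    }

    *n_read = got;
    *n_slots = slots;
    return recs;
}
```

With this shape, an existing but empty tracking file gives back 100 zeroed slots and a record count of 0 instead of a startup failure.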
My test and production servers all have that file existing at the proper
size. I had figured it was always created, but I see now in req_track.c
that it is only written when a job is routed.
Chriss, sorry for the inconvenience, but I'm glad you are testing snaps :)
Garrick Staples, Linux/HPCC Administrator
University of Southern California