[torquedev] Latest snapshot won't start - "unable to read tracksize from tracking file"

Garrick Staples garrick at usc.edu
Fri Jan 13 00:03:23 MST 2006


On Thu, Jan 12, 2006 at 11:27:41PM -0700, Dave Jackson alleged:
> Chris,
> 
>   On one of our lab clusters, we checked and while the tracking file
> exists, it has been empty for some time (perhaps always?).  I think this
> file is optional, but a new check was exiting if it did not contain a set
> number of records.
> 
>   The new snap will check and bark a message to the log but will continue
> to execute even if there are tracking file failures.

Ah, I see, it is inflating the total tracksize up to PBS_TRACK_MINSIZE
(which is 100).  So a short read is quite reasonable in that case.  And
since the struct array was set up with calloc(), there is no
concern of uninitialized data.

But it is still silly.  It should only attempt to read the proper amount
of data.  I'll fix it up tomorrow.
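
To illustrate what I mean, here is a rough sketch, not the actual server
code.  The struct layout, the TRACK_MINSIZE macro, and load_track() are
made up for the example; only the value 100 and the use of calloc() come
from the discussion above.  The idea is to allocate at least the minimum
number of zeroed slots, but ask read() only for the bytes that are
really in the file, so an empty or short file is not an error.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#define TRACK_MINSIZE 100   /* stand-in for PBS_TRACK_MINSIZE (assumed name) */

/* Hypothetical record layout; the real tracking struct differs. */
struct track_rec {
    char jobid[64];
    int  hopcount;
};

/*
 * Load tracking records from an open file descriptor.
 * Allocates max(records on disk, TRACK_MINSIZE) zeroed slots, but reads
 * only the whole records the file actually holds.
 * On success returns the array (caller frees), stores the number of
 * records read in *nrecs and the number of allocated slots in *nslots.
 * Returns NULL if fstat(), lseek() or calloc() fails.
 */
struct track_rec *load_track(int fd, size_t *nrecs, size_t *nslots)
{
    struct stat sb;
    assert(nrecs != NULL && nslots != NULL);

    if (fstat(fd, &sb) != 0 || lseek(fd, 0, SEEK_SET) < 0)
        return NULL;

    /* Whole records present on disk; a trailing partial record is ignored. */
    size_t on_disk = (size_t)sb.st_size / sizeof(struct track_rec);
    size_t slots = on_disk < TRACK_MINSIZE ? TRACK_MINSIZE : on_disk;

    /* calloc() zeroes every slot, so unread slots are never garbage. */
    struct track_rec *arr = calloc(slots, sizeof(*arr));
    if (arr == NULL)
        return NULL;

    size_t want = on_disk * sizeof(*arr);
    size_t got = 0;
    while (got < want) {
        ssize_t n = read(fd, (char *)arr + got, want - got);
        if (n <= 0)
            break;              /* EOF or error: stop, keep what we have */
        got += (size_t)n;
    }

    /* Complain but carry on, rather than exiting on a short read. */
    if (got != want)
        fprintf(stderr, "tracking file: wanted %zu bytes, got %zu\n",
                want, got);

    *nrecs = got / sizeof(*arr);
    *nslots = slots;
    return arr;
}
```

With an empty file this returns 100 zeroed slots and reports 0 records,
which is the case Chris ran into.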

My test and production servers all have that file existing at the proper
size.  I had figured it was always created, but I see now in req_track.c
that it is only written when a job is routed.

Chris, sorry for the inconvenience, but I'm glad you are testing snaps :)

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California