[torqueusers] Requeued jobs: accounting records?

Bill Wichser bill at Princeton.EDU
Tue Apr 27 08:31:47 MDT 2010


Glen Beane wrote:
> 
> 
> On Tue, Apr 27, 2010 at 8:41 AM, Bill Wichser <bill at princeton.edu> wrote:
> 
>     Well, looks like I'll answer my own question with what we have done here
>     in case anyone else cares.
> 
>     In ./server/req_jobobit.c, before the accounting info is cleared, we
>     have added the code to output an "E" record in the accounting file just
>     as if the job had ended.  The accounting in the job data structure then
>     gets cleared and requeued just as in the stock code.  Since we do
>     accounting on the key jobid-startdate, this allows us to add up all the
>     info for each invocation of the job but more importantly, allows us to
>     account for what appeared to be many lost cycles.
> 
> 
> 
> could you submit this change as a patch to bugzilla?
> http://www.clusterresources.com/bugzilla

I can, once we have tested further. It goes onto the production machine 
on Monday, or this week if I'm brave enough!

I do use Ole Nielsen's PBS accounting scripts, modified a bit, for our 
other monthly accounting.  I believe they may break with this method, 
since they read only the "E" records and would now see more than one 
per job.  So a side effect is that anyone doing accounting this way may 
have to modify their scripts.
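
To make the jobid-startdate keying concrete, here is a rough Python sketch 
of how a script could add up walltime across every invocation of a 
requeued job.  It is not our production code and not part of Ole's 
scripts.  It assumes the usual accounting line layout of 
"timestamp;type;jobid;key=value ..." and that each "E" record carries 
start= and resources_used.walltime= fields; the function names and the 
sample lines are made up for illustration.

```python
from collections import defaultdict


def parse_record(line):
    """Split one accounting line into (record_type, jobid, attrs).

    Assumed layout: "MM/DD/YYYY HH:MM:SS;TYPE;JOBID;key=value key=value ..."
    """
    _timestamp, rtype, jobid, message = line.rstrip("\n").split(";", 3)
    attrs = {}
    for field in message.split():
        key, sep, value = field.partition("=")
        if sep:
            attrs[key] = value
    return rtype, jobid, attrs


def hms_to_seconds(hms):
    """Convert an "HH:MM:SS" walltime string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def walltime_by_job(lines):
    """Sum walltime per jobid over all "E" records.

    Records are keyed on (jobid, start) so that each invocation of a
    requeued job is counted once, even if the same line is read twice.
    """
    runs = {}
    for line in lines:
        if line.count(";") < 3:
            continue  # not an accounting record
        rtype, jobid, attrs = parse_record(line)
        if rtype != "E":
            continue
        start = attrs.get("start")
        wall = attrs.get("resources_used.walltime")
        if start is None or wall is None:
            continue
        runs[(jobid, start)] = hms_to_seconds(wall)

    totals = defaultdict(int)
    for (jobid, _start), seconds in runs.items():
        totals[jobid] += seconds
    return dict(totals)


# Invented sample data: job 123 runs for an hour, is requeued, then runs
# for two more hours.  The second line is a duplicate of the first.
SAMPLE = [
    "04/26/2010 11:00:00;E;123.server;user=alice start=1272290400 resources_used.walltime=01:00:00",
    "04/26/2010 11:00:00;E;123.server;user=alice start=1272290400 resources_used.walltime=01:00:00",
    "04/26/2010 11:00:05;R;123.server;",
    "04/26/2010 14:30:00;E;123.server;user=alice start=1272295800 resources_used.walltime=02:00:00",
]

if __name__ == "__main__":
    print(walltime_by_job(SAMPLE))
```

A script that instead assumes one "E" record per jobid would keep only 
one of the two invocations here, which is the breakage I'm worried about.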

A better method would probably be to add an entirely new record type, 
or simply to reuse the "R" records and append this info to them.  That 
way I'd be confident that things like tracejob wouldn't fail.

But I'll keep it on the list of things which need to be done and would 
gladly contribute any changes, since I'd rather not maintain our own 
version anyway!

Bill


