[torqueusers] Requeued jobs: accounting records?

Bill Wichser bill at Princeton.EDU
Tue Apr 27 06:41:56 MDT 2010


Well, looks like I'll answer my own question with what we have done here 
in case anyone else cares.

In ./server/req_jobobit.c, before the accounting info is cleared, we 
have added code to output an "E" record in the accounting file, just 
as if the job had ended.  The accounting info in the job data structure 
then gets cleared and the job is requeued, just as in the stock code.  
Since we do accounting on the key jobid-startdate, this lets us add up 
the usage for each invocation of the job and, more importantly, account 
for what appeared to be many lost cycles.
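
For anyone wanting to do similar post-processing, here is a rough Python 
sketch of summing usage per job across invocations, keyed on 
(jobid, start).  It is not our actual script.  It assumes the usual 
accounting line layout "date time;type;jobid;key=value ..." and that 
values contain no spaces; the sample records and the names 
walltime_by_job, parse_end_records and hms_to_seconds are made up for 
illustration.

```python
from collections import defaultdict

# Hypothetical sample records; job ids, users and times are invented.
SAMPLE = """\
04/26/2010 10:00:00;E;101.head;user=alice start=1272290400 end=1272294000 resources_used.walltime=01:00:00
04/26/2010 12:30:00;E;101.head;user=alice start=1272297600 end=1272303000 resources_used.walltime=01:30:00
04/26/2010 11:00:00;E;102.head;user=bob start=1272294000 end=1272297600 resources_used.walltime=01:00:00
04/26/2010 11:00:00;S;103.head;user=bob start=1272294000
"""


def hms_to_seconds(text):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_end_records(lines):
    """Yield (jobid, fields) for every "E" record, skipping other types."""
    for line in lines:
        parts = line.strip().split(";", 3)
        if len(parts) != 4 or parts[1] != "E":
            continue
        # Assumption: key=value pairs are space separated, values have no spaces.
        fields = dict(item.split("=", 1) for item in parts[3].split() if "=" in item)
        yield parts[2], fields


def walltime_by_job(lines):
    """Sum walltime per job over all its invocations."""
    per_run = {}
    for jobid, fields in parse_end_records(lines):
        if "start" not in fields or "resources_used.walltime" not in fields:
            continue
        # Keying on (jobid, start) keeps each invocation separate and
        # makes a duplicated record for the same run count only once.
        per_run[(jobid, fields["start"])] = hms_to_seconds(
            fields["resources_used.walltime"]
        )
    totals = defaultdict(int)
    for (jobid, _start), seconds in per_run.items():
        totals[jobid] += seconds
    return dict(totals)


if __name__ == "__main__":
    for jobid, seconds in sorted(walltime_by_job(SAMPLE.splitlines()).items()):
        print(jobid, seconds)
```

With the extra "E" record written at requeue time, a preempted job simply 
shows up as several runs under the same jobid, so the per-job total 
includes the cycles used before each preemption.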

Bill

Bill Wichser wrote:
> Although the instantaneous usage of our cluster shows 90-100% 
> utilization at any point in time, when processing accounting records 
> from Torque, we calculate usage around only 50%.
> 
> This happens because we are actively using preemption on a number of 
> classes.  When a job gets preempted, which happens frequently, all 
> accounting information is lost when the job is requeued.  If the user 
> has requested that the job not be requeued, then an E record is 
> generated and the system usage is recorded.
> 
> It would be beneficial to us to have an accounting record generated when 
> a job is requeued.  This would let us both find the "cost" of preemption 
> and account for total system usage in line with the instantaneous usage 
> we observe.
> 
> So my first question is: has someone already added this feature?  If so, 
> can this be rolled into a release?  And if not, I guess that we're on 
> our own here and will investigate adding this ourselves.
> 
> Thanks,
> Bill