[torqueusers] Requeued jobs: accounting records?
Bill Wichser
bill at Princeton.EDU
Tue Apr 27 06:41:56 MDT 2010
Well, looks like I'll answer my own question with what we have done here
in case anyone else cares.
In ./server/req_jobobit.c, before the accounting info is cleared, we
have added code to write an "E" record to the accounting file, just
as if the job had ended. The accounting info in the job data structure
is then cleared and the job requeued, just as in the stock code. Since
we key our accounting on jobid-startdate, this lets us add up the usage
for each invocation of a job and, more importantly, account for what
appeared to be many lost cycles.
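
For anyone who wants to do the same kind of summing, here is a minimal
Python sketch of the jobid-startdate aggregation. It is not our actual
script. It assumes the usual accounting line layout of
"timestamp;type;jobid;key=value ..." and that each "E" record carries
start= and resources_used.walltime= fields with no spaces inside values.
The function names are made up for illustration.

```python
from collections import defaultdict


def parse_record(line):
    """Split one accounting line into (record_type, job_id, attrs).

    Assumes the layout "timestamp;type;jobid;key=value key=value ...".
    """
    _timestamp, rec_type, job_id, message = line.rstrip("\n").split(";", 3)
    attrs = {}
    for token in message.split():
        if "=" in token:
            key, value = token.split("=", 1)
            attrs[key] = value
    return rec_type, job_id, attrs


def hms_to_seconds(text):
    """Convert "HH:MM:SS" (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def walltime_by_invocation(lines):
    """Map (job_id, start) -> walltime seconds, using "E" records only.

    Keying on (job_id, start) means a record seen twice is stored once,
    while each requeued run of the same job gets its own entry.
    """
    totals = {}
    for line in lines:
        if line.count(";") < 3:
            continue  # not an accounting record in the assumed layout
        rec_type, job_id, attrs = parse_record(line)
        if rec_type != "E" or "start" not in attrs:
            continue
        walltime = attrs.get("resources_used.walltime", "00:00:00")
        totals[(job_id, attrs["start"])] = hms_to_seconds(walltime)
    return totals


def walltime_by_job(totals):
    """Sum the per-invocation walltime into one figure per job id."""
    per_job = defaultdict(int)
    for (job_id, _start), seconds in totals.items():
        per_job[job_id] += seconds
    return dict(per_job)
```

Feeding the lines of an accounting file to walltime_by_invocation() and
then passing the result to walltime_by_job() gives one total per job
that includes the time used before each preemption.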
Bill
Bill Wichser wrote:
> Although the instantaneous usage of our cluster shows 90-100%
> utilization at any point in time, when processing accounting records
> from Torque, we calculate usage around only 50%.
>
> This is occurring because we are actively using preemption
> on a number of classes. When a job gets preempted, which happens
> frequently, all accounting information is lost when the job is requeued.
> If the user has requested not to be requeued, then an E record is
> generated and the system usage is recorded.
>
> It would be beneficial to us to have an accounting record generated when
> a job is requeued. This would let us both find the "cost" of preemption
> and account for total system usage, matching what we see from the
> instantaneous usage.
>
> So my first question is: has someone already added this feature? If so,
> can this be rolled into a release? And if not, I guess that we're on
> our own here and will investigate adding this ourselves.
>
> Thanks,
> Bill
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers