[torqueusers] 'A' accounting record marker?

Ken Nielson knielson at adaptivecomputing.com
Tue Aug 23 15:46:31 MDT 2011


----- Original Message -----
> From: "Ken Nielson" <knielson at adaptivecomputing.com>
> To: "Torque Users Mailing List" <torqueusers at supercluster.org>
> Sent: Tuesday, August 23, 2011 3:23:36 PM
> Subject: Re: [torqueusers] 'A' accounting record marker?
> It looks like we have a documentation deficiency. I have created an
> internal ticket to fix the doc to include the other accounting
> abbreviations.
> 
> Ken
> 
> ----- Original Message -----
> > From: "Christopher Samuel" <samuel at unimelb.edu.au>
> > To: torqueusers at supercluster.org
> > Sent: Monday, August 22, 2011 9:07:40 PM
> > Subject: Re: [torqueusers] 'A' accounting record marker?
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > On 23/08/11 04:40, Kenneth Yoshimoto wrote:
> >
> > >   I am seeing record marker 'A' in a Torque accounting log.
> > > How should that be interpreted? I don't see it described here:
> > > http://www.clusterresources.com/torquedocs21/9.1accounting.shtml
> >
> > My guess is that it's defined in src/include/acct.h:
> >
> > #define PBS_ACCT_ABT (int)'A' /* Job Abort by server */
> >
> > It's logged in job_abt() in src/server/job_func.c. The
> > comment for that function in trunk in SVN says:
> >
> > /*
> > * job_abt - abort a job
> > *
> > * The job is removed from the system and a mail message is sent
> > * to the job owner.
> > */
> >
> > /* NOTE: this routine is called under the following conditions:
> > * 1) by req_deletejob whenever deleting a job that is not running,
> > * not transiting, not exiting and does not have a checkpoint
> > * file on the mom.
> > * 2) by req_deletearray whenever deleting a job that is not running,
> > * not transiting, not in prerun, not exiting and does not have a
> > * checkpoint file on the mom.
> > * 3) by close_quejob when the server fails to enqueue the job.
> > * 4) by array_delete_wt for prerun jobs that hang around too long and
> > * do not have a checkpoint file on the mom.
> > * 5) by pbsd_init when recovering jobs.
> > * 6) by svr_movejob when done routing jobs around.
> > * 7) by queue_route when trying to route any "ready" jobs in a specific
> > * queue.
> > * 8) by req_shutdown when trying to shut down.
> > * 9) by req_register when the request operation is
> > * JOB_DEPEND_OP_DELETE.
> > */
> >
> >

Three other accounting markers are also missing from the doc:

#define PBS_ACCT_RERUN (int)'R' /* Job Rerun record */
#define PBS_ACCT_CHKPNT (int)'C' /* Job Checkpointed and held */
#define PBS_ACCT_RESTRT (int)'T' /* Job resTart (from chkpnt) record */
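
For anyone post-processing accounting logs, here is a minimal sketch of how
a script might interpret these markers. It covers only the four values
quoted in this thread. The helper names acct_marker_describe() and
acct_line_marker() are hypothetical and not part of Torque, and the
"date time;marker;job id;message" line layout is an assumption you should
check against your own logs.

```c
#include <stddef.h>
#include <string.h>

/* Marker values as quoted in this thread from src/include/acct.h. */
#define PBS_ACCT_ABT    (int)'A' /* Job Abort by server */
#define PBS_ACCT_RERUN  (int)'R' /* Job Rerun record */
#define PBS_ACCT_CHKPNT (int)'C' /* Job Checkpointed and held */
#define PBS_ACCT_RESTRT (int)'T' /* Job resTart (from chkpnt) record */

/* Hypothetical helper (not a Torque function): return a short description
 * for one of the four markers above, or NULL for any other value. */
const char *acct_marker_describe(int marker)
{
    switch (marker) {
    case PBS_ACCT_ABT:    return "job aborted by server";
    case PBS_ACCT_RERUN:  return "job rerun";
    case PBS_ACCT_CHKPNT: return "job checkpointed and held";
    case PBS_ACCT_RESTRT: return "job restarted from checkpoint";
    default:              return NULL;
    }
}

/* Hypothetical helper: extract the marker from a log line, ASSUMING the
 * layout "date time;X;job_id;message" where X is a single character.
 * Returns -1 if the line does not match that assumption. */
int acct_line_marker(const char *line)
{
    const char *semi = strchr(line, ';');

    /* Need exactly one character between the first two semicolons. */
    if (semi == NULL || semi[1] == '\0' || semi[2] != ';')
        return -1;
    return (unsigned char)semi[1];
}
```

A caller would pass each log line to acct_line_marker() and hand the result
to acct_marker_describe(). A NULL return just means the marker is not one
of the four listed here, not that the record is invalid.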

Regards

Ken

