[torqueusers] 'A' accounting record marker?

Kenneth Yoshimoto kenneth at sdsc.edu
Tue Aug 23 17:37:58 MDT 2011


Thanks for all the info.

Kenneth

On Tue, 23 Aug 2011, Ken Nielson wrote:

> Date: Tue, 23 Aug 2011 15:46:31 -0600 (MDT)
> From: Ken Nielson <knielson at adaptivecomputing.com>
> Reply-To: Torque Users Mailing List <torqueusers at supercluster.org>
> To: Torque Users Mailing List <torqueusers at supercluster.org>
> Subject: Re: [torqueusers] 'A' accounting record marker?
> 
> ----- Original Message -----
>> From: "Ken Nielson" <knielson at adaptivecomputing.com>
>> To: "Torque Users Mailing List" <torqueusers at supercluster.org>
>> Sent: Tuesday, August 23, 2011 3:23:36 PM
>> Subject: Re: [torqueusers] 'A' accounting record marker?
>> It looks like we have a documentation deficiency. I have created an
>> internal ticket to fix the doc to include the other accounting
>> abbreviations.
>>
>> Ken
>>
>> ----- Original Message -----
>>> From: "Christopher Samuel" <samuel at unimelb.edu.au>
>>> To: torqueusers at supercluster.org
>>> Sent: Monday, August 22, 2011 9:07:40 PM
>>> Subject: Re: [torqueusers] 'A' accounting record marker?
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> On 23/08/11 04:40, Kenneth Yoshimoto wrote:
>>>
>>>>   I am seeing record marker 'A' in a Torque accounting log.
>>>> How should that be interpreted? I don't see it described here:
>>>> http://www.clusterresources.com/torquedocs21/9.1accounting.shtml
>>>
>>> My guess is that it's defined in src/include/acct.h:
>>>
>>> #define PBS_ACCT_ABT (int)'A' /* Job Abort by server */
>>>
>>> It's logged in job_abt() in src/server/job_func.c. The
>>> comment for that function in trunk in SVN says:
>>>
>>> /*
>>> * job_abt - abort a job
>>> *
>>> * The job is removed from the system and a mail message is sent
>>> * to the job owner.
>>> */
>>>
>>> /* NOTE: this routine is called under the following conditions:
>>> * 1) by req_deletejob whenever deleting a job that is not running,
>>> * not transiting, not exiting and does not have a checkpoint
>>> * file on the mom.
>>> * 2) by req_deletearray whenever deleting a job that is not running,
>>> * not transiting, not in prerun, not exiting and does not have a
>>> * checkpoint file on the mom.
>>> * 3) by close_quejob when the server fails to enqueue the job.
>>> * 4) by array_delete_wt for prerun jobs that hang around too long and
>>> * do not have a checkpoint file on the mom.
>>> * 5) by pbsd_init when recovering jobs.
>>> * 6) by svr_movejob when done routing jobs around.
>>> * 7) by queue_route when trying to route any "ready" jobs in a
>>> * specific queue.
>>> * 8) by req_shutdown when trying to shutdown.
>>> * 9) by req_register when the request operation is
>>> * JOB_DEPEND_OP_DELETE.
>>> */
>>>
>>>
>
> Three other accounting markers are also missing from the doc:
>
> #define PBS_ACCT_RERUN (int)'R' /* Job Rerun record */
> #define PBS_ACCT_CHKPNT (int)'C' /* Job Checkpointed and held */
> #define PBS_ACCT_RESTRT (int)'T' /* Job resTart (from chkpnt) record */
>
> Regards
>
> Ken
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
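
For anyone scripting against the accounting logs, the four markers quoted
in this thread can be folded into a small lookup. This is only a minimal
sketch, not Torque code: describe_marker() and record_marker() are
hypothetical helper names, and the parser assumes each log line has the
shape "date time;marker;jobid;message", with the marker as a single
character between the first two semicolons. Markers that are already in
the documentation are deliberately left out, so they come back as NULL.

```c
#include <stddef.h>
#include <string.h>

/* Marker values as quoted in this thread from src/include/acct.h. */
#define PBS_ACCT_RERUN  (int)'R' /* Job Rerun record */
#define PBS_ACCT_CHKPNT (int)'C' /* Job Checkpointed and held */
#define PBS_ACCT_RESTRT (int)'T' /* Job resTart (from chkpnt) record */
#define PBS_ACCT_ABT    (int)'A' /* Job Abort by server */

/* Hypothetical helper: describe one of the four markers above.
 * Returns NULL for any other marker, including the documented ones. */
const char *describe_marker(int marker)
{
    switch (marker) {
    case PBS_ACCT_ABT:    return "Job Abort by server";
    case PBS_ACCT_RERUN:  return "Job Rerun record";
    case PBS_ACCT_CHKPNT: return "Job Checkpointed and held";
    case PBS_ACCT_RESTRT: return "Job resTart (from chkpnt) record";
    default:              return NULL;
    }
}

/* Hypothetical helper: pull the marker out of one accounting log line.
 * ASSUMPTION: the line looks like "date time;marker;jobid;message".
 * Returns 0 if the line does not match that shape. */
int record_marker(const char *line)
{
    const char *first = strchr(line, ';');

    /* Need exactly one character between the first two semicolons. */
    if (first == NULL || first[1] == '\0' || first[2] != ';')
        return 0;
    return (unsigned char)first[1];
}
```

Calling describe_marker(record_marker(line)) on each line of a log then
gives a description for the undocumented markers and NULL for the rest.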

