Bugzilla – Bug 100
incorrect walltime accounting value
Last modified: 2010-11-16 02:36:31 MST
We are using torque 2.4.0b1 and maui 3.3 in our cluster (about 40 nodes), and we seem to have found some strange behavior that looks like a bug. The accounting record for one job looks like this:

======================================
09/27/2010 00:30:50;E;4186.gnode1;user=ly group=users jobname=chiero-t0.0001 queue=comm_queue ctime=1285479778 qtime=1285479778 etime=1285479778 start=1285479780 owner=ly@login1.bwfs.cn work_dir=/home3/ly/tuzhaopeng/research/wam_hpb/exp/final/pruning-threshold/decoder/t0.0001 exec_host=gnode23/3+gnode23/2+gnode23/1+gnode23/0 Resource_List.neednodes=1:ppn=4 Resource_List.nodect=1 Resource_List.nodes=1:ppn=4 Resource_List.pmem=997mb Resource_List.walltime=168:00:00 session=16995 end=1285518650 Exit_status=271 resources_used.cput=09:16:23 resources_used.mem=2504360kb resources_used.vmem=2793716kb resources_used.walltime=76571:17:37
======================================

The accounting field "resources_used.walltime" is obviously incorrect. Is this a new bug in torque, or has it already been fixed in newer versions?

Thanks!
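For reference, the size of the discrepancy can be checked directly from the record above. The following is a minimal sketch, assuming that the start= and end= fields are Unix epoch seconds and that resources_used.walltime is formatted as HH:MM:SS. The names RECORD, parse_fields, hms_to_seconds and seconds_to_hms are hypothetical helpers written for this illustration and are not part of torque; RECORD contains only the fields needed for the check, copied from the record.

```python
# Sanity check: compare the reported walltime with end - start.
# Assumption: start/end are Unix epoch seconds; walltime is HH:MM:SS.

# Only the relevant fields, copied from the accounting record in this report.
RECORD = (
    "start=1285479780 end=1285518650 "
    "resources_used.cput=09:16:23 "
    "resources_used.walltime=76571:17:37"
)


def parse_fields(record):
    """Split a space-separated record into a dict of key=value fields."""
    return dict(tok.split("=", 1) for tok in record.split() if "=" in tok)


def hms_to_seconds(hms):
    """Convert an HH:MM:SS string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_hms(total):
    """Convert a non-negative number of seconds to HH:MM:SS."""
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


fields = parse_fields(RECORD)
expected = int(fields["end"]) - int(fields["start"])
reported = hms_to_seconds(fields["resources_used.walltime"])

print("expected walltime (end - start):", seconds_to_hms(expected))
print("reported walltime:              ", fields["resources_used.walltime"])
print("reported exceeds expected:      ", reported > expected)
```

Under these assumptions the job ran for 38870 seconds (10:47:50), which is also consistent with the 09:16:23 of CPU time, while the reported walltime corresponds to more than 76000 hours.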
Server log info for job 4186:

09/27/2010 00:15:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:16:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:17:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:18:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:18:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:19:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:20:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:21:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:21:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:22:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:23:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:24:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:24:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:25:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:26:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:27:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:27:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:28:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:29:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:30:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:30:47;0002;PBS_Server;Job;4186.gnode1;node 'gnode23' is allocated to job but in state 'down'
09/27/2010 00:30:50;0009;PBS_Server;Job;4186.gnode1;obit received - updating final job usage info
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;attr resources_used modified
09/27/2010 00:30:50;0009;PBS_Server;Job;4186.gnode1;job exit status 271 handled
09/27/2010 00:30:50;0001;PBS_Server;Svr;PBS_Server;svr_setjobstate: setting job 4186.gnode1 state from RUNNING-RUNNING to EXITING-EXITING (5-50)
09/27/2010 00:30:50;000d;PBS_Server;Job;4186.gnode1;sending 'e' mail for job 4186.gnode1 to ly@login1.bwfs.cn (Exit_status=271
09/27/2010 00:30:50;0010;PBS_Server;Job;4186.gnode1;Exit_status=271 resources_used.cput=09:16:23 resources_used.mem=2504360kb resources_used.vmem=2793716kb resources_used.walltime=76571:17:37
09/27/2010 00:30:50;0009;PBS_Server;Job;4186.gnode1;on_job_exit task assigned to job
09/27/2010 00:30:50;0009;PBS_Server;Job;4186.gnode1;req_jobobit completed
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;JOB_SUBSTATE_EXITING
09/27/2010 00:30:50;0001;PBS_Server;Svr;PBS_Server;svr_setjobstate: setting job 4186.gnode1 state from EXITING-EXITING to EXITING-RETURNSTD (5-70)
09/27/2010 00:30:50;0001;PBS_Server;Svr;PBS_Server;svr_setjobstate: setting job 4186.gnode1 state from EXITING-RETURNSTD to EXITING-STAGEOUT (5-51)
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;no spool files to return
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;JOB_SUBSTATE_STAGEOUT
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;about to copy stdout/stderr/stageout files
09/27/2010 00:30:51;0008;PBS_Server;Job;4186.gnode1;JOB_SUBSTATE_STAGEOUT
09/27/2010 00:30:51;0001;PBS_Server;Svr;PBS_Server;svr_setjobstate: setting job 4186.gnode1 state from EXITING-STAGEOUT to EXITING-STAGEDEL (5-52)