Bug 100 - incorrect walltime accounting value
Status: NEW
Product: TORQUE
Component: pbs_server
Version: 2.4.x
Platform: Other Linux
Importance: P5 major
Assigned To: Glen
 
Reported: 2010-11-16 01:42 MST by sobigworld
Modified: 2010-11-16 02:36 MST

Description sobigworld 2010-11-16 01:42:46 MST
We are using TORQUE 2.4.0b1 and Maui 3.3 in our cluster (about 40 nodes), and
we seem to have found some strange behavior that looks like a bug.

The accounting record for one job looks like this:
======================================
09/27/2010 00:30:50;E;4186.gnode1;user=ly group=users jobname=chiero-t0.0001
queue=comm_queue ctime=1285479778 qtime=1285479778 etime=1285479778
start=1285479780 owner=ly@login1.bwfs.cn
work_dir=/home3/ly/tuzhaopeng/research/wam_hpb/exp/final/pruning-threshold/decoder/t0.0001
exec_host=gnode23/3+gnode23/2+gnode23/1+gnode23/0
Resource_List.neednodes=1:ppn=4 Resource_List.nodect=1
Resource_List.nodes=1:ppn=4 Resource_List.pmem=997mb
Resource_List.walltime=168:00:00 session=16995 end=1285518650 Exit_status=271
resources_used.cput=09:16:23 resources_used.mem=2504360kb
resources_used.vmem=2793716kb resources_used.walltime=76571:17:37
======================================

The accounting field "resources_used.walltime" is obviously incorrect.
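
To make the discrepancy concrete, the reported walltime can be compared against the start= and end= timestamps in the same record. The following is only a minimal sketch: the helper names (parse_fields, hms_to_seconds, seconds_to_hms) are made up for illustration and are not part of TORQUE, and the record is shortened to the three fields being compared.

```python
# Accounting record from the report with the wrapped lines rejoined; only the
# fields compared below are kept (the real record carries many more).
record = (
    "09/27/2010 00:30:50;E;4186.gnode1;"
    "start=1285479780 end=1285518650 "
    "resources_used.walltime=76571:17:37"
)


def parse_fields(rec):
    """Return the key=value pairs from the message part of a record."""
    message = rec.split(";", 3)[3]
    return dict(item.split("=", 1) for item in message.split())


def hms_to_seconds(hms):
    """Convert HH:MM:SS (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_hms(total):
    """Convert a number of seconds to HH:MM:SS."""
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


fields = parse_fields(record)
elapsed = int(fields["end"]) - int(fields["start"])
reported = hms_to_seconds(fields["resources_used.walltime"])

print("elapsed by start/end:", seconds_to_hms(elapsed))
print("reported walltime   :", fields["resources_used.walltime"])
print("reported / elapsed  :", round(reported / elapsed))
```

By the start and end timestamps the job ran for 38870 seconds (10:47:50), which is also within the requested Resource_List.walltime of 168:00:00, while the reported value corresponds to 275656657 seconds.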

Is this a new bug in TORQUE, or has it been fixed in newer versions?

Thanks!
Comment 1 sobigworld 2010-11-16 02:36:31 MST
The server log entries for job 4186 look like this:

09/27/2010 00:15:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:16:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:17:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:18:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:18:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:19:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:20:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:21:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:21:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:22:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:23:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:24:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:24:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:25:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:26:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:27:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:27:47;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:28:32;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:29:17;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:30:02;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:30:47;0002;PBS_Server;Job;4186.gnode1;node 'gnode23' is allocated
to job but in state 'down'
09/27/2010 00:30:50;0009;PBS_Server;Job;4186.gnode1;obit received - updating
final job usage info
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;attr resources_used
modified
09/27/2010 00:30:50;0009;PBS_Server;Job;4186.gnode1;job exit status 271 handled
09/27/2010 00:30:50;0001;PBS_Server;Svr;PBS_Server;svr_setjobstate: setting job
4186.gnode1 state from RUNNING-RUNNING to EXITING-EXITING (5-50)
09/27/2010 00:30:50;000d;PBS_Server;Job;4186.gnode1;sending 'e' mail for job
4186.gnode1 to ly@login1.bwfs.cn (Exit_status=271
09/27/2010 00:30:50;0010;PBS_Server;Job;4186.gnode1;Exit_status=271
resources_used.cput=09:16:23 resources_used.mem=2504360kb
resources_used.vmem=2793716kb resources_used.walltime=76571:17:37
09/27/2010 00:30:50;0009;PBS_Server;Job;4186.gnode1;on_job_exit task assigned
to job
09/27/2010 00:30:50;0009;PBS_Server;Job;4186.gnode1;req_jobobit completed
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;JOB_SUBSTATE_EXITING
09/27/2010 00:30:50;0001;PBS_Server;Svr;PBS_Server;svr_setjobstate: setting job
4186.gnode1 state from EXITING-EXITING to EXITING-RETURNSTD (5-70)
09/27/2010 00:30:50;0001;PBS_Server;Svr;PBS_Server;svr_setjobstate: setting job
4186.gnode1 state from EXITING-RETURNSTD to EXITING-STAGEOUT (5-51)
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;no spool files to return
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;JOB_SUBSTATE_STAGEOUT
09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;about to copy
stdout/stderr/stageout files
09/27/2010 00:30:51;0008;PBS_Server;Job;4186.gnode1;JOB_SUBSTATE_STAGEOUT
09/27/2010 00:30:51;0001;PBS_Server;Svr;PBS_Server;svr_setjobstate: setting job
4186.gnode1 state from EXITING-STAGEOUT to EXITING-STAGEDEL (5-52)
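
The timing around the final usage update can be read off the log with a short script. This is a rough sketch only: it uses a handful of the entries above with wrapped lines rejoined, and the names given to the semicolon-separated columns (stamp, code, server, kind, ident, message) are labels chosen here for illustration, not official TORQUE field names.

```python
from datetime import datetime

# A few of the server log entries above, with wrapped lines rejoined.
log_lines = [
    "09/27/2010 00:29:17;0008;PBS_Server;Job;4186.gnode1;"
    "attr resources_used modified",
    "09/27/2010 00:30:02;0008;PBS_Server;Job;4186.gnode1;"
    "attr resources_used modified",
    "09/27/2010 00:30:47;0002;PBS_Server;Job;4186.gnode1;"
    "node 'gnode23' is allocated to job but in state 'down'",
    "09/27/2010 00:30:50;0009;PBS_Server;Job;4186.gnode1;"
    "obit received - updating final job usage info",
    "09/27/2010 00:30:50;0008;PBS_Server;Job;4186.gnode1;"
    "attr resources_used modified",
]


def parse_line(line):
    """Split one log entry into (timestamp, message)."""
    stamp, code, server, kind, ident, message = line.split(";", 5)
    return datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"), message


events = [parse_line(line) for line in log_lines]

# Seconds between consecutive resources_used updates.
updates = [t for t, msg in events if msg == "attr resources_used modified"]
gaps = [int((b - a).total_seconds()) for a, b in zip(updates, updates[1:])]

# Seconds from the node being reported down to the obit.
down = next(t for t, msg in events if "in state 'down'" in msg)
obit = next(t for t, msg in events if msg.startswith("obit received"))
down_to_obit = int((obit - down).total_seconds())

print("update gaps (s):", gaps)
print("node down -> obit (s):", down_to_obit)
```

In these entries the regular updates are 45 seconds apart, and the last update, which carries the bad walltime value, arrives with the obit three seconds after gnode23 is reported down.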