[torqueusers] tracejob output question
Steve Snelgrove
ssnelgrove at clusterresources.com
Thu Apr 10 08:55:18 MDT 2008
Chris Samuel wrote:
> Hi all,
>
> We've just been asked by one of our users:
>
>
>> Tracejob shows the memory (and virtual memory) used by
>> a program. Is that the peak memory used? Average memory
>> used? or what.
>>
>
> Now we know that tracejob grabs stuff out of the PBS logs,
> but the question is still are the numbers that are recorded
> there the:
>
> 1) Maximum usage
> 2) Average usage
> 3) Usage when job ended (or last measured)
> 4) Something else
>
> Any ideas ?
>
> cheers!
> Chris
>
I hope this is the right answer. As far as I can tell, the mom reports
info about a job by reading the file "/proc/<pid>/stat" for each process
(the directory under /proc is named after the process ID, not the job ID).
This file contains one line, which is parsed in mom_mach.c with the
following:
    /* see stat_str[] value for mapping 'stat' format */
    if (sscanf(lastbracket,stat_str,
               &ps.state,     /* state (one of RSDZTW) */
               &ps.ppid,      /* ppid */
               &ps.pgrp,      /* pgrp */
               &ps.session,   /* session id */
               &ps.flags,     /* flags - kernel flags of the process, see the
                                 PF_* in <linux/sched.h> */
               &ps.utime,     /* utime - jiffies that this process has been
                                 scheduled in user mode */
               &ps.stime,     /* stime - jiffies that this process has been
                                 scheduled in kernel mode */
               &ps.cutime,    /* cutime - jiffies that this process waited-for
                                 children have been scheduled in user mode */
               &ps.cstime,    /* cstime - jiffies that this process waited-for
                                 children have been scheduled in kernel mode */
               &jstarttime,   /* starttime */
               &ps.vsize,     /* vsize */
               &ps.rss) != 12) /* rss */
      {
This information is collected for every process running on the
system. Since a job may have multiple processes associated with it, the
information saved in JOB_ATR_resc_used is a sum over all processes
whose session ID matches the job's.
So what is reported for mem is the sum of rss * page_size over all of the
job's processes (rss is counted in pages). For vmem, it is the sum of
vsize (in bytes) over the same processes.
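To make the summing concrete, here is a minimal sketch of the idea. It is
not the TORQUE code: the names proc_usage, parse_stat_line and
sum_session_usage are made up for illustration, the stat lines are passed
in as strings rather than read from /proc, and the field positions assume
the usual Linux /proc/<pid>/stat layout.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding only the fields needed for mem/vmem. */
struct proc_usage {
    int session;               /* session ID of the process */
    unsigned long long vsize;  /* virtual memory size, in bytes */
    long rss;                  /* resident set size, in pages */
};

/* Parse one /proc/<pid>/stat line. Returns 0 on success, -1 on failure.
 * Scanning starts after the last ')' because the command name may itself
 * contain spaces or brackets. */
int parse_stat_line(const char *line, struct proc_usage *out)
{
    const char *lastbracket = strrchr(line, ')');

    if (lastbracket == NULL)
        return -1;

    /* Skip state, ppid, pgrp; read session; skip the 16 fields from
     * tty_nr through starttime; read vsize and rss. */
    if (sscanf(lastbracket + 1,
               " %*c %*d %*d %d"
               " %*s %*s %*s %*s %*s %*s %*s %*s"
               " %*s %*s %*s %*s %*s %*s %*s %*s"
               " %llu %ld",
               &out->session, &out->vsize, &out->rss) != 3)
        return -1;

    return 0;
}

/* Sum memory use over every stat line whose session ID equals 'session'.
 * mem  = sum of rss * page_size (bytes)
 * vmem = sum of vsize           (bytes) */
void sum_session_usage(const char *lines[], size_t n, int session,
                       long page_size,
                       unsigned long long *mem, unsigned long long *vmem)
{
    size_t i;

    *mem = 0;
    *vmem = 0;

    for (i = 0; i < n; i++) {
        struct proc_usage u;

        if (parse_stat_line(lines[i], &u) != 0)
            continue;            /* unparsable line: ignore it */
        if (u.session != session)
            continue;            /* process belongs to another session */

        *mem += (unsigned long long)u.rss * (unsigned long long)page_size;
        *vmem += u.vsize;
    }
}
```

On a real system the page size would come from sysconf(_SC_PAGESIZE)
and the lines from the stat file of each process directory under /proc.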
Hope this helps a little.