[torqueusers] Torque 2.5.13 does not give resources_used in Accounting strings anymore?

Martin Siegert siegert at sfu.ca
Tue Oct 22 15:27:35 MDT 2013


On Tue, Oct 22, 2013 at 11:06:41PM +0200, Burkhard Bunk wrote:
> Hi,
> 
> with your findings in mind, I checked my installations. I didn't use
> accounting so far, but scanning through the accounting files, I can 
> confirm your observation.
> 
> My installations used 2.5.11 until July 2013, when I pulled 2.5.13 from
> git and rebuilt my packages. After the update, the accounting records
> don't contain "resources_used" clauses anymore.
> 
> My distribution is Debian 7 by now (32 and 64 bit), but an older server
> is still on Debian 6 (32 bit), all with the same symptoms.
> 
> Regards,
> Burkhard Bunk.
> ----------------------------------------------------------------------
>   bunk at physik.hu-berlin.de      Physics Institute, Humboldt University
>   fax:    ++49-30 2093 7628     Newtonstr. 15
>   phone:  ++49-30 2093 7980     12489 Berlin, Germany
> ----------------------------------------------------------------------
> 
> On Tue, 22 Oct 2013, Grigory Shamov wrote:
> 
> > Hi,
> >
> > For some reason, our Torque 2.5 stopped reporting used resources in $SERVER_PRIV/accounting. For finished jobs it now logs something like this:
> >
> > 10/09/2013 23:59:53;E;YYYYYYY;user=XXX group=fazioja jobname=NAME_pseudo queue=default ctime=1381327807 qtime=1381327807 etime=1381327807 start=1381360466 owner=XXX at ZZZ exec_host=n181/11 Resource_List.mem=20gb Resource_List.opsys=RHEL6 Resource_List.pmem=256mb Resource_List.procs=1 Resource_List.walltime=80:00:00 session=30801 end=1381381193 Exit_status=0
> >
> > The only change I can recollect was updating from 2.5.12 to 2.5.13 to address the vulnerability and mom_segfaults issues. I built it with exactly the same configure parameters as before, but on a different CentOS version (6 instead of 5).
> >
> > Before the update, there were fields like "resources_used.cput=00:05:40 resources_used.mem=232748kb resources_used.vmem=10462620kb resources_used.walltime=00:01:10" right after the Exit_status field. Now they have disappeared.
> >
> > Did anything change between 2.5.12 and 2.5.13 that could cause this? Or is there a setting that I could have tripped accidentally? Is anyone running Torque 2.5.13, and if so, do you get the complete accounting strings?

I suspect that the following change is responsible:

# diff -u torque-2.5.12/src/server/req_jobobit.c torque-2.5.13/src/server/req_jobobit.c
--- torque-2.5.12/src/server/req_jobobit.c      2011-10-05 16:20:11.000000000 -0700
+++ torque-2.5.13/src/server/req_jobobit.c      2013-08-01 09:10:01.000000000 -0700
@@ -2237,7 +2237,9 @@
   char   acctbuf[RESC_USED_BUF];
   int    accttail;
   int    exitstatus;
+#ifdef USESAVEDRESOURCES
   int    have_resc_used = FALSE;
+#endif
   char   mailbuf[RESC_USED_BUF];
   int    newstate;
   int    newsubst;
@@ -2399,10 +2401,10 @@
 
   accttail = strlen(acctbuf);
 
-  have_resc_used = get_used(patlist, acctbuf);
 
 #ifdef USESAVEDRESOURCES
 
+  have_resc_used = get_used(patlist, acctbuf);
   /* if we don't have resources from the obit, use what the job already had */
 
   if (!have_resc_used)

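My guess is that your build is missing the flag -DUSESAVEDRESOURCES,
which torque-2.5.13 now appears to need: judging from this diff, the
get_used() call is compiled out without it, so no resources_used
fields are appended to the accounting record.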
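
To check whether a given installation is affected, one can count the
"E" (job end) records in an accounting file that lack resources_used
fields. Below is a minimal Python sketch, not part of Torque. It
assumes the record layout seen in the sample quoted above
(date;type;jobid;key=value ...), and the function name
count_end_records is just something I made up for illustration.

```python
import sys


def count_end_records(lines):
    """Return (total E records, E records with no resources_used.* field).

    Assumes each record looks like 'date;type;jobid;key=value key=value ...',
    as in the sample accounting line quoted in this thread.
    """
    total = 0
    missing = 0
    for line in lines:
        # Split off at most three ';' so the key=value part stays in one piece.
        parts = line.rstrip("\n").split(";", 3)
        if len(parts) < 4 or parts[1] != "E":
            continue  # not a job-end record
        total += 1
        if "resources_used." not in parts[3]:
            missing += 1
    return total, missing


if __name__ == "__main__":
    # Usage: python check_acct.py <accounting file>
    with open(sys.argv[1]) as fh:
        total, missing = count_end_records(fh)
    print(f"{total} E records, {missing} without resources_used")
```

Run it against a file under $SERVER_PRIV/accounting written before
the upgrade and one written after. If the second number jumps from
(nearly) zero to the full count, the upgrade is the likely cause.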

Cheers,
Martin

