[torqueusers] Limit size of standard output / standard error
Chad Vizino
vizino at psc.edu
Mon Dec 15 09:12:13 MST 2008
Hi David,
David Singleton has addressed this at his site in OpenPBS (see his post
below). We've adapted it for Torque at our site and it works great. The
changes go in src/resmom/linux/mom_mach.c.

Put this with the other externs:
extern char *path_spool;
Add this to mom_over_limit():
#if NO_SPOOL_OUTPUT == 0
#define VARSPOOLUSERLIM_KB 20480

  /* check file sizes in PBS spool area */
  if (pjob->ji_qs.ji_svrflags&JOB_SVFLG_HERE) { /* only on MS */
    char path[64];
    char *suf;
    struct stat sbuf;

    (void)strcpy(path, path_spool);
    (void)strcat(path, pjob->ji_qs.ji_fileprefix);
    suf = path+strlen(path);

    (void)strcat(path, JOB_STDOUT_SUFFIX);
    if ( (stat(path, &sbuf)==0) &&
         (sbuf.st_size>>10 > (off_t)VARSPOOLUSERLIM_KB) ){
      sprintf(log_buffer, "stdout file size %luKB exceeded limit %luKB",
        ((unsigned long)(sbuf.st_size>>10)), (unsigned long)VARSPOOLUSERLIM_KB);
      return (TRUE);
    }

    (void)strcpy(suf, JOB_STDERR_SUFFIX);
    if ( (stat(path, &sbuf)==0) &&
         (sbuf.st_size>>10 > (off_t)VARSPOOLUSERLIM_KB) ){
      sprintf(log_buffer, "stderr file size %luKB exceeded limit %luKB",
        ((unsigned long)(sbuf.st_size>>10)), (unsigned long)VARSPOOLUSERLIM_KB);
      return (TRUE);
    }
  }
#endif
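
For anyone who wants to try the size check outside of pbs_mom first, here
is a minimal standalone sketch of the same idea. It is not Torque code:
build_spool_path(), file_over_limit_kb() and SPOOL_LIMIT_KB are hypothetical
names made up for illustration. It builds the path with snprintf(), since
the fixed char path[64] with strcpy()/strcat() above may overflow if the
spool directory path is long, and it treats a file that cannot be stat()ed
as within the limit, as the patch does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

#define SPOOL_LIMIT_KB 20480UL /* 20 MB, same value as VARSPOOLUSERLIM_KB */

/*
 * Hypothetical helper (not part of Torque): write
 * "<spool_dir><prefix><suffix>" into buf.
 * Returns 0 on success, -1 if the result would not fit in buflen bytes.
 */
static int build_spool_path(char *buf, size_t buflen, const char *spool_dir,
                            const char *prefix, const char *suffix)
{
    int n;

    assert(buf != NULL && buflen > 0);
    n = snprintf(buf, buflen, "%s%s%s", spool_dir, prefix, suffix);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/*
 * Hypothetical helper: return 1 if the file at path is larger than
 * limit_kb kilobytes, 0 otherwise. The size is rounded down to whole KB
 * (st_size >> 10), as in the patch. The size in KB is stored in *size_kb;
 * it is 0 when the file cannot be stat()ed.
 */
static int file_over_limit_kb(const char *path, unsigned long limit_kb,
                              unsigned long *size_kb)
{
    struct stat sbuf;

    *size_kb = 0;
    if (stat(path, &sbuf) != 0)
        return 0;
    *size_kb = (unsigned long)(sbuf.st_size >> 10);
    return *size_kb > limit_kb;
}
```

A caller would build the stdout path and the stderr path in turn, pass each
to file_over_limit_kb() with SPOOL_LIMIT_KB, and report the job as over
limit if either call returns 1.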
Regards,
-Chad
Chad Vizino
Pittsburgh Supercomputing Center
> Subject: Re: [torqueusers] User's job can mess up the system so that no jobs run
> Date: Fri, 07 Sep 2007 21:44:14 +1000
> From: David Singleton <David.Singleton at anu.edu.au>
> Reply-To: David.Singleton at anu.edu.au
> Organization: ANUSF
> To: Atwood, Robert C <r.atwood at imperial.ac.uk>
> CC: torqueusers at supercluster.org
>
>
> This is a hacky bit of code we have at the end of mom_over_limit()
> in our PBS - it kills jobs when spooled stdout or stderr reach 20MB
> (who will ever read 20MB of text!). It would need modifying for
> Torque.
>
> David
>
> /* This should be a mom config option */
> #define CHECKVAR
>
> #if !defined(NO_SPOOL_OUTPUT) && defined(CHECKVAR)
> #define VARSPOOLUSERLIM_KB 20480
>
> /* check file sizes in PBS spool area */
>   if (pjob->ji_qs.ji_svrflags&JOB_SVFLG_HERE) { // only on MS
>     char path[64];
>     char *suf;
>     struct stat sbuf;
>
>     (void)strcpy(path, path_spool);
>     (void)strcat(path, pjob->ji_qs.ji_fileprefix);
>     suf = path+strlen(path);
>
>     (void)strcat(path, JOB_STDOUT_SUFFIX);
>     if ( (stat(path, &sbuf)==0) &&
>          (sbuf.st_size>>10 > (off_t)VARSPOOLUSERLIM_KB) ){
>       sprintf(log_buffer, "stdout file size %luKB exceeds limit %luKB",
>         ((unsigned long)(sbuf.st_size>>10)), (unsigned long)VARSPOOLUSERLIM_KB);
>       return (JOB_SVFLG_OVERLMT2|JOB_SVFLG_OVERLMTFILE);
>     }
>
>     (void)strcpy(suf, JOB_STDERR_SUFFIX);
>     if ( (stat(path, &sbuf)==0) &&
>          (sbuf.st_size>>10 > (off_t)VARSPOOLUSERLIM_KB) ){
>       sprintf(log_buffer, "stderr file size %luKB exceeds limit %luKB",
>         ((unsigned long)(sbuf.st_size>>10)), (unsigned long)VARSPOOLUSERLIM_KB);
>       return (JOB_SVFLG_OVERLMT2|JOB_SVFLG_OVERLMTFILE);
>     }
>   }
> #endif
>
> ...
> --------------------------------------------------------------------------
> Dr David Singleton ANU Supercomputer Facility
> HPC Systems Manager and APAC National Facility
> David.Singleton at anu.edu.au Leonard Huxley Bldg (No. 56)
> Phone: +61 2 6125 4389 Australian National University
> Fax: +61 2 6125 8199 Canberra, ACT, 0200, Australia
> --------------------------------------------------------------------------
On 12/11/08 9:27 PM, David Schibeci wrote:
> This is probably a lame question, but I've done a google search and
> can't find an answer.
>
> Is there a way to get torque to limit the size of standard output/error?
> And kill the job if it exceeds this limit?
>
> We have diskless nodes, and if standard output/error gets too big, then
> the machine runs out of RAM.
>
> Thanks in advance,
> David
>
> ------------------------------------------------------------------------------
>
> David Schibeci
> Senior Systems Administrator
> iVEC Informatics Facility
> Centre for Comparative Genomics
> Murdoch University
> South Street
> Murdoch WA 6150
>
> Phone: 61 8 9360 2492
> Fax: 61 8 9360 7238
> E-Mail: dschibeci at ccg.murdoch.edu.au