[Mauiusers] Bug in memory limit enforcement after maui restart

Martin Kleinschmidt mk at theochem.uni-duesseldorf.de
Thu Dec 20 07:41:48 MST 2007


There seems to be a bug in the memory limit enforcement procedure. I
have been trying for a while now to find out why jobs sometimes die
when maui is restarted.

My example job was submitted (via torque) with
-l nodes=1:4
-l mem=5000mb

It runs without problems, but when maui is restarted the job is killed.
After setting the loglevel to 255 I finally found:

12/20 15:31:06 INFO:     job 3369 exceeds requested memory limit (3658 > 1250)
12/20 15:31:06 MSysRegEvent(JOBRESVIOLATION:  job '3369' in state 'Running' has exceeded MEM resource limit (3658 > 1250) (action CANCEL will be taken)  job start time: Thu Dec 20 15:29:34
,0,0,1)

So the total memory usage is reported to be 3658 out of 5000 MB (which
agrees with what the job is really using), but this value is then
compared to 1250, which is the limit per task (5000/4 = 1250).
This leads to the cancellation of the job.
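
To make the arithmetic explicit, here is a small Python sketch of the two
comparisons. It is only an illustration of the numbers in the log above,
not Maui source code; the function and variable names are hypothetical,
and the task count of 4 is taken from the node request in this post.

```python
# Hypothetical illustration of the comparison described in this post.
# This is NOT Maui source code; all names here are made up.

def per_task_limit_mb(requested_mem_mb: int, task_count: int) -> int:
    """Split the job-wide memory request evenly across the tasks."""
    return requested_mem_mb // task_count


def exceeds_limit(used_mb: int, limit_mb: int) -> bool:
    """Return True when usage is above the limit it is compared against."""
    return used_mb > limit_mb


requested_mb = 5000    # from -l mem=5000mb
tasks = 4              # assumed from the node request in the post
used_total_mb = 3658   # job-wide usage reported in the log

per_task_mb = per_task_limit_mb(requested_mb, tasks)

# Comparison seen in the log: job-wide usage against the per-task limit.
violation_as_logged = exceeds_limit(used_total_mb, per_task_mb)

# Comparison one would expect: job-wide usage against the job-wide request.
violation_expected = exceeds_limit(used_total_mb, requested_mb)

print(per_task_mb, violation_as_logged, violation_expected)
```

With the values from the log, the per-task limit comes out as 1250, the
first comparison flags a violation, and the second does not.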

The maui version is maui-3.2.6p19

   ...martin


Our maui.cfg:


SERVERHOST            suzi.theochem.uni-duesseldorf.de
ADMIN1                root

RMCFG[SUZI.THEOCHEM.UNI-DUESSELDORF.DE] TYPE=PBS

AMCFG[bank]  TYPE=NONE

RMPOLLINTERVAL        00:00:30

SERVERPORT            42559
SERVERMODE            NORMAL

LOGFILE               maui.log
LOGFILEMAXSIZE        100000000
LOGLEVEL              1

QUEUETIMEWEIGHT       1


BACKFILLPOLICY        BESTFIT
RESERVATIONPOLICY     CURRENTHIGHEST
BACKFILLMETRIC        PE


NODEALLOCATIONPOLICY  MINRESOURCE

SRNAME[0]       SRBIG
SRHOSTLIST[0]    ^node[1-3]$
SRUSERLIST[0]   cm mk susan
SRPERIOD[0]     INFINITY

ENFORCERESOURCELIMITS ON
RESOURCELIMITPOLICY[0] MEM:ALWAYS:CANCEL

ENABLEMULTIREQJOBS TRUE

USERCFG[timo] MAXPE=07
USERCFG[stefan] MAXPE=07
USERCFG[mihajlo] MAXPE=07
USERCFG[lasse] MAXPE=07
USERCFG[mk]  MAXPE=60 MAXPROC=20
USERCFG[cm] MAXPE=60  MAXPROC=20
USERCFG[susan] MAXPE=60  MAXPROC=20

CLASSCFG[fast] QDEF=unlimit
CLASSCFG[medium] QDEF=batch
CLASSCFG[long] QDEF=batch
CLASSCFG[verylong] QDEF=batch


QOSCFG[batch] MAXPROC=74

QOSCFG[unlimit] OMAXPE=200
QOSCFG[unlimit] OMAXPROC=200


