[Mauiusers] cluster underused - single cpu jobs hold up parallel (fwd)

Lydia Heck lydia.heck at durham.ac.uk
Thu Mar 31 10:47:32 MDT 2011



I should of course have told you the configuration. I will list it below:
On Thu, 31 Mar 2011, Lydia Heck wrote:

>
>
> Our cluster has ~2,600 cores, there are parallel jobs running to fill ~1,700
> and there are many sequential jobs queue that are now in "front" of other 
> parallel jobs. But only one or two of the sequential jobs are running, the 
> others being held by the user.
>
> The parallel jobs are not schedulled. The scheduler is maui. Any idea what I 
> am missing here?
>
> Lydia
>
>
# Maui version 3.3 (PID: 24013)
# global policies

REJECTNEGPRIOJOBS[0]              FALSE
ENABLENEGJOBPRIORITY[0]           FALSE
ENABLEMULTINODEJOBS[0]            TRUE
ENABLEMULTIREQJOBS[0]             FALSE
BFPRIORITYPOLICY[0]               [NONE]
JOBPRIOACCRUALPOLICY            QUEUEPOLICY
NODELOADPOLICY                  ADJUSTSTATE
USEMACHINESPEED                 FALSE
USESYSTEMQUEUETIME              TRUE
USELOCALMACHINEPRIORITY         FALSE
NODEUNTRACKEDLOADFACTOR         1.2
JOBNODEMATCHPOLICY[0]

JOBMAXSTARTTIME[0]                  INFINITY

METAMAXTASKS[0]                   0
NODESETPOLICY[0]                  [NONE]
NODESETATTRIBUTE[0]               [NONE]
NODESETLIST[0]
NODESETDELAY[0]                   00:00:00
NODESETPRIORITYTYPE[0]            MINLOSS
NODESETTOLERANCE[0]                 0.00

BACKFILLPOLICY[0]                 FIRSTFIT
BACKFILLDEPTH[0]                  0
BACKFILLPROCFACTOR[0]             0
BACKFILLMAXSCHEDULES[0]           10000
BACKFILLMETRIC[0]                 PROCS

BFCHUNKDURATION[0]                00:00:00
BFCHUNKSIZE[0]                    0
PREEMPTPOLICY[0]                  REQUEUE
MINADMINSTIME[0]                  00:00:00
RESOURCELIMITPOLICY[0]
NODEAVAILABILITYPOLICY[0]         COMBINED:[DEFAULT]
NODEALLOCATIONPOLICY[0]           PRIORITY
TASKDISTRIBUTIONPOLICY[0]         DEFAULT
RESERVATIONPOLICY[0]              CURRENTHIGHEST
RESERVATIONRETRYTIME[0]           00:00:00
RESERVATIONTHRESHOLDTYPE[0]       NONE
RESERVATIONTHRESHOLDVALUE[0]      0

FSPOLICY                        [NONE]
FSPOLICY                        [NONE]
FSINTERVAL                      12:00:00
FSDEPTH                         8
FSDECAY                         1.00



# Priority Weights

SERVICEWEIGHT[0]                  1
TARGETWEIGHT[0]                   1
CREDWEIGHT[0]                     1
ATTRWEIGHT[0]                     1
FSWEIGHT[0]                       1
RESWEIGHT[0]                      1
USAGEWEIGHT[0]                    1
QUEUETIMEWEIGHT[0]                1
XFACTORWEIGHT[0]                  0
SPVIOLATIONWEIGHT[0]              0
BYPASSWEIGHT[0]                   0
TARGETQUEUETIMEWEIGHT[0]          0
TARGETXFACTORWEIGHT[0]            0
USERWEIGHT[0]                     0
GROUPWEIGHT[0]                    0
ACCOUNTWEIGHT[0]                  0
QOSWEIGHT[0]                      0
CLASSWEIGHT[0]                    0
FSUSERWEIGHT[0]                   0
FSGROUPWEIGHT[0]                  0
FSACCOUNTWEIGHT[0]                0
FSQOSWEIGHT[0]                    0
FSCLASSWEIGHT[0]                  0
ATTRATTRWEIGHT[0]                 0
ATTRSTATEWEIGHT[0]                0
NODEWEIGHT[0]                     0
PROCWEIGHT[0]                     0
MEMWEIGHT[0]                      0
SWAPWEIGHT[0]                     0
DISKWEIGHT[0]                     0
PSWEIGHT[0]                       0
PEWEIGHT[0]                       0
WALLTIMEWEIGHT[0]                 0
UPROCWEIGHT[0]                    0
UJOBWEIGHT[0]                     0
CONSUMEDWEIGHT[0]                 0
USAGEEXECUTIONTIMEWEIGHT[0]       0
REMAININGWEIGHT[0]                0
PERCENTWEIGHT[0]                  0
XFMINWCLIMIT[0]                   00:02:00


# partition DEFAULT policies

REJECTNEGPRIOJOBS[1]              FALSE
ENABLENEGJOBPRIORITY[1]           FALSE
ENABLEMULTINODEJOBS[1]            TRUE
ENABLEMULTIREQJOBS[1]             FALSE
BFPRIORITYPOLICY[1]               [NONE]
JOBPRIOACCRUALPOLICY            QUEUEPOLICY
NODELOADPOLICY                  ADJUSTSTATE
JOBNODEMATCHPOLICY[1]

JOBMAXSTARTTIME[1]                  INFINITY

METAMAXTASKS[1]                   0
NODESETPOLICY[1]                  [NONE]
NODESETATTRIBUTE[1]               [NONE]
NODESETLIST[1]
NODESETDELAY[1]                   00:00:00
NODESETPRIORITYTYPE[1]            MINLOSS
NODESETTOLERANCE[1]                 0.00

# Priority Weights

XFMINWCLIMIT[1]                   00:00:00

NODECFG[DEFAULT]                  PRIORITYF=JOBCOUNT
RMAUTHTYPE[0]                     CHECKSUM

CLASSCFG[[NONE]]  DEFAULT.FEATURES=[NONE]
CLASSCFG[[ALL]]  DEFAULT.FEATURES=[NONE]
CLASSCFG[ourdomain]  DEFAULT.FEATURES=[NONE]
QOSPRIORITY[0]                    0
QOSQTWEIGHT[0]                    0
QOSXFWEIGHT[0]                    0
QOSTARGETXF[0]                      0.00
QOSTARGETQT[0]                    00:00:00
QOSFLAGS[0]
QOSPRIORITY[1]                    0
QOSQTWEIGHT[1]                    0
QOSXFWEIGHT[1]                    0
QOSTARGETXF[1]                      0.00
QOSTARGETQT[1]                    00:00:00
QOSFLAGS[1]
# SERVER MODULES:  MX
SERVERMODE                      NORMAL
SERVERNAME
SERVERHOST                      server
SERVERPORT                      42559
LOGFILE                         maui.log
LOGFILEMAXSIZE                  10000000
LOGFILEROLLDEPTH                1
LOGLEVEL                        3
LOGFACILITY                     fALL
SERVERHOMEDIR                   /<ourspool-dir>/spool/
TOOLSDIR                        /<ourspool-dir>/spool/tools/
LOGDIR                          /<ourspool-dir>/spool/log/
STATDIR                         /<ourspool-dir>/spool/stats/
LOCKFILE                        /<ourspool-dir>/spool/maui.pid
SERVERCONFIGFILE                /<ourspool-dir>/spool/maui.cfg
CHECKPOINTFILE                  /<ourspool-dir>/spool/maui.ck
CHECKPOINTINTERVAL              00:05:00
CHECKPOINTEXPIRATIONTIME        3:11:20:00
TRAPJOB
TRAPNODE
TRAPFUNCTION
RESDEPTH                        24

RMPOLLINTERVAL                  00:00:30
NODEACCESSPOLICY                SHARED
ALLOCLOCALITYPOLICY             [NONE]
SIMTIMEPOLICY                   [NONE]
ADMIN1                          root
ADMINHOSTS                      ALL
NODEPOLLFREQUENCY               0
DISPLAYFLAGS
DEFAULTDOMAIN
DEFAULTCLASSLIST                [DEFAULT:1]
FEATURENODETYPEHEADER
FEATUREPROCSPEEDHEADER
FEATUREPARTITIONHEADER
DEFERTIME                       1:00:00
DEFERCOUNT                      24
DEFERSTARTCOUNT                 1
JOBPURGETIME                    0
NODEPURGETIME                   2140000000
APIFAILURETHRESHHOLD            6
NODESYNCTIME                    600
JOBSYNCTIME                     600
JOBMAXOVERRUN                   00:10:00
NODEMAXLOAD                     0.0

PLOTMINTIME                     120
PLOTMAXTIME                     245760
PLOTTIMESCALE                   11
PLOTMINPROC                     1
PLOTMAXPROC                     512
PLOTPROCSCALE                   9
SCHEDCFG[]                        MODE=NORMAL SERVER=server:42559
# RM MODULES: PBS SSS WIKI NATIVE
RMCFG[SERVER] AUTHTYPE=CHECKSUM EPORT=15004 TIMEOUT=00:00:09 TYPE=PBS
SIMWORKLOADTRACEFILE            workload
SIMRESOURCETRACEFILE            resource
SIMAUTOSHUTDOWN                 OFF
SIMSTARTTIME                    0
SIMSCALEJOBRUNTIME              FALSE
SIMFLAGS
SIMJOBSUBMISSIONPOLICY          CONSTANTJOBDEPTH
SIMINITIALQUEUEDEPTH            16
SIMWCACCURACY                   0.00
SIMWCACCURACYCHANGE             0.00
SIMNODECOUNT                    0
SIMNODECONFIGURATION            NORMAL
SIMWCSCALINGPERCENT             100
SIMCOMRATE                      0.10
SIMCOMTYPE                      ROUNDROBIN
COMINTRAFRAMECOST               0.30
COMINTERFRAMECOST               0.30
SIMSTOPITERATION                -1
SIMEXITITERATION                -1




More information about the mauiusers mailing list