[Mauiusers] Question on maui.cfg NODEALLOCATION policy
Brad Mecklenburg
bmecklenburg at colsa.com
Thu Feb 8 07:48:10 MST 2007
I have one 256-node cluster, but the hardware is split into two halves of
128 nodes each. Part A and Part B differ in number of processors, processor
speed, memory, and so on. Users specify which half of the cluster to run on
in the PBS submit script, based on the attributes set in the
server_priv/nodes file.
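To make the setup concrete, here is a minimal sketch of what such a submit script could look like. The node names (a001, b001), the property names (parta, partb), and the np value for Part B are placeholders for illustration, not the real values from this cluster; only the 4 CPUs per Part A node comes from the description above.

```shell
#!/bin/sh
# Hypothetical server_priv/nodes entries (names and properties are placeholders):
#   a001 np=4 parta
#   b001 np=2 partb     <- np for Part B is an assumed value
#
# Request 32 nodes with 2 processors each, restricted to nodes tagged "parta".
#PBS -N example
#PBS -l nodes=32:ppn=2:parta

# Outside of PBS these variables are unset, so fall back to harmless defaults.
cd "${PBS_O_WORKDIR:-.}"

# PBS_NODEFILE has one line per allocated processor slot; counting duplicates
# shows how many slots the job received on each node.
slots=$(sort "${PBS_NODEFILE:-/dev/null}" | uniq -c)
printf '%s\n' "$slots"
```

Comparing the per-node slot counts from two jobs is one way to see whether they were packed onto the same nodes or spread out.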
I'm having a little trouble with the NODEALLOCATIONPOLICY setting, I think.
When I set it to MINRESOURCE, jobs can run on Part A, but each node in this
half has 4 CPUs, so when a user submits, say, 32-node jobs, the same nodes
get reused until all 4 CPUs on them are busy, which is not what I want. Set
up this way, jobs on Part B start immediately and complete fine. So I changed
NODEALLOCATIONPOLICY to CPULOAD. Now Part A behaves how I want: if a user
submits two 32-node jobs with 2 processors per node, the jobs are split up
across nodes. But I have a problem running on Part B. When I submit a job to
this side of the cluster, it stays in the QUEUED state in qstat. The maui log
says something like 256 feasible tasks found (running a 128-node ppn=2 job),
but the next line says something like "inadequate tasks found for job
whatever 2< 256". However, if I qrun the job number, the job runs fine and
completes successfully.
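For reference, the 256 in that log line matches the size of the request, assuming Maui counts one task per requested processor (an assumption here, not something confirmed from the Maui documentation):

```shell
# 128 nodes at ppn=2, as in the job described above
nodes=128
ppn=2
tasks=$((nodes * ppn))
echo "requested tasks: $tasks"
```

Under that assumption the job needs 256 tasks, and the "2< 256" in the second message would mean only 2 tasks survived whatever filtering happens between the feasibility check and allocation.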
I guess I have a couple of questions about the job staying in the queue.
Why does it stay queued when NODEALLOCATIONPOLICY is set to CPULOAD? There
is only one job running on this half of the cluster. Should I set
NODEALLOCATIONPOLICY to another value for this to work properly?
My maui.cfg is below. If anyone sees anything that would cause this type of
behavior, please let me know. I have not made many changes to it; it is
pretty much just a basic setup. Any info would be appreciated. Thanks.
# maui.cfg 3.2.6p18
SERVERHOST marvin
# primary admin must be first in list
ADMIN1 root brad jbennett cpd
# Resource Manager Definition
RMCFG[marvin] TYPE=PBS
RMCFG[marvin] TIMEOUT=30
JOBAGGREGATIONTIME 00:00:10
# Allocation Manager Definition
AMCFG[bank] TYPE=NONE
# full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html
# use the 'schedctl -l' command to display current configuration
RMPOLLINTERVAL 00:01:00
SERVERPORT 42559
SERVERMODE NORMAL
# Admin: http://supercluster.org/mauidocs/a.esecurity.html
LOGFILE maui.log
LOGFILEMAXSIZE 10000000
LOGLEVEL 3
# Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html
QUEUETIMEWEIGHT 1
# FairShare: http://supercluster.org/mauidocs/6.3fairshare.html
#FSPOLICY PSDEDICATED
#FSDEPTH 7
#FSINTERVAL 86400
#FSDECAY 0.80
# Throttling Policies: http://supercluster.org/mauidocs/6.2throttlingpolicies.html
# NONE SPECIFIED
# Backfill: http://supercluster.org/mauidocs/8.2backfill.html
BACKFILLPOLICY FIRSTFIT
RESERVATIONPOLICY CURRENTHIGHEST
# Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html
NODEALLOCATIONPOLICY CPULOAD
# QOS: http://supercluster.org/mauidocs/7.3qos.html
# QOSCFG[hi] PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB
# QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE
# Standing Reservations: http://supercluster.org/mauidocs/7.1.3standingreservations.html
# SRSTARTTIME[test] 8:00:00
# SRENDTIME[test] 17:00:00
# SRDAYS[test] MON TUE WED THU FRI
# SRTASKCOUNT[test] 20
# SRMAXTIME[test] 0:30:00
# Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html
# USERCFG[DEFAULT] FSTARGET=25.0
# USERCFG[john] PRIORITY=100 FSTARGET=10.0-
# GROUPCFG[staff] PRIORITY=1000 QLIST=hi:low QDEF=hi
# CLASSCFG[batch] FLAGS=PREEMPTEE
# CLASSCFG[interactive] FLAGS=PREEMPTOR
## Additions made to the maui config file upon build. Brad
NODEAVAILABILITYPOLICY DEDICATED:SWAP
JOBNODEMATCHPOLICY EXACTNODE
NODEACCESSPOLICY SHARED
NODEMAXLOAD 3.5
DEFERTIME 0
0
LOGDIR /var/spool/maui/log
LOGFILEROLLDEPTH 10
STATDIR /var/spool/maui/stats
###test to help with running on otis nodes:
ENABLEMULTIREQJOBS TRUE
--
Brad Mecklenburg