[Mauiusers] Interactive job preemption
Corey Ferrier
coreyf at CLEMSON.EDU
Mon Jul 14 08:00:55 MDT 2008
On Sun, Jul 13, 2008 at 08:47:14PM -0700, James A. Peltier wrote:
>On Fri, 11 Jul 2008, Corey Ferrier wrote:
>
>>
>>PREMPTOR is a typo in your QOSCFG[interactive] definition?
>>
>>Perhaps put the PRIORITY on the QOSCFG line?
>>
>> QOSCFG[interactive] QFLAGS=PREEMPTOR PRIORITY=10000
>> QOSCFG[batch] QFLAGS=PREEMPTEE PRIORITY=1
>>
>>- Corey
>>
>
>I corrected the spelling mistake, but still no go.
>
>
>Full maui.cfg
>
>SERVERHOST queen.fas.sfu.ca
>
># primary admin must be first in list
>ADMIN1 root
>ADMIN3 ALL
>
># Resource Manager Definition
>
>RMCFG[BASE] TYPE=PBS
>AMCFG[bank] TYPE=NONE
>
># full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html
># use the 'schedctl -l' command to display current configuration
>JOBAGGREGATIONTIME 00:00:04
>RMPOLLINTERVAL 00:00:30
>
>SERVERPORT 42559
>SERVERMODE NORMAL
>
># Admin: http://supercluster.org/mauidocs/a.esecurity.html
>
>LOGFILE maui.log
>LOGFILEMAXSIZE 10000000
>LOGLEVEL 3
>
># Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html
>
>QUEUETIMEWEIGHT 1
>
>QOSWEIGHT 1
>QOSCFG[interactive] QFLAGS=PREEMPTOR PRIORITY=100000
>QOSCFG[batch] QFLAGS=PREEMPTEE PRIORITY=1
>
>CLASSWEIGHT 1
>CLASSCFG[interactive] QDEF=interactive
>CLASSCFG[batch] QDEF=batch
>
># FairShare: http://supercluster.org/mauidocs/6.3fairshare.html
>
>FSPOLICY DEDICATEDPS
>FSDEPTH 2
>FSINTERVAL 24:00:00
>FSQOSWEIGHT 2
>FSDECAY 0.80
>
># Purge job information. Keep for 28 days
>JOBPURGETIME 28:00:00:00
>
># Throttling Policies: http://supercluster.org/mauidocs/6.2throttlingpolicies.html
>
># NONE SPECIFIED
>
># Backfill: http://supercluster.org/mauidocs/8.2backfill.html
>
>BACKFILLPOLICY FIRSTFIT
>RESERVATIONPOLICY CURRENTHIGHEST
>
># Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html
>
>NODEALLOCATIONPOLICY MINRESOURCE
>
># Allow users to specify multiple requirements for jobs
># resource specifications such as '-l nodes=3:fast+1:io'
>ENABLEMULTIREQJOBS TRUE
>
># Job Preemption
># specifies how preemptible jobs will be preempted
># available options are REQUEUE, SUSPEND, CHECKPOINT
>PREEMPTPOLICY SUSPEND
>
># How should maui handle jobs that utilize more resources
># than they requested.
>RESOURCELIMITPOLICY MEM:EXTENDEDVIOLATION:CANCEL
>
># Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html
>
>USERCFG[DEFAULT] FSTARGET=25.0
>GROUPCFG[DEFAULT] FSTARGET=25.0
>
>#SRCFG[interactive] PERIOD=DAY DAYS=MON,TUE,WED,THU,FRI
>#SRCFG[interactive] STARTTIME=7:00:00 ENDTIME=19:00:00
>#SRCFG[interactive] CLASSLIST=interactive HOSTLIST=ilhpc01,ilhpc02
>#SRCFG[interactive] RESOURCES=PROCS:4,MEM:8g TASKCOUNT=8
>
>
>qsub -I -l nodes=linear-b,ncpus=8 -q interactive
>
>linear-b is a node I am using for testing. It has a job running on it
>that should be preempted by this job, since I specified the interactive
>queue. It is an 8-core node with 16GB of memory.
>
>[user at queen ~]$ sudo qstat -f 11042
>Job Id: 11042.queen.fas.sfu.ca
> Job_Name = STDIN
> Job_Owner = user at queen.fas.sfu.ca
> job_state = Q
> queue = interactive
> server = queen.fas.sfu.ca
> Checkpoint = u
> ctime = Sun Jul 13 20:35:43 2008
> Error_Path = queen.fas.sfu.ca:/home/fas/user/STDIN.e11042
> Hold_Types = n
> interactive = True
> Join_Path = n
> Keep_Files = n
> Mail_Points = a
> mtime = Sun Jul 13 20:35:43 2008
> Output_Path = queen.fas.sfu.ca:/home/fas/user/STDIN.o11042
> Priority = 0
> qtime = Sun Jul 13 20:35:43 2008
> Rerunable = False
> Resource_List.ncpus = 8
> Resource_List.neednodes = linear-b
> Resource_List.nodect = 1
> Resource_List.nodes = linear-b
> substate = 10
> Variable_List = PBS_O_HOME=/home/fas/user,PBS_O_LANG=en_US.UTF-8,
> PBS_O_LOGNAME=user,
> PBS_O_PATH=/usr/local-linux/bin:/usr/lib/qt-3.3/bin:/usr/kerberos/bin
> :/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/bin,
> PBS_O_MAIL=/var/spool/mail/user,PBS_O_SHELL=/bin/tcsh,
> PBS_SERVER=queen.fas.sfu.ca,PBS_O_HOST=queen.fas.sfu.ca,
> PBS_O_WORKDIR=/home/fas/user,PBS_O_QUEUE=interactive
> euser = user
> egroup = wheel
> queue_rank = 10440
> queue_type = E
> etime = Sun Jul 13 20:35:43 2008
> submit_args = -I -l nodes=linear-b,ncpus=8 -q interactive
>
>[user at queen ~]$ tracejob -v 11042
>
>Job: 11042.queen.fas.sfu.ca
>
>07/13/2008 20:35:43 S enqueuing into interactive, state 1 hop 1
>07/13/2008 20:35:43 S Job Queued at request of user at queen.fas.sfu.ca, owner = user at queen.fas.sfu.ca, job name = STDIN, queue = interactive
>
>
>Any more ideas?
>
Your setup is quite similar to mine.

In my maui.cfg I have turned on CREDWEIGHT:

    CREDWEIGHT 1

I have also given QOSWEIGHT more emphasis than the 1 in your config:

    QOSWEIGHT 10

Perhaps that will help?
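
Put together, the relevant part of your maui.cfg might look something like the sketch below. It only combines lines already in your posted config with the two weights from mine. The values are not tuned recommendations, and I am assuming nothing else in the file overrides them.

```
# Priority weights (values taken from this thread, not tuned)
CREDWEIGHT            1
QOSWEIGHT             10

# QOS definitions, unchanged from your posted config
QOSCFG[interactive]   QFLAGS=PREEMPTOR PRIORITY=100000
QOSCFG[batch]         QFLAGS=PREEMPTEE PRIORITY=1

# Map each class (queue) to its default QOS, unchanged from your posted config
CLASSCFG[interactive] QDEF=interactive
CLASSCFG[batch]       QDEF=batch
```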

The maui 'showconfig' command displays some of the settings
that are currently active. To see just the weights:

    # showconfig | grep -i weight

The maui 'checkjob' command gives a bit more detail on what
maui currently has set for that job, including its priority:

    # checkjob 11042

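
If it is easier to check the file than the running scheduler, here is a small shell sketch that lists the uncommented *WEIGHT parameters in a maui.cfg-style file and tests whether a given parameter is set. The function names show_weights and has_param are made up for illustration, and the sketch assumes the simple "NAME VALUE" line layout seen in your posted config.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a maui.cfg-style file.
# Assumes one "NAME VALUE" pair per line and '#' for comments.

# show_weights FILE
# Print the name and value of every uncommented parameter whose
# name ends in WEIGHT (QOSWEIGHT, CLASSWEIGHT, and so on).
show_weights() {
    grep -v '^[[:space:]]*#' "$1" | awk '$1 ~ /WEIGHT$/ { print $1, $2 }'
}

# has_param FILE NAME
# Succeed (exit status 0) if NAME appears as an uncommented
# parameter in FILE, fail (exit status 1) otherwise.
has_param() {
    grep -v '^[[:space:]]*#' "$1" |
        awk -v name="$2" '$1 == name { found = 1 } END { exit !found }'
}
```

For example, after sourcing the functions, "show_weights maui.cfg" run against your posted file should list QUEUETIMEWEIGHT, QOSWEIGHT, CLASSWEIGHT and FSQOSWEIGHT, and "has_param maui.cfg CREDWEIGHT || echo missing" would report that CREDWEIGHT is not set there. This only reads the file, so showconfig is still the way to see what the running scheduler actually loaded.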
- Corey
--
Corey Ferrier coreyf at clemson.edu
HPC Group, CCIT, Clemson University 864-656-2790
340 Computer Court, Anderson, SC, USA 29625