[Mauiusers] Interactive job preemption

Corey Ferrier coreyf at CLEMSON.EDU
Mon Jul 14 08:00:55 MDT 2008


On Sun, Jul 13, 2008 at 08:47:14PM -0700, James A. Peltier wrote:
>On Fri, 11 Jul 2008, Corey Ferrier wrote:
>
>>
>>PREMPTOR is a typo in your QOSCFG[interactive] definition?
>>
>>Perhaps put the PRIORITY on the QOSCFG line?
>>
>> QOSCFG[interactive] QFLAGS=PREEMPTOR  PRIORITY=10000
>> QOSCFG[batch]       QFLAGS=PREEMPTEE  PRIORITY=1
>>
>>- Corey
>>
>
>I corrected the spelling mistake, but still no go.
>
>
>Full maui.cfg
>
>SERVERHOST            queen.fas.sfu.ca
>
># primary admin must be first in list
>ADMIN1                root
>ADMIN3                ALL
>
># Resource Manager Definition
>
>RMCFG[BASE] TYPE=PBS
>AMCFG[bank]  TYPE=NONE
>
># full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html
># use the 'schedctl -l' command to display current configuration
>JOBAGGREGATIONTIME    00:00:04
>RMPOLLINTERVAL        00:00:30
>
>SERVERPORT            42559
>SERVERMODE            NORMAL
>
># Admin: http://supercluster.org/mauidocs/a.esecurity.html
>
>LOGFILE               maui.log
>LOGFILEMAXSIZE        10000000
>LOGLEVEL              3
>
># Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html
>
>QUEUETIMEWEIGHT       1
>
>QOSWEIGHT 1
>QOSCFG[interactive] QFLAGS=PREEMPTOR PRIORITY=100000
>QOSCFG[batch] QFLAGS=PREEMPTEE PRIORITY=1
>
>CLASSWEIGHT 1
>CLASSCFG[interactive] QDEF=interactive
>CLASSCFG[batch] QDEF=batch
>
># FairShare: http://supercluster.org/mauidocs/6.3fairshare.html
>
>FSPOLICY              DEDICATEDPS
>FSDEPTH               2
>FSINTERVAL            24:00:00
>FSQOSWEIGHT           2
>FSDECAY               0.80
>
># Purge job information.  Keep for 28 days
>JOBPURGETIME 28:00:00:00
>
># Throttling Policies: http://supercluster.org/mauidocs/6.2throttlingpolicies.html
>
># NONE SPECIFIED
>
># Backfill: http://supercluster.org/mauidocs/8.2backfill.html
>
>BACKFILLPOLICY        FIRSTFIT
>RESERVATIONPOLICY     CURRENTHIGHEST
>
># Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html
>
>NODEALLOCATIONPOLICY  MINRESOURCE
>
># Allow users to specify multiple requirements for jobs
># resource specifications such as '-l nodes=3:fast+1:io'
>ENABLEMULTIREQJOBS   TRUE
>
># Job Preemption
># specifies how preemptible jobs will be preempted
># available options are REQUEUE, SUSPEND, CHECKPOINT
>PREEMPTPOLICY SUSPEND
>
># How should Maui handle jobs that use more resources
># than they requested?
>RESOURCELIMITPOLICY MEM:EXTENDEDVIOLATION:CANCEL
>
># Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html
>
>USERCFG[DEFAULT]      FSTARGET=25.0
>GROUPCFG[DEFAULT]     FSTARGET=25.0
>
>#SRCFG[interactive] PERIOD=DAY DAYS=MON,TUE,WED,THU,FRI
>#SRCFG[interactive] STARTTIME=7:00:00 ENDTIME=19:00:00
>#SRCFG[interactive] CLASSLIST=interactive HOSTLIST=ilhpc01,ilhpc02
>#SRCFG[interactive] RESOURCES=PROCS:4,MEM:8g TASKCOUNT=8
>
>
>qsub -I -l nodes=linear-b,ncpus=8 -q interactive
>
>linear-b is a node I am using for testing.  It has a job running on it
>that should be preempted by this job, since I specified the interactive
>queue.  It is an 8-core, 16 GB node.
>
>[user at queen ~]$ sudo qstat -f 11042
>Job Id: 11042.queen.fas.sfu.ca
>    Job_Name = STDIN
>    Job_Owner = user at queen.fas.sfu.ca
>    job_state = Q
>    queue = interactive
>    server = queen.fas.sfu.ca
>    Checkpoint = u
>    ctime = Sun Jul 13 20:35:43 2008
>    Error_Path = queen.fas.sfu.ca:/home/fas/user/STDIN.e11042
>    Hold_Types = n
>    interactive = True
>    Join_Path = n
>    Keep_Files = n
>    Mail_Points = a
>    mtime = Sun Jul 13 20:35:43 2008
>    Output_Path = queen.fas.sfu.ca:/home/fas/user/STDIN.o11042
>    Priority = 0
>    qtime = Sun Jul 13 20:35:43 2008
>    Rerunable = False
>    Resource_List.ncpus = 8
>    Resource_List.neednodes = linear-b
>    Resource_List.nodect = 1
>    Resource_List.nodes = linear-b
>    substate = 10
>    Variable_List = PBS_O_HOME=/home/fas/user,PBS_O_LANG=en_US.UTF-8,
>        PBS_O_LOGNAME=user,
>        PBS_O_PATH=/usr/local-linux/bin:/usr/lib/qt-3.3/bin:/usr/kerberos/bin
>        :/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/bin,
>        PBS_O_MAIL=/var/spool/mail/user,PBS_O_SHELL=/bin/tcsh,
>        PBS_SERVER=queen.fas.sfu.ca,PBS_O_HOST=queen.fas.sfu.ca,
>        PBS_O_WORKDIR=/home/fas/user,PBS_O_QUEUE=interactive
>    euser = user
>    egroup = wheel
>    queue_rank = 10440
>    queue_type = E
>    etime = Sun Jul 13 20:35:43 2008
>    submit_args = -I -l nodes=linear-b,ncpus=8 -q interactive
>
>[user at queen ~]$ tracejob -v 11042
>
>Job: 11042.queen.fas.sfu.ca
>
>07/13/2008 20:35:43  S    enqueuing into interactive, state 1 hop 1
>07/13/2008 20:35:43  S    Job Queued at request of user at queen.fas.sfu.ca, owner = user at queen.fas.sfu.ca, job name = STDIN, queue = interactive
>
>
>Any more ideas?
>

Your setup is quite similar to mine.

I have turned on CREDWEIGHT in my maui.cfg:

  CREDWEIGHT 1

I have also given QOSWEIGHT more emphasis:

  QOSWEIGHT  10

Perhaps that might help?
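
As a rough way to confirm which of the settings discussed in this thread
are actually present (and not commented out) in a maui.cfg, here is a
minimal sketch of a shell helper. The function name check_preempt_cfg and
the example path in the usage comment are hypothetical, not part of Maui;
the helper only pattern-matches the file and says nothing about what the
running scheduler has loaded.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Maui): report which of the
# preemption-related settings from this thread appear uncommented
# in the given maui.cfg file.
check_preempt_cfg() {
    # Drop comments so commented-out lines are not counted.
    active=$(sed 's/#.*$//' "$1") || return 1

    # _check_line LABEL REGEX: print "set" or "missing" for one setting.
    _check_line() {
        if printf '%s\n' "$active" | grep -Eq "$2"; then
            echo "set: $1"
        else
            echo "missing: $1"
        fi
    }

    _check_line CREDWEIGHT    '^[[:space:]]*CREDWEIGHT[[:space:]]+[0-9]'
    _check_line QOSWEIGHT     '^[[:space:]]*QOSWEIGHT[[:space:]]+[0-9]'
    _check_line PREEMPTOR     '^[[:space:]]*QOSCFG\[[^]]+].*QFLAGS=[^[:space:]]*PREEMPTOR'
    _check_line PREEMPTEE     '^[[:space:]]*QOSCFG\[[^]]+].*QFLAGS=[^[:space:]]*PREEMPTEE'
    _check_line PREEMPTPOLICY '^[[:space:]]*PREEMPTPOLICY[[:space:]]+[A-Z]'
}

# Usage (the path is an assumption; use wherever your maui.cfg lives):
#   check_preempt_cfg /usr/local/maui/maui.cfg
```

Run against the maui.cfg quoted above, this should report CREDWEIGHT as
missing and the other four as set.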

The Maui 'showconfig' command displays some of the
settings that are currently active:
  
   # showconfig | grep -i weight

The Maui 'checkjob' command gives a bit more detail
on what Maui currently has set for that job,
including its priority:

   # checkjob 11042
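
The two commands above can be run together from one small wrapper; the
following is a minimal sketch, assuming the Maui client commands are on
your PATH. The function name job_priority_report is hypothetical. The
'diagnose -p' call is an addition that is not mentioned above: it assumes
your Maui installation provides it for a per-job priority breakdown, so
drop that line if yours does not.

```shell
#!/bin/sh
# Hypothetical wrapper: show the active weight settings and the
# scheduler's view of one job, using standard Maui client commands.
job_priority_report() {
    jobid=$1

    # Bail out early if any of the client commands is unavailable.
    for cmd in showconfig checkjob diagnose; do
        command -v "$cmd" >/dev/null 2>&1 || {
            echo "maui client command not found: $cmd" >&2
            return 127
        }
    done

    echo "== active weights =="
    showconfig | grep -i weight

    echo "== job $jobid =="
    checkjob "$jobid"

    # Assumption: 'diagnose -p' is available in this Maui version.
    echo "== priority breakdown =="
    diagnose -p
}

# Usage:
#   job_priority_report 11042
```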

- Corey

-- 
Corey Ferrier                               coreyf at clemson.edu
HPC Group, CCIT, Clemson University               864-656-2790
340 Computer Court, Anderson, SC, USA 29625         

