[Mauiusers] FW: priority queue / suspend job

Jerry jerry.mersel at weizmann.ac.il
Mon Apr 14 02:13:59 MDT 2008


Hi:

  BTW, for those who read my description of the problem:
  if 2 jobs were running on node4, everything would work fine.

                       Regards,
                         Jerry



-----Original Message-----
From Jerry <mlmersel at mail.weizmann.ac.il>
Sent Sun 4/13/2008 9:49 PM
To mauiusers at supercluster.org
Subject priority queue / suspend job

Hi:

  I have set up 2 queues: a "normal" queue and a high
  priority queue.
  Of the 3 machines I am experimenting on, all 3 can receive
  jobs from the normal queue and 2 can receive jobs from the
  high priority queue. If there are not enough free cpus, a
  job from the normal queue should be suspended. Everything
  works fine when I'm working with 1 node, but when I request
  multiple nodes with nodes:ppn, the high priority job is not started.

  For example (workq is the normal queue, prio.q is the high priority queue).
  The high priority nodes have the property Jerry.

  pbsnodes gives:

  node1
     state = free
     np = 2
     properties = Jerry
     ntype = cluster
     status = opsys=linux,uname=Linux node1 2.6.9-55.ELsmp #1 SMP Fri Apr 20 16:36:54 EDT 2007 x86_64,sessions=6714 6734,nsessions=2,nusers=1,idletime=286409,totmem=5767200kb,availmem=5638060kb,physmem=3735592kb,ncpus=2,loadave=0.00,netload=1216146266,state=free,jobs=? 0,rectime=1208111732

node3
     state = free
     np = 4
     ntype = cluster
     jobs = 2/144.node4
     status = opsys=linux,uname=Linux node3 2.6.9-55.ELsmp #1 SMP Fri Apr 20 16:36:54 EDT 2007 x86_64,sessions=3756 5071 27814 27834 27854 27874 28339,nsessions=7,nusers=3,idletime=289593,totmem=5825352kb,availmem=5645936kb,physmem=12182352kb,ncpus=4,loadave=5.00,netload=694744571,state=free,jobs=144.node4,rectime=1208111732

node4
     state = free
     np = 2
     properties = Jerry
     ntype = cluster
     jobs = 0/169.node4
     status = opsys=linux,uname=Linux node4 2.6.9-55.ELsmp #1 SMP Fri Apr 20 16:36:54 EDT 2007 x86_64,sessions=498 2269 3785 4900 29262 29285,nsessions=6,nusers=3,idletime=36361,totmem=5767048kb,availmem=4722596kb,physmem=3735440kb,ncpus=2,loadave=1.00,netload=2458734191,state=free,jobs=169.node4,rectime=1208111729

When I give this command:

qsub -q prio.q -l nodes=2:ppn=2 ./t.sh

I expect the 1 job on node4 to get suspended so that the high priority job can run on node1 and node4, using 2 cpus on each
machine, but instead the new job just sits in the queue.
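
The slot arithmetic behind this expectation can be sketched roughly as follows. This is only an illustration, not how Maui itself decides: the node data is transcribed by hand from the pbsnodes listing above, the helper name nodes_that_fit is made up for this example, and it assumes that each entry in a node's jobs line occupies exactly one slot and that prio.q jobs may only use nodes with the property Jerry.

```python
# Node data transcribed by hand from the pbsnodes listing above.
NODES = {
    "node1": {"np": 2, "properties": {"Jerry"}, "jobs": []},
    "node3": {"np": 4, "properties": set(), "jobs": ["2/144.node4"]},
    "node4": {"np": 2, "properties": {"Jerry"}, "jobs": ["0/169.node4"]},
}


def nodes_that_fit(nodes, prop, ppn, suspend_running=False):
    """Return sorted names of nodes with property `prop` and >= `ppn` free slots.

    Assumption: each entry in "jobs" occupies exactly one slot.
    With suspend_running=True, running jobs are treated as suspended,
    i.e. their slots count as free.
    """
    fits = []
    for name, info in nodes.items():
        if prop not in info["properties"]:
            continue
        used = 0 if suspend_running else len(info["jobs"])
        if info["np"] - used >= ppn:
            fits.append(name)
    return sorted(fits)


# Request from the qsub line: nodes=2:ppn=2
print(nodes_that_fit(NODES, "Jerry", ppn=2))                        # ['node1']
print(nodes_that_fit(NODES, "Jerry", ppn=2, suspend_running=True))  # ['node1', 'node4']
```

Under these assumptions only node1 has 2 free slots as things stand, so the request for 2 nodes cannot be met unless job 169.node4 is suspended.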

Here is my maui configuration file:
#
# MAUI configuration example
# @(#)maui.cfg David Groep 20031015.1
# for MAUI version 3.2.5
#
SERVERHOST              node4
ADMIN1                  root
ADMINHOST               node4
#JOBNODEMATCHPOLICY      EXACTNODE
PREEMPTPOLICY SUSPEND
#RESERVATIONPOLICY    NEVER
ENABLEMULTINODEJOBS  TRUE

#
RMTYPE[0]           PBS
RMHOST[0]           node4
RMSERVER[0]         node4

SERVERPORT            40559
SERVERMODE            NORMAL

# Set PBS server polling interval. Since we have many short jobs
# and want fast turn-around, set this to 10 seconds (default: 2 minutes)
RMPOLLINTERVAL        00:00:10

# a max. 10 MByte log file in a logical location
LOGFILE               /var/log/maui.log
LOGFILEMAXSIZE        10000000
LOGLEVEL              3

#NODECFG[node4]   PARTITION=Jerry
#NODECFG[node1]   PARTITION=Jerry

CLASSCFG[DEFAULT]  QDEF=low
CLASSCFG[prio.q]   QDEF=high
QOSCFG[high]  PRIORITY=50000 QFLAGS=PREEMPTOR
QOSCFG[DEFAULT] QFLAGS=PREEMPTEE
QOSWEIGHT     1


I appreciate any advice anyone can give.


                            Thanks,
                              Jerry

  
  


