[Mauiusers] maui not scheduling jobs in available resources

Brian Christiansen bchristiansen at clusterresources.com
Tue Jun 2 15:21:03 MDT 2009


Roy and I found the cause of this: backfill was breaking under
NODEACCESSPOLICY SINGLEJOB. The following snapshot contains the fix.

http://www.clusterresources.com/download/maui/snapshots/maui-3.2.6p21-snap.1243977349.tar.gz

Thanks,
Brian

Roy Dragseth wrote:
> Hi, sorry for the late reply.
>
>
> My config is as follows, four compute nodes with np=2, two have the "gige" 
> feature and two have "ib". You'll find the config files below.
>
>
> Submit three jobs like this:
>
> echo sleep 1000 | qsub -lnodes=2:ppn=2:gige,walltime=3000
>
> Wait for the first job to start; the other two should stay queued.
>
> Then submit four jobs like this:
>
> echo sleep 1000 | qsub -lnodes=1,walltime=3000
>
> What you should then see is three of these jobs starting and one ending up
> queued, leaving one job slot unused:
>
>
> ACTIVE JOBS--------------------
> JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME
>
> 189                    royd    Running     4    00:43:43  Sat Mar 28 23:47:52
> 192                    royd    Running     1    00:45:21  Sat Mar 28 23:49:30
> 193                    royd    Running     1    00:45:52  Sat Mar 28 23:50:01
> 194                    royd    Running     1    00:45:52  Sat Mar 28 23:50:01
>
>      4 Active Jobs       7 of    8 Processors Active (87.50%)
>                          4 of    4 Nodes Active      (100.00%)
>
> IDLE JOBS----------------------
> JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME
>
> 190                    royd       Idle     4    00:50:00  Sat Mar 28 23:47:52
> 191                    royd       Idle     4    00:50:00  Sat Mar 28 23:47:53
> 195                    royd       Idle     1    00:50:00  Sat Mar 28 23:49:31
>
> 3 Idle Jobs
>
>
> This illustrates the behaviour we see on our production cluster without the 
> maui patch I submitted earlier.
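
The submission sequence quoted above can be scripted. The following is only a sketch and not part of the original report: the DRY_RUN variable and the function names submit, submit_wide and submit_small are hypothetical helpers introduced here. With DRY_RUN left at its default of 1, the qsub commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch of the reproduction steps; helper names and DRY_RUN are assumptions.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = really call qsub

submit() {
    # $1 is the resource list handed to qsub -l, exactly as in the report.
    if [ "$DRY_RUN" = "1" ]; then
        echo "echo sleep 1000 | qsub -l$1"
    else
        echo sleep 1000 | qsub -l"$1"
    fi
}

submit_wide() {
    # Three 2-node jobs restricted to the "gige" nodes.
    for i in 1 2 3; do
        submit "nodes=2:ppn=2:gige,walltime=3000"
    done
}

submit_small() {
    # Four single-processor jobs that should fill the remaining slots.
    for i in 1 2 3 4; do
        submit "nodes=1,walltime=3000"
    done
}
```

To follow the report, run submit_wide, wait until the first job is listed as Running, and then run submit_small.
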
>
>
> Here is the nodes file from my 4-node test cluster:
>
> compute-0-0 np=2 ib
> compute-0-1 np=2 ib
> compute-0-2 np=2 gige
> compute-0-3 np=2 gige
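
As a cross-check of the 8 processors reported in the job listing above, the np values in a nodes file of this form can be summed. This is a sketch only: total_procs is a hypothetical helper name, and it assumes the nodes file is supplied on standard input with one host per line.

```shell
#!/bin/sh
# Sketch: add up the np= values of a TORQUE-style nodes file read from stdin.
# total_procs is a hypothetical helper name, not a TORQUE or Maui command.
total_procs() {
    awk '{
        for (i = 2; i <= NF; i++)          # field 1 is the host name
            if ($i ~ /^np=[0-9]+$/) {
                split($i, kv, "=")
                total += kv[2]
            }
    }
    END { print total + 0 }'
}
```

Fed the four node lines listed above, the function prints 8, which matches the processor total in the listing.
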
>
> And here is the maui.cfg:
>
> RMPOLLINTERVAL          00:00:30
> JOBAGGREGATIONTIME      00:00:30
>
> SERVERHOST              hpc2.cc.uit.no
> SERVERPORT              42559
> SERVERMODE              NORMAL
> RMCFG[base]             TYPE=PBS
> ADMIN1                maui root
>
> LOGFILE               maui.log
> LOGFILEMAXSIZE        10000000
> LOGLEVEL              3
>
> BACKFILLPOLICY        FIRSTFIT
> RESERVATIONPOLICY     CURRENTHIGHEST
> NODEACCESSPOLICY      SINGLEUSER
>
>
> And here is the torque config, i.e. the output of qmgr -c "print server":
>
> $ qmgr -c "p s"
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue default
> #
> create queue default
> set queue default queue_type = Execution
> set queue default enabled = True
> set queue default started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server acl_host_enable = False
> set server acl_hosts = hpc2.cc.uit.no
> set server managers = maui@hpc2.cc.uit.no
> set server managers += root@hpc2.cc.uit.no
> set server default_queue = default
> set server log_events = 511
> set server mail_from = adm
> set server query_other_jobs = True
> set server scheduler_iteration = 600
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server next_job_number = 196
>