[Mauiusers] maui not schedulling jobs in avaliable resources
Brian Christiansen
bchristiansen at clusterresources.com
Tue Jun 2 15:21:03 MDT 2009
Roy and I found the cause of this: backfill was breaking on
NODEACCESSPOLICY SINGLEJOB. The following snapshot contains the fix:
http://www.clusterresources.com/download/maui/snapshots/maui-3.2.6p21-snap.1243977349.tar.gz
Thanks,
Brian
Roy Dragseth wrote:
> Hi, sorry for the late reply.
>
>
> My config is as follows: four compute nodes with np=2 each, of which two have
> the "gige" feature and two have "ib". You'll find the config files below.
>
>
> Submit three jobs like this:
>
> echo sleep 1000 | qsub -lnodes=2:ppn=2:gige,walltime=3000
>
> Wait for the first job to start; the other two should stay queued.
>
> Then submit four jobs like this:
>
> echo sleep 1000 | qsub -lnodes=1,walltime=3000
>
> What you should then see is three of these jobs starting and one staying
> queued, leaving one job slot unused:
>
>
> ACTIVE JOBS--------------------
> JOBNAME   USERNAME   STATE     PROC   REMAINING   STARTTIME
>
> 189       royd       Running   4      00:43:43    Sat Mar 28 23:47:52
> 192       royd       Running   1      00:45:21    Sat Mar 28 23:49:30
> 193       royd       Running   1      00:45:52    Sat Mar 28 23:50:01
> 194       royd       Running   1      00:45:52    Sat Mar 28 23:50:01
>
>      4 Active Jobs       7 of 8 Processors Active (87.50%)
>                          4 of 4 Nodes Active      (100.00%)
>
> IDLE JOBS----------------------
> JOBNAME   USERNAME   STATE     PROC   WCLIMIT     QUEUETIME
>
> 190       royd       Idle      4      00:50:00    Sat Mar 28 23:47:52
> 191       royd       Idle      4      00:50:00    Sat Mar 28 23:47:53
> 195       royd       Idle      1      00:50:00    Sat Mar 28 23:49:31
>
> 3 Idle Jobs
>
>
> This illustrates the behaviour we see on our production cluster without the
> maui patch I submitted earlier.
>
>
> Here is the nodes file from my 4 node test cluster:
>
> compute-0-0 np=2 ib
> compute-0-1 np=2 ib
> compute-0-2 np=2 gige
> compute-0-3 np=2 gige
>
> and here is the maui.cfg:
>
> RMPOLLINTERVAL 00:00:30
> JOBAGGREGATIONTIME 00:00:30
>
> SERVERHOST hpc2.cc.uit.no
> SERVERPORT 42559
> SERVERMODE NORMAL
> RMCFG[base] TYPE=PBS
> ADMIN1 maui root
>
> LOGFILE maui.log
> LOGFILEMAXSIZE 10000000
> LOGLEVEL 3
>
> BACKFILLPOLICY FIRSTFIT
> RESERVATIONPOLICY CURRENTHIGHEST
> NODEACCESSPOLICY SINGLEUSER
>
>
> And here is the TORQUE config, i.e. the output of qmgr -c "print server":
>
> $ qmgr -c "p s"
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue default
> #
> create queue default
> set queue default queue_type = Execution
> set queue default enabled = True
> set queue default started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server acl_host_enable = False
> set server acl_hosts = hpc2.cc.uit.no
> set server managers = maui@hpc2.cc.uit.no
> set server managers += root@hpc2.cc.uit.no
> set server default_queue = default
> set server log_events = 511
> set server mail_from = adm
> set server query_other_jobs = True
> set server scheduler_iteration = 600
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server next_job_number = 196
>
>
>
>
> _______________________________________________
> mauiusers mailing list
> mauiusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/mauiusers
>
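The reproduction steps in the quoted message can be collected into a small script. This is only a sketch: the qsub command lines are taken from the message above, while DRY_RUN, submit and reproduce are hypothetical names introduced here for illustration. With DRY_RUN=1 (the default in this sketch) the commands are printed rather than executed, so the script can be read through safely on a machine without TORQUE.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the quoted message.
# DRY_RUN is a hypothetical switch (not a TORQUE or Maui feature): when it is 1
# the qsub command line is printed instead of being executed.
DRY_RUN=${DRY_RUN:-1}

submit() {
    # $1 = resource list handed to qsub -l
    if [ "$DRY_RUN" = 1 ]; then
        echo "echo sleep 1000 | qsub -l$1"
    else
        echo sleep 1000 | qsub -l"$1"
    fi
}

reproduce() {
    # Three jobs that each need both gige nodes: one runs, two should queue.
    for i in 1 2 3; do
        submit "nodes=2:ppn=2:gige,walltime=3000"
    done
    # On a live cluster, pause here until the first job is Running (check showq).
    # Four single-processor jobs; the message reports only three of them starting.
    for i in 1 2 3 4; do
        submit "nodes=1,walltime=3000"
    done
}

reproduce
```

To submit for real, run the script with DRY_RUN=0 on the test cluster and add the wait between the two batches by hand.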
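The symptom reported in this thread (a processor left idle while jobs sit in the Idle state) can be checked from the showq summary line. The helper below is a sketch that assumes the summary has the form "N of M Processors Active", as in the output quoted above; idle_procs is a hypothetical name, not a Maui command.

```shell
#!/bin/sh
# idle_procs: read showq-style text on stdin and print the number of
# processors that are not active (total minus active).
# Assumes a summary line containing "<active> of <total> Processors Active".
idle_procs() {
    awk '/Processors Active/ {
        for (i = 2; i < NF - 1; i++)
            if ($i == "of" && $(i + 2) == "Processors") {
                print $(i + 1) - $(i - 1)
                exit
            }
    }'
}

# Example using the summary line from the quoted showq output.
echo "4 Active Jobs 7 of 8 Processors Active (87.50%)" | idle_procs
```

On a live system the same function could be fed directly, as in `showq | idle_procs`; a non-zero result while single-processor jobs remain Idle would match the behaviour described in this thread.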