[Mauiusers] Strange queue/scheduler issue

James A. Peltier jpeltier at sfu.ca
Mon Nov 7 18:26:16 MST 2011


----- Original Message -----
| Not sure if this is the correct forum for this, but
| we have a 320-core grid running Maui and Torque. Three queues are
| set up, with two nodes (24 cores) for one of them, another with two
| nodes exclusively, and three nodes sharing the default queue.
| When someone submits, say, 4000 jobs to the default queue, no one can
| submit any jobs to either of the other queues. They just sit in Q
| status. This started about three days ago and the users are in a
| total uproar about it.
| 
| Any thoughts on where to find the bottleneck, or on a config setting
| to check, would be helpful.
| 
| -I


I think you are looking for this.

http://www.adaptivecomputing.com/resources/docs/maui/a.ddevelopment.php

Specifically...

Value  : MMAX_JOB
File   : moab.h
Default: 4096

maximum total number of simultaneous idle/active jobs allowed.

NOTE: on some releases of Maui, MAX_MJOB may also need to be set and synchronized with MMAX_JOB.

Maui cannot evaluate more than 4096 jobs unless you recompile it with a larger value.  With roughly 4000 jobs sitting in the default queue you are probably at that limit, which would explain why jobs in the other queues stay in Q status.  This needs to be tweaked on larger clusters.

Change it to something like 32768 if you have a really large cluster.  Keep in mind that a higher job count slows the scheduler's job eligibility evaluations.
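
Before recompiling, it may be worth confirming that the total job count really is close to the limit. Below is a minimal sketch, assuming the default plain `qstat` column layout (job state in the second-to-last column, queue name in the last). The function names, the 90% margin and the sample output are made up for illustration and are not part of Maui or Torque.

```python
from collections import Counter

MMAX_JOB = 4096  # Maui's compiled-in default, per the docs quoted above


def count_jobs(qstat_text):
    """Count jobs per (queue, state) from plain `qstat` output.

    Assumes the default layout: a column header, a dashed separator, then
    one job per line with the state second-to-last and the queue last.
    """
    counts = Counter()
    for line in qstat_text.splitlines():
        fields = line.split()
        # Skip blank lines, the column header and the dashed separator.
        if len(fields) < 6 or fields[0].startswith("-") or fields[0] == "Job":
            continue
        counts[(fields[-1], fields[-2])] += 1
    return counts


def near_limit(counts, limit=MMAX_JOB, margin=0.9):
    """True when the total job count is at or above margin * limit."""
    return sum(counts.values()) >= limit * margin


# Made-up sample output, for illustration only.
SAMPLE = """\
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
101.head                  sim_a            alice           01:02:03 R default
102.head                  sim_b            alice                  0 Q default
103.head                  prep             bob                    0 Q special
"""

if __name__ == "__main__":
    counts = count_jobs(SAMPLE)
    print(counts)
    print("near limit:", near_limit(counts))
```

On a real system you would pass in the text returned by something like `subprocess.run(["qstat"], capture_output=True, text=True).stdout` instead of the sample. If the total is near 4096 while jobs in the other queues stay queued, that points at the MMAX_JOB limit.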


-- 
James A. Peltier
IT Services - Research Computing Group
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax     : 778-782-3045
E-Mail  : jpeltier at sfu.ca
Website : http://www.sfu.ca/itservices
          http://blogs.sfu.ca/people/jpeltier
I will do the best I can with the talent I have


