[Mauiusers] maui limits? looking for experience

Michel Béland michel.beland at rqchp.qc.ca
Wed Sep 28 09:15:50 MDT 2011


Hi,

> we've been using torque/maui for a long time. Our initial cluster was
> about 50 nodes and now ~350 with 3k processors.
> 
> It has been working fine since last cluster upgrade, when we added
> last 500 processors. Since then, maui client commands hang and we had
> to increase poll interval cause scheduling cycle took too much... Now,
> with a system with 3k running jobs and 3k in queue, we're facing more
> maui issues...
> 
> So, we were wondering which are maui limits, if we have reached any of
> them and if anyone who already reached our limits could share his
> experience, on solving them, with us.
> 
> we're running maui-3.3-1.x86_64.

I would advise defining a limit on idle jobs per user. For example:

USERCFG[DEFAULT] MAXIJOB=200

or any suitable number for your site.

Alternatively, Torque has a per-queue max_user_queuable setting, but it 
counts both running and queued jobs. If you use a route queue to route 
jobs to an execution queue, you can define this limit on the execution 
queue, and a user's jobs will be moved to the execution queue only while 
that user is under the limit.
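
As a rough sketch of that second approach, something along these lines 
could be entered in qmgr. The queue names "route" and "batch" and the 
value 200 are only placeholders, I am assuming that the execution queue 
"batch" already exists, and I have not tested this exact sequence on 
your setup:

set queue batch max_user_queuable = 200
create queue route queue_type=route
set queue route route_destinations = batch
set queue route enabled = true
set queue route started = true
set server default_queue = route

Jobs submitted without an explicit queue would then land in "route" and 
wait there until the user has room in "batch".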

Both solutions should decrease the load on Maui, since it then has fewer 
jobs to consider in each scheduling cycle.

-- 
Michel Béland, analyste en calcul scientifique
michel.beland at calculquebec.ca
bureau S-250, pavillon Roger-Gaudry (principal), Université de Montréal
téléphone : 514 343-6111 poste 3892     télécopieur : 514 343-2155
Calcul Québec (www.calculquebec.ca)
Calcul Canada (calculcanada.org)

