[Mauiusers] maui limits? looking for experience
Michel Béland
michel.beland at rqchp.qc.ca
Wed Sep 28 09:15:50 MDT 2011
Hi,
> we've been using torque/maui for a long time. Our initial cluster was
> about 50 nodes and now ~350 with 3k processors.
>
> It has been working fine since last cluster upgrade, when we added
> last 500 processors. Since then, maui client commands hang and we had
> to increase poll interval cause scheduling cycle took too much... Now,
> with a system with 3k running jobs and 3k in queue, we're facing more
> maui issues...
>
> So, we were wondering which are maui limits, if we have reached any of
> them and if anyone who already reached our limits could share his
> experience, on solving them, with us.
>
> we're running maui-3.3-1.x86_64.
I would advise defining a limit on idle jobs per user. For example:
USERCFG[DEFAULT] MAXIJOB=200
or any other number suitable for your site.
Alternatively, Torque has a per-queue max_user_queuable setting, but it
counts both running and queued jobs. If you use a routing queue to pass
jobs on to an execution queue, you can set this limit on the execution
queue, and each job will be moved to the execution queue only once the
user is under the limit.
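
As a rough sketch of the routing-queue setup described above (the queue
names "route" and "batch" are hypothetical, and the value 200 simply
mirrors the MAXIJOB example; adjust both to your site), the qmgr
commands could look something like this:

```shell
# Assumption: "route" and "batch" are placeholder queue names.
# Execution queue that actually runs jobs, capped per user.
qmgr -c "create queue batch queue_type=execution"
qmgr -c "set queue batch max_user_queuable = 200"   # counts running + queued
qmgr -c "set queue batch enabled = true"
qmgr -c "set queue batch started = true"

# Routing queue that users submit to; it feeds the execution queue.
qmgr -c "create queue route queue_type=route"
qmgr -c "set queue route route_destinations = batch"
qmgr -c "set queue route enabled = true"
qmgr -c "set queue route started = true"

# Make the routing queue the default submission target.
qmgr -c "set server default_queue = route"
```

Jobs over the per-user limit then wait in the routing queue, where Maui
does not have to consider them.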
Both solutions should decrease the load on Maui as it does not need to
schedule as many jobs at a time.
--
Michel Béland, scientific computing analyst
michel.beland at calculquebec.ca
bureau S-250, pavillon Roger-Gaudry (principal), Université de Montréal
phone: 514 343-6111 ext. 3892 fax: 514 343-2155
Calcul Québec (www.calculquebec.ca)
Calcul Canada (calculcanada.org)