[torqueusers] queue routing based on mem resource not working properly...

Lech Nieroda lnieroda at gmail.com
Wed Jul 28 06:17:29 MDT 2010


Dear list,

We have a cluster with three groups of machines: some have 24GB of
RAM, some have 48GB, and the rest have 96GB. Our Maui version is
3.2.6p21. The general idea is to keep the larger nodes free for jobs
that actually need that much RAM, and thus to route jobs needing >48GB
automatically to the 96GB nodes, jobs needing >24GB to the 48GB nodes,
and the rest to the 24GB nodes.

How this was implemented:
- each node has been given a "property" according to its available
memory, i.e. ram96gb, ram48gb or ram24gb
- there is a queue for each memory size, with appropriate "neednodes"
and "resources_min.mem" statements, e.g.
   set queue qram48gb resources_default.neednodes = ram48gb
   set queue qram48gb resources_min.mem = 24gb
- finally, there is a routing queue, which routes the jobs:
    set queue default queue_type = Route
    set queue default route_destinations = qram96gb
    set queue default route_destinations += qram48gb
    set queue default route_destinations += qram24gb

However, this isn't working properly: jobs are routed according to
their total mem requirement and not the per-node value. For example,
a job with "-l nodes=2:ppn=4,mem=50gb" would require only 25gb per
node, but it is routed to qram96gb since 50gb>48gb. Supplying more
resource limits in the queue setup, such as pmem and pvmem, doesn't
change this behavior: the jobs are still routed to the larger nodes
even though smaller ones would suffice.
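
To make the discrepancy concrete, here is a minimal Python sketch of
the two routing rules. It is only a model of the behavior described
above, not TORQUE code: the function names are hypothetical, the 48gb
minimum for qram96gb and the 0 minimum for qram24gb are assumed by
analogy with the qram48gb setting, and a request equal to the minimum
is assumed to be accepted.

```python
# Hypothetical model of the routing decision; not part of TORQUE or Maui.
# Destinations are tried in the order given by route_destinations.
# Minimums are in GB; the values for qram96gb and qram24gb are assumed.
QUEUES = [("qram96gb", 48), ("qram48gb", 24), ("qram24gb", 0)]


def route(mem_gb):
    """Return the first queue whose assumed resources_min.mem is met."""
    for name, min_mem_gb in QUEUES:
        if mem_gb >= min_mem_gb:
            return name
    raise ValueError("no queue accepts this request")


def route_by_total_mem(nodes, mem_gb):
    """Observed behavior: the job's total mem request is compared."""
    return route(mem_gb)


def route_by_per_node_mem(nodes, mem_gb):
    """Desired behavior: the mem request is divided across the nodes."""
    return route(mem_gb / nodes)


# The example job: -l nodes=2:ppn=4,mem=50gb
observed = route_by_total_mem(2, 50)
desired = route_by_per_node_mem(2, 50)
print(observed, desired)
```

With the example job, the first rule picks the 96GB queue and the
second picks the 48GB queue, which is the gap between what happens
and what we would like to happen.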

Any ideas, experiences with such routing?

Regards,
Lech

