[torqueusers] queue routing based on mem resource not working properly...

Garrick Staples garrick at usc.edu
Wed Jul 28 11:48:33 MDT 2010


On Wed, Jul 28, 2010 at 02:17:29PM +0200, Lech Nieroda alleged:
> Dear list,
> 
> we have a cluster with 3 groups of machines - some have 24GB, some
> have 48GB, another group has 96GB. Our Maui version is 3.2.6p21.
> The general idea is to keep the larger nodes free for jobs that
> actually need that much RAM and thus route jobs with >48GB
> automatically to the 96GB nodes, >24GB to the 48GB nodes and the rest
> to the 24GB nodes.
> 
> How this was implemented:
> - each node has been given a "property" according to its available
> memory, i.e. ram96gb, ram48gb, ram24gb
> - there is a queue for each memory size, with appropriate "neednodes"
> and "resources_min.mem" statements, i.e.
>    set queue qram48gb resources_default.neednodes = ram48gb
>    set queue qram48gb resources_min.mem = 24gb
> - finally, there is a routing queue, which routes the jobs, i.e.
>     set queue default queue_type = Route
>     set queue default route_destinations = qram96gb
>     set queue default route_destinations += qram48gb
>     set queue default route_destinations += qram24gb
> 
> However, this isn't working properly - since the jobs are routed
> according to their total mem requirement and not the per node value,
> for example a job with "-l nodes=2:ppn=4,mem=50gb" would require 25gb
> per node but it is routed to qram96gb since 50gb>48gb. Supplying more
> resource limits in the queue setup, like pmem and pvmem, doesn't change
> this behavior - the jobs are still routed to the larger nodes even
> though smaller ones would suffice.
> 
> Any ideas, experiences with such routing?

I think you want to order your queues the other way around (smallest
first), using resources_max.mem.
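
A minimal sketch of what that reordering might look like, written as a toy
Python model rather than real qmgr input. The queue names come from the
thread; the cap values (24gb and 48gb, with no cap on the largest queue) and
the function name route are assumptions for illustration. It models a routing
queue that hands a job to the first destination whose resources_max.mem is
not exceeded.

```python
# Toy model of a routing queue: a job goes to the first destination
# whose resources_max.mem cap (if any) is not exceeded.
# Queue names are from the thread; the cap values are assumed.

ROUTE_DESTINATIONS = [
    ("qram24gb", 24),    # assumed: resources_max.mem = 24gb
    ("qram48gb", 48),    # assumed: resources_max.mem = 48gb
    ("qram96gb", None),  # assumed: no cap, catches everything larger
]


def route(job_mem_gb):
    """Return the first queue accepting a job that requests job_mem_gb in total."""
    for name, max_mem_gb in ROUTE_DESTINATIONS:
        if max_mem_gb is None or job_mem_gb <= max_mem_gb:
            return name
    return None  # no destination accepts the job


if __name__ == "__main__":
    for mem in (8, 30, 50):
        print(mem, "GB ->", route(mem))
```

Note that this model compares the job's total mem request, as the original
message describes, so a job with nodes=2 and mem=50gb would still land in
qram96gb here. Reordering alone probably does not address the per-node
concern.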

Are you using Maui? You could just use the MINRESOURCE node allocation policy.
Or just order the nodes in your server_priv/nodes file the way you want them
allocated.
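
In Maui the policy is selected with the NODEALLOCATIONPOLICY parameter in
maui.cfg. The idea behind MINRESOURCE, choosing the nodes with the fewest
configured resources that still fit the job, can be sketched as a toy Python
model. This is not Maui's code: the node names, the free/busy flags, and the
function name pick_nodes are all made up for illustration.

```python
# Toy model of a "fewest configured resources first" node choice, in the
# spirit of Maui's MINRESOURCE policy. Node names and free/busy states
# are hypothetical.

NODES = [
    # (name, configured memory in GB, currently free?)
    ("n96a", 96, True),
    ("n48a", 48, True),
    ("n24a", 24, True),
    ("n48b", 48, True),
    ("n24b", 24, False),
]


def pick_nodes(count, mem_per_node_gb):
    """Pick `count` free nodes that fit, preferring the smallest configured memory."""
    fits = [n for n in NODES if n[2] and n[1] >= mem_per_node_gb]
    fits.sort(key=lambda n: n[1])  # smallest sufficient nodes first
    if len(fits) < count:
        return None  # not enough suitable nodes right now
    return [n[0] for n in fits[:count]]


if __name__ == "__main__":
    # Two nodes needing 25 GB each: the 48GB nodes are preferred over the 96GB one.
    print(pick_nodes(2, 25))
```

With a policy like this the larger nodes stay free as long as smaller ones
can hold the job, without any per-memory-size queues.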

I use the LASTAVAILABLE policy and order my largest nodes at the top of the
list.


-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

Life is Good!