[torqueusers] Routing MPI jobs based on the nb of requested cores per node

Raphael Leplae raphael.leplae at ulb.ac.be
Thu Apr 25 03:22:17 MDT 2013

Dear all,

We have a heterogeneous cluster with a set of old nodes on Infiniband 
DDR and another set of newer nodes on Infiniband QDR, with no IB link 
between the two sets of nodes.
The two sets of nodes have been "defined" with the properties 'ddr' and 
'qdr' respectively.

Since I am using a routing queue, I would like to route MPI jobs 
asking for 4 or fewer cores per node to the old DDR nodes, and those 
asking for more than 4 cores per node to the newer QDR nodes.
Additional constraint: a maximum of 10 nodes per MPI job on the DDR nodes.

A queue mpi_ddr could be created as follows:
create queue mpi_ddr
set queue mpi_ddr queue_type = Execution
set queue mpi_ddr resources_max.nodect = 10 -> 10-node constraint
set queue mpi_ddr resources_min.nodect = 2 -> MPI jobs require at least 
2 nodes
set queue mpi_ddr resources_default.mem = 2gb
set queue mpi_ddr resources_default.neednodes = ddr -> target the old nodes
set queue mpi_ddr enabled = True
set queue mpi_ddr started = True
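
The counterpart mpi_qdr queue would look similar (a sketch only; the 
limits below are assumptions to be adapted, not tested settings):

```
create queue mpi_qdr
set queue mpi_qdr queue_type = Execution
set queue mpi_qdr resources_min.nodect = 2
set queue mpi_qdr resources_default.mem = 2gb
set queue mpi_qdr resources_default.neednodes = qdr
set queue mpi_qdr enabled = True
set queue mpi_qdr started = True
```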

This queue will come first in the routing queue's destination list:
set queue submission route_destinations = mpi_ddr
set queue submission route_destinations += mpi_qdr
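
For reference, the routing queue itself is defined along these lines (an 
assumed definition for illustration; only the route_destinations lines 
above are from our actual config):

```
create queue submission
set queue submission queue_type = Route
set queue submission route_destinations = mpi_ddr
set queue submission route_destinations += mpi_qdr
set queue submission enabled = True
set queue submission started = True
```

Torque tries the destinations in the order listed, so a job that fits 
mpi_ddr's limits should never reach mpi_qdr.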

Here are some points/questions:

1) Reading the documentation, it is unclear whether the resource 
limitation:

set queue mpi_ddr resources_max.ncpus = 4

will be applied on a per-node basis or to the total number of cores 
requested by the MPI job. Since the separate resource procct exists, I 
assumed a per-node constraint.

2) I tested this setup but it doesn't work:

qsub -l nodes=4:ppn=6

will place the job in the mpi_ddr queue despite the resources_max.ncpus = 4!
Question: which queue attribute limit should be applied to force MPI jobs 
asking for 4 or fewer cores per node to be placed in the mpi_ddr queue?

3) I also tested the following limitation (10 nodes * 4 cores):

set queue mpi_ddr resources_max.procct = 40

although a job asking for 5 nodes with 8 cores each would be accepted as 
well.
I also noticed that this limitation doesn't show up in qmgr when printing 
the server config ('print server' command).
A bug, I guess.

4) If it is not possible to perform this type of routing in Torque, does 
someone have a proposal for implementing it in Moab?
In that case, there would be a single queue for MPI jobs in Torque, and 
node allocation would be done in Moab.
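
As a possible workaround on the Torque side, a qsub submit filter (the 
SUBMITFILTER parameter in torque.cfg) could inspect the -l nodes=X:ppn=Y 
request and inject the target queue itself. The sketch below only 
illustrates that idea; the regex, the choose_queue() helper, and the 
thresholds are my own assumptions, not a tested filter:

```python
#!/usr/bin/env python
# Sketch of a qsub submit filter. A real filter reads the job script
# on stdin and writes the (possibly modified) script to stdout.
import re

def choose_queue(nodes, ppn):
    """Routing rule from this thread: <= 4 cores per node and
    2-10 nodes go to mpi_ddr, everything else to mpi_qdr."""
    if ppn <= 4 and 2 <= nodes <= 10:
        return "mpi_ddr"
    return "mpi_qdr"

def filter_script(lines):
    """Scan for '#PBS -l nodes=X:ppn=Y' and inject a '#PBS -q' line."""
    queue = None
    for line in lines:
        m = re.search(r"#PBS\s+-l\s+nodes=(\d+):ppn=(\d+)", line)
        if m:
            queue = choose_queue(int(m.group(1)), int(m.group(2)))
    out = list(lines)
    if queue:
        out.insert(1, "#PBS -q %s\n" % queue)  # just after the shebang
    return out

if __name__ == "__main__":
    sample = ["#!/bin/bash\n", "#PBS -l nodes=4:ppn=6\n", "mpirun ./app\n"]
    print("".join(filter_script(sample)))
```

A real deployment would also have to handle requests given on the qsub 
command line and other node specifications (e.g. -l procs).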

We use Torque 4.1.4 and Moab 6.0.3.




Raphael Leplae, Ph.D.
Operations manager             Tel: +32 2 650 3727
Computing Center               Fax: +32 2 650 3740
Avenue A. Buyl, 91 - CP 197
1050 Brussels
