[torqueusers] scheduling jobs with normal and accelerated footprint

Toon Huysmans toon.huysmans at uantwerpen.be
Thu Mar 6 07:10:08 MST 2014

Dear torque users,

I am currently investigating the best approach to job scheduling with 
accelerators. One particular issue that I am confronted with is the relatively 
small number of accelerators (2xGPU) versus a large number of cores (32) that 
are available in a typical accelerated cluster node. To make efficient use of 
all available resources it is necessary to schedule part of the workload in 
accelerated mode and another part using only CPU resources. In the ideal 
situation, where I have both accelerated and CPU-only versions of the tasks, 
it should be the scheduler that decides whether or not to use acceleration for 
a specific task, based on the available resources and the other jobs in the 
queue. Unfortunately, this kind of automated selection is not possible with 
current workload managers/schedulers. I came up with an approach that tries to 
simulate such behavior to some extent:

For each GPU in a cluster node, instantiate a separate pbs_mom that is 
assigned only a single core and a single GPU of the system, and attach a node 
feature 'GPU' to it. All the remaining CPU cores are assigned to the main 
pbs_mom instance, which of course does not have the 'GPU' node feature. Now, 
submit the job with a node preference for the 'GPU' feature. When the job 
starts, check whether the node has the GPU feature: if so, the accelerated 
version is executed; otherwise, the scheduler could not find a free node with 
this feature and the CPU-only version is executed.
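For concreteness, the split into one main mom plus one single-core mom per GPU 
could be declared in the server's nodes file along these lines (a sketch only: 
the host names, port numbers, and multi-mom port attributes are assumptions to 
adapt to your site, following Torque's multi-MOM support):

```
# server_priv/nodes -- hypothetical layout for one 32-core, 2-GPU node
node01       np=30
node01-gpu0  np=1 mom_service_port=30001 mom_manager_port=30002 GPU
node01-gpu1  np=1 mom_service_port=30003 mom_manager_port=30004 GPU
```

A job would then request the feature with e.g. `qsub -l nodes=1:GPU`. Note 
that in plain Torque this is a hard requirement rather than a preference, so 
soft-preference semantics would have to come from the scheduler itself (e.g. 
Moab/Maui node sets).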
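The runtime check in the last step might look like the following job-script 
fragment (a sketch: the binary names and the sample pbsnodes output are 
hypothetical; a real script would query the node it actually landed on):

```shell
#!/bin/sh
# Sketch of the runtime check from the approach above. In a real job
# script, node_info would come from the queried node state, e.g.:
#   node_info=$(pbsnodes "$HOSTNAME")
# The binary names and the sample output below are hypothetical.

# Succeeds if the pbsnodes-style output in $1 lists the 'GPU' property.
has_gpu_feature() {
    printf '%s\n' "$1" | grep -q '^ *properties = .*GPU'
}

# Fabricated sample of what pbsnodes prints for the single-core GPU mom:
node_info='node01-gpu0
     state = free
     np = 1
     properties = GPU'

if has_gpu_feature "$node_info"; then
    choice=accelerated
    echo "node has GPU feature: running ./task_gpu"   # hypothetical binary
else
    choice=cpu-only
    echo "no GPU feature: running ./task_cpu"         # hypothetical binary
fi
```

The same test could of course also be done against an environment variable set 
in the GPU moms' config files instead of querying the server.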

Although this approach should work, it has some significant drawbacks:

- you can only specify a single wall time, while the accelerated version's 
wall time is much shorter than that of the CPU-only version. Scheduling will 
therefore be suboptimal.

- it is not flexible with respect to allocating different numbers of cores and 
GPUs. Once a job is allocated to a GPU node, it consumes all resources on that 
node, since there is no accounting for individual GPU cards, only for the 
'GPU' feature of the node.

Does anyone have a better solution for what I would like to achieve? Many 
thanks!

Kind regards,

Toon Huysmans.

Postdoctoral Researcher
Tel: +32 (0) 3 265 24 72
Fax: +32 (0) 3 265 22 45
Email: toon.huysmans at uantwerpen.be
Web: visielab.ua.ac.be
Postal address:
iMinds-Vision Lab, Department of Physics
University of Antwerp (CDE)
Universiteitsplein 1 (N1.19)
B-2610 Antwerp, Belgium
