[torqueusers] MPI jobs not tied to nodes/ppn configuration

rozelak at volny.cz rozelak at volny.cz
Thu Oct 15 12:40:22 MDT 2009


Hello,

I have access to heterogeneous clusters with many multi-core/multi-processor
nodes, where PBSPro is installed. When I want to start an MPI job, I need
to specify how many nodes and how many CPUs per node I want. E.g.,
when I require 32 MPI processes, I need to submit it as:

qsub -l nodes=16:ppn=2 ...

The problem is that PBS will wait until there are at least 16 nodes,
each with 2 cores free, even if more than 32 cores are free in total (e.g.
15 nodes with 2 free cores each plus 2 or more nodes with one free core,
giving 32+ free cores available). The same happens for any nodes/ppn
combination, e.g.:

qsub -l nodes=32:ppn=1 ...

will not be started on 31 nodes with 4+ free cores each (124 cores
free!). What I need is simply to say: I need XY cores/processors for
my job in the cluster, and I do not care how many nodes it is started
on or how many cores each node contributes.
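
To make the two examples above concrete, here is a minimal Python sketch of
the mismatch. It is only a simplified model of the behaviour described in
this message, not the actual PBSPro scheduling logic; the function names
fits_nodes_ppn and fits_total are hypothetical and exist only for this
illustration.

```python
def fits_nodes_ppn(free_cores, nodes, ppn):
    """True if at least `nodes` nodes each have `ppn` or more free cores.

    Simplified model of a nodes=N:ppn=P request (an assumption, not the
    real scheduler): every chunk must fit on a distinct node.
    """
    return sum(1 for free in free_cores if free >= ppn) >= nodes


def fits_total(free_cores, total):
    """True if the cluster has at least `total` free cores overall."""
    return sum(free_cores) >= total


# Example 1: 15 nodes with 2 free cores each, plus 2 nodes with 1 free core.
case_a = [2] * 15 + [1] * 2   # 32 free cores in total

# Example 2: 31 nodes with 4 free cores each.
case_b = [4] * 31             # 124 free cores in total

# nodes=16:ppn=2 does not fit, although 32 cores are free.
print(fits_nodes_ppn(case_a, 16, 2), fits_total(case_a, 32))  # False True

# nodes=32:ppn=1 does not fit, although 124 cores are free.
print(fits_nodes_ppn(case_b, 32, 1), fits_total(case_b, 32))  # False True
```

In both cases the per-node request fails while a request based only on the
total core count would succeed, which is the behaviour I am asking for.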


So, the question is: is Torque able to handle such cases? And how?
If so, I will discuss it with our cluster admins, as I remember
that they were considering a migration from PBS and they are open to our
(users') wishes.

Thank you very much for your answer,
Dan T.




