[torquedev] [Bug 93] Resource management semantics of Torque need to be well defined
Michel Béland
michel.beland at rqchp.qc.ca
Mon Dec 6 16:38:38 MST 2010
>> The fact of the matter is that ppn hasn't been clearly defined over time, and
>> what it has become in practice is probably best described as processes per
>> node.
>
> Describing it as "processes per node" is very misleading and completely
> inaccurate. Take for example a multi-threaded program. I routinely run
> multi-threaded code on our cluster. We have 32 cores per node, and if I run a
> _single process_ that uses 32 threads, I request ppn=32. If that meant
> _processes_ I would request ppn=1 because, after all, my multi-threaded
> program is still a single process. It is, however, using multiple cores.
>
> Virtual processors per node is the correct definition of ppn - the number of
> virtual processors will typically be set to the total number of cores on a
> node. Redefining it as processes per node will lead to problems.

The fact that Torque generates a $PBS_NODEFILE containing one line per
virtual processor seems to support the conclusion that ppn means virtual
processors. But I have always considered this behaviour broken, since it
does not work well with programs that use MPI *and* OpenMP. Back when I
used PBS Pro 5.3, I liked the way it handled such jobs. You could ask,
for example, for -lnodes=10:ppn=2:cpp=4 to get 10 nodes with 2 processes
per node and 4 CPUs per process. PBS Pro would then generate a
$PBS_NODEFILE listing each node twice (because of ppn=2) and set the
environment variables NCPUS and OMP_NUM_THREADS to 4. In doing so, it
actually allocated 8 virtual processors per node (ppn*cpp). With Torque,
you have to fiddle with $PBS_NODEFILE to make it work with hybrid
parallel programs.
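
To illustrate the kind of fiddling involved, here is a minimal Python
sketch that collapses a Torque-style node list (one line per virtual
processor) into one line per MPI process. The function name
hybrid_nodefile and the sample host names are hypothetical, and the
sketch assumes that every node was granted a multiple of the desired
thread count.

```python
def hybrid_nodefile(lines, threads_per_process):
    """Return one host entry per MPI process.

    `lines` holds one host name per allocated virtual processor, as in a
    Torque $PBS_NODEFILE. Each group of `threads_per_process` entries for
    a host is collapsed into a single entry.
    """
    if threads_per_process < 1:
        raise ValueError("threads_per_process must be at least 1")

    # Count the virtual processors granted on each host, keeping the
    # order in which hosts first appear.
    slots = {}
    for line in lines:
        host = line.strip()
        if host:
            slots[host] = slots.get(host, 0) + 1

    hosts = []
    for host, count in slots.items():
        if count % threads_per_process:
            raise ValueError(
                f"{host}: {count} slots is not a multiple of "
                f"{threads_per_process}"
            )
        hosts.extend([host] * (count // threads_per_process))
    return hosts


# Two hypothetical nodes with 8 virtual processors each, 4 threads per
# MPI process: each node ends up listed twice.
torque_lines = ["node01"] * 8 + ["node02"] * 8
print("\n".join(hybrid_nodefile(torque_lines, 4)))
```

In a real job script the input lines would come from the file named by
$PBS_NODEFILE, the result would be written to a new file for the MPI
launcher, and OMP_NUM_THREADS would be exported separately. How the
launcher accepts a host file depends on the MPI implementation.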

True enough, PBS Pro did not preclude running more processes than
requested, but ppn meant processes per node (at least in the restricted
MPI sense) and that is what the documentation said.

Later, they introduced -lselect and deprecated -lnodes altogether. Now
one can ask for -lselect=10:ncpus=8:mpiprocs=2:ompthreads=4 to get the
same result, if I remember correctly, but I think that I liked ppn and
cpp better...
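
The correspondence between the two request styles can be sketched in a
few lines of Python. This only encodes the mapping described above
(ncpus = ppn*cpp, mpiprocs = ppn, one thread count per process); the
function name select_request is hypothetical, and the resource is
spelled ompthreads here, which is my understanding of the PBS Pro name.

```python
def select_request(nodes, ppn, cpp):
    """Build a -lselect string from old-style nodes/ppn/cpp values."""
    ncpus = ppn * cpp  # virtual processors actually allocated per node
    return f"-lselect={nodes}:ncpus={ncpus}:mpiprocs={ppn}:ompthreads={cpp}"


# The -lnodes=10:ppn=2:cpp=4 example from the message:
print(select_request(10, 2, 4))
# prints -lselect=10:ncpus=8:mpiprocs=2:ompthreads=4
```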
--
Michel Béland, scientific computing analyst
michel.beland at rqchp.qc.ca
office S-250, Roger-Gaudry building (main), Université de Montréal
telephone: 514 343-6111 ext. 3892 fax: 514 343-2155
RQCHP (Réseau québécois de calcul de haute performance) www.rqchp.ca
Calcul Canada (computecanada.org)