[torquedev] [Bug 93] Resource management semantics of Torque need to be well defined
siegert at sfu.ca
Tue Dec 7 11:08:51 MST 2010
On Mon, Dec 06, 2010 at 06:38:38PM -0500, Michel Béland wrote:
> >> The fact of the matter is that ppn hasn't been clearly defined over time, and
> >> what it has become in practice is probably best described as processes per
> >> node.
> > Describing it as "processes per node" is very misleading and completely
> > inaccurate. Take for example a multi-threaded program. I routinely run
> > multi-threaded code on our cluster. We have 32 cores per node, and if I run a
> > _single process_ that uses 32 threads, I request ppn=32. If that meant
> > _processes_ I would request ppn=1 because, after all, my multi-threaded program
> > is still a single process. It is, however, using multiple cores.
> > Virtual processors per node is the correct definition of ppn - the number of
> > virtual processors will typically be set to the total number of cores on a
> > node. Redefining it as processes per node will lead to problems.
> The fact that Torque generates a $PBS_NODEFILE containing one line per
> virtual processor seems to support the conclusion that ppn means virtual
> processors. But I have always considered this behaviour broken since it
> does not work well with programs that use MPI *and* OpenMP. Back in the
> days when I used PBS Pro 5.3, I liked what they implemented to submit a
> job like this. You could ask for example -lnodes=10:ppn=2:cpp=4 to get
> 10 nodes with 2 processes per node and 4 CPUs per process. Then PBS Pro
> would generate a $PBS_NODEFILE containing all the nodes repeated twice
> (because of ppn=2) and also set the environment variables NCPUS and
> OMP_NUM_THREADS to 4. With this, it would really allocate 8 virtual
> processors per node (ppn*cpp). With Torque, you have to fiddle with
> $PBS_NODEFILE to make it work with hybrid parallel programs.
> True enough, PBS Pro did not preclude running more processes than
> requested, but ppn meant processes per node (at least in the restricted
> MPI sense) and that is what the documentation said.
And this is what makes me shudder: redefining ppn to mean "processes
per node" instead of the current meaning of (virtual) "processors
per node". I strongly object to that redefinition - it has the
potential to break many sites.
I recognize that if this syntax is implemented such that the default
for cpp is 1, then this reverts back to the existing syntax.
Thus, it may not be so bad. I am worried though.
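
To make the $PBS_NODEFILE fiddling mentioned in the quoted text concrete,
here is a minimal sketch in Python. It assumes the nodefile lists each host
once per allocated virtual processor, as described above. The function name
thin_nodefile and its cpp parameter are hypothetical and not part of Torque.
It also shows that cpp=1 leaves the host list as it is today.

```python
from collections import Counter


def thin_nodefile(entries, cpp=1):
    """Return a host list with one entry per `cpp` virtual processors.

    `entries` is the content of a nodefile, one host name per allocated
    virtual processor. The result lists each host once per process that
    should be started there. Hypothetical helper, not a Torque API.
    """
    if cpp < 1:
        raise ValueError("cpp must be at least 1")
    counts = Counter(entries)
    # Unique hosts in the order in which they first appear.
    hosts = list(dict.fromkeys(entries))
    thinned = []
    for host in hosts:
        if counts[host] % cpp != 0:
            raise ValueError(
                f"{host}: {counts[host]} slots is not a multiple of cpp={cpp}"
            )
        thinned.extend([host] * (counts[host] // cpp))
    return thinned


if __name__ == "__main__":
    # Two nodes with 8 virtual processors each, 4 CPUs per process.
    nodefile = ["node01"] * 8 + ["node02"] * 8
    for host in thin_nodefile(nodefile, cpp=4):
        print(host)
```

The thinned list could then be written to a file and handed to the MPI
launcher, with OMP_NUM_THREADS set to the same cpp value. How the host list
is passed depends on the MPI implementation in use.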
> Later, they introduced -lselect and deprecated -lnodes altogether. Now
> one can ask for -lselect=10:ncpus=8:mpiprocs=2:ompthread=4 to get the
> same result, if I remember correctly, but I think that I liked ppn and
> cpp better...
I do like the -lselect syntax in principle as long as it can be
introduced as an alternative to the -l nodes=x:ppn=y syntax.
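
The correspondence between the two request styles quoted above can be
sketched as follows. This is only an illustration of the arithmetic
(ncpus = ppn*cpp). The function name nodes_to_select is hypothetical, and the
select keywords are taken from the quoted message, which was written from
memory, so they have not been checked against the PBS Pro documentation.

```python
def nodes_to_select(nodes, ppn, cpp=1):
    """Translate nodes/ppn/cpp into a select-style request string.

    Uses the keywords quoted in this thread (ncpus, mpiprocs, ompthread).
    Hypothetical helper for illustration only.
    """
    if min(nodes, ppn, cpp) < 1:
        raise ValueError("nodes, ppn and cpp must all be at least 1")
    ncpus = ppn * cpp  # virtual processors allocated on each node
    return f"select={nodes}:ncpus={ncpus}:mpiprocs={ppn}:ompthread={cpp}"


if __name__ == "__main__":
    # The example from the quoted message: -lnodes=10:ppn=2:cpp=4
    print(nodes_to_select(10, 2, 4))
```

With the default cpp=1, ncpus equals ppn, which matches the current meaning
of -l nodes=x:ppn=y.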
A few comments: I do not mind using "ncpus", but others might:
ncpus has a long history and I do not know whether anybody is still
using it. In my opinion it is mostly broken, thus reusing "ncpus"
for something else is not a big deal.
I do not like "mpiprocs" and "ompthread": there can be "procs" and
"threads" other than "mpi" and "omp". We can use "threads" instead
of "ompthread", but we cannot use "procs" instead of "mpiprocs" -
that is taken already. Maybe we could use "nprocs" instead?
(and no: we haven't used PBS Pro for ages, thus for us there is
absolutely no need to introduce a new syntax that is "PBS Pro
compliant"; it is much more important that submission scripts that
work now will work in the future in the same way).
Head, Research Computing
WestGrid/ComputeCanada Site Lead
IT Services phone: 778 782-4691
Simon Fraser University fax: 778 782-4242
Burnaby, British Columbia email: siegert at sfu.ca
Canada V5A 1S6