[torquedev] [Bug 93] Resource management semantics of Torque need to be well defined

Martin Siegert siegert at sfu.ca
Tue Dec 7 11:08:51 MST 2010


On Mon, Dec 06, 2010 at 06:38:38PM -0500, Michel Béland wrote:
> 
> >> The fact of the matter is that ppn hasn't been clearly defined over time, and
> >> what it has become in practice is probably best described as processes per
> >> node.
> > 
> > Describing it as "processes per node" is very misleading and completely
> > inaccurate.  Take for example a multi-threaded program.  I routinely run
> > multi-threaded code on our cluster.  We have 32 cores per node, and if I run a
> > _single process_ that uses 32 threads, I request ppn=32.  If that meant
> > _processes_ I would request ppn=1 because, after all, my multi-threaded program
> > is still a single process. It is, however, using multiple cores.
> > 
> > Virtual processors per node is the correct definition of ppn - the number of
> > virtual processors will typically be set to the total number of cores on a
> > node. Redefining it as processes per node will lead to problems.
> 
> The fact that Torque generates a $PBS_NODEFILE containing one line per 
> virtual processor seems to support the conclusion that ppn means virtual 
> processors. But I have always considered this behaviour broken since it 
> does not work well with programs that use MPI *and* OpenMP. Back in the 
> days when I used PBS Pro 5.3, I liked what they implemented for 
> submitting such jobs. You could ask, for example, for 
> -lnodes=10:ppn=2:cpp=4 to get 10 nodes with 2 processes per node 
> and 4 CPUs per process. Then PBS Pro 
> would generate a $PBS_NODEFILE containing all the nodes repeated twice 
> (because of ppn=2) and also set the environment variables NCPUS and 
> OMP_NUM_THREADS to 4. With this, it would really allocate 8 virtual 
> processors per node (ppn*cpp). With Torque, you have to fiddle with 
> $PBS_NODEFILE to make it work with hybrid parallel programs.
> 
> True enough, PBS Pro did not preclude running more processes than 
> requested, but ppn meant processes per node (at least in the restricted 
> MPI sense) and that is what the documentation said.

And this is what makes me shudder: redefining ppn to mean "processes
per node" instead of its current meaning of (virtual) "processors
per node". I strongly object to that redefinition - it has the
potential to break many sites.
I do recognize that if this syntax is implemented with a default of
cpp=1, it reverts to the existing behaviour, so it may not be so bad.
I am worried, though.
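
For reference, the kind of $PBS_NODEFILE fiddling Michel mentions for
hybrid MPI/OpenMP jobs under the current semantics looks roughly like
this in a job script (a sketch only; the machinefile handling and the
mpiexec options depend on the MPI implementation):

  #PBS -l nodes=10:ppn=8
  # 10 nodes, 8 virtual processors each; run the job as
  # 2 MPI processes per node with 4 OpenMP threads each
  cd $PBS_O_WORKDIR
  export OMP_NUM_THREADS=4
  # $PBS_NODEFILE has one line per virtual processor (80 lines);
  # collapse it to one line per node per MPI process (20 lines)
  sort -u $PBS_NODEFILE | awk '{print; print}' > machinefile.$PBS_JOBID
  mpiexec -np 20 -machinefile machinefile.$PBS_JOBID ./hybrid_app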

> Later, they introduced -lselect and deprecated -lnodes altogether. Now 
> one can ask for -lselect=10:ncpus=8:mpiprocs=2:ompthread=4 to get the 
> same result, if I remember correctly, but I think that I liked ppn and 
> cpp better...

I do like the -lselect syntax in principle, as long as it can be
introduced as an alternative to the -l nodes=x:ppn=y syntax.
A few comments: I do not mind using "ncpus", but others might:
ncpus has a long history, and I do not know whether anybody is still
using it. In my opinion it is mostly broken, so reusing "ncpus"
for something else is not a big deal.
I do not like "mpiprocs" and "ompthread": there can be "procs" and
"threads" other than "mpi" and "omp". We could use "threads" instead
of "ompthread", but we cannot use "procs" instead of "mpiprocs" -
that keyword is taken already. Maybe we could use "nprocs" instead?
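
To make the comparison concrete, this is roughly how I picture the
two request styles side by side. The select line is hypothetical
syntax modelled on the PBS Pro style Michel describes; the keywords
"ncpus", "nprocs" and "threads" are just the suggestions from above,
not anything Torque implements today:

  # existing syntax: 10 nodes with 8 virtual processors each
  qsub -l nodes=10:ppn=8 job.sh

  # hypothetical select-style equivalent for a hybrid job:
  # 10 chunks of 8 cpus, 2 processes and 4 threads per chunk
  qsub -l select=10:ncpus=8:nprocs=2:threads=4 job.sh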

(and no: we haven't used PBS Pro for ages, so for us there is
absolutely no need to introduce a new syntax that is "PBS Pro
compliant"; it is much more important that submission scripts that
work now continue to work the same way in the future).

- Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid/ComputeCanada Site Lead
IT Services                                phone: 778 782-4691
Simon Fraser University                    fax:   778 782-4242
Burnaby, British Columbia                  email: siegert at sfu.ca
Canada  V5A 1S6

