[torquedev] [Bug 93] Resource management semantics of Torque need to be well defined

Michael Barnes Michael.Barnes at jlab.org
Tue Dec 7 13:24:58 MST 2010


On Dec 7, 2010, at 2:59 PM, Mgr. Šimon Tóth wrote:

>>>> I do not like "mpiprocs" and "ompthread": there can be "procs" and
>>>> "threads" other than "mpi" and "omp". We can use "threads" instead
>>>> of "ompthread", but we cannot use "procs" instead of "mpiprocs" -
>>>> that is taken already. Maybe we could use "nprocs" instead?
>> 
>> Personally, I don't care for these terms/nuances either.
>> 
>> In fact, I don't care for the :ppn=X syntax either, and have disabled
>> that specification at our site.  Users simply ask for nodes=X which is
>> a misnomer for "slots=X".  The "ppn" number in the nodefile is arbitrary
>> and determined by the system administrator.  On one of our clusters,
>> we oversubscribe the nodes.  ie, the "ppn" number is greater than the
>> number of physical cores.  On other clusters, we "undersubscribe" nodes
>> in that they are GPU machines and the "ppn" number is the number of
>> GPUs in the machine (not GPU cores).  We use nodesets to create boundaries
>> between machines and/or networks, and the user can specify nodes=X:label
>> if they care which machine they land on.  The users *must* be aware
>> of the machine that they are using to some degree, and the nodes=X:ppn=Y
>> syntax is not meaningful when there are GPUs with varying numbers of cores,
>> CPUs that may have different numbers of cores, and when nodes, slots, or
>> ncpus does not dictate the network interface they are on, the amount
>> of local disk space (if any), or the amount of memory on each node.
> 
> Interesting approach. We have a pretty much identical situation: a
> heavily heterogeneous grid. But we use the nodespec exclusively for this.
> 
> So instead of requesting
> nodes=2:ncpus=4:mem=4G+3:ncpus=2:mem=2G#infiniband your users request
> nodes=2:ncpus4:mem4G:infinisite1+3:ncpus2:mem2G:infinisite1?

Hmm, no.  I believe there is a notion of "node packing" (I forget the
exact term), where nodes=8 will span two nodes if each node has only
4 "ppn"s, or one node if it has 8 "ppn"s.  We never use the '+'
operator to specify different nodes.  Most node specifications look like:

nodes=128

some look like:

nodes=128:intel

They rarely get any more complex than this.
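The packing arithmetic described above can be sketched like this (a sketch only; remember that the per-node slot count is whatever the administrator put in the nodefile, not necessarily the physical core count):

```shell
#!/bin/sh
# Sketch of the "node packing" arithmetic: a request for N slots
# spans ceil(N / ppn) nodes, where ppn is the administrator-chosen
# slot count per node from the nodefile.

nodes_spanned() {
    requested=$1   # slots requested via nodes=X
    ppn=$2         # slots per node in the nodefile
    echo $(( (requested + ppn - 1) / ppn ))
}

nodes_spanned 8 4   # prints 2: nodes=8 spans two 4-slot nodes
nodes_spanned 8 8   # prints 1: nodes=8 fits on one 8-slot node
```

On an oversubscribed cluster the same request spans fewer machines, since ppn there is larger than the physical core count.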

> How do you determine the amount of used resources on nodes, or do you
> just assign nodes exclusively?

On the oversubscribed nodes, the nodes are shared and we have resource
limitations; the other nodes are used exclusively.  We also have a
wrapper script that generates all of the #PBS attributes for the users.

-mb

--
+-----------------------------------------------
| Michael Barnes
|
| Thomas Jefferson National Accelerator Facility
| Scientific Computing Group
| 12000 Jefferson Ave.
| Newport News, VA 23606
| (757) 269-7634
+-----------------------------------------------
