[Mauiusers] Help partitioning cluster using QoS
J.W.Jones at swansea.ac.uk
Thu Jun 28 07:39:07 MDT 2007
I have been trying for a few days now to partition our cluster using QoS
rather than Maui partitions but have been unsuccessful.
I am using OpenPBS 2.3.12 and Maui 3.2.6p13.
We have a heterogeneous cluster consisting of 22 nodes.
Nodes 1 - 8 are older, single core, dual CPU machines and are available
for the whole centre to run on. For these nodes I have a number of
queues: cpu1, cpu2, cpu4 and cpu8 with differing maximum runtimes for
each (the number in each name is the maximum number of CPUs). For these
nodes I also try to stop queue stuffing by using:
USERCFG[DEFAULT] MAXPROC=8 MAXJOB=4 MAXIPROC=8 MAXIJOB=4
Nodes 9 - 22 have recently been purchased by a particular research group
and are dual-cpu, dual core machines with faster CPUs. These machines,
at least for the short term, will only be accessible to a small group of
users. These nodes have one queue, cfd, which will have less restrictive
limits:
USERCFG[DEFAULT] MAXPROC=32 MAXJOB=32 MAXIPROC=32 MAXIJOB=32
In the near future the cluster will be extended again by another group
with the same provisos that their nodes will be for their own group for
the foreseeable future.
I was hoping to achieve this by setting up two standing reservations,
public and cfd, which would reserve all of the nodes:
HOSTLIST="^node[1-8]$" PERIOD=INFINITY QOSLIST=public_qos
HOSTLIST="^node(9|1[0-9]|2[0-2])$" PERIOD=INFINITY QOSLIST=cfd_qos
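A quick way to sanity-check host expressions like these outside Maui is a small Python sketch. It assumes the nodes are named node1 to node22 and that Python's `re` module treats these simple patterns the same way Maui's matcher does, which I have not verified. The second pattern ends in `2[0-2]` so that node22 is included; with `2[0-1]` the check reports node22 as unassigned.

```python
import re

# Host expressions for the two node sets (node1-8 and node9-22).
PUBLIC_PATTERN = r"^node[1-8]$"
CFD_PATTERN = r"^node(9|1[0-9]|2[0-2])$"


def matched_nodes(pattern, total_nodes=22):
    """Return the node numbers (1..total_nodes) whose name matches pattern."""
    regex = re.compile(pattern)
    return [n for n in range(1, total_nodes + 1) if regex.match(f"node{n}")]


public_nodes = matched_nodes(PUBLIC_PATTERN)
cfd_nodes = matched_nodes(CFD_PATTERN)

# Every node should fall into one of the two reservations.
unassigned = set(range(1, 23)) - set(public_nodes) - set(cfd_nodes)

print("public:    ", public_nodes)
print("cfd:       ", cfd_nodes)
print("unassigned:", sorted(unassigned))
```

An empty unassigned list means the two expressions together cover all 22 nodes.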
then define two QoS, one for each reservation, that restrict jobs to
the appropriate set of nodes:
then set up each class to have the appropriate default QoS and use the
appropriate reservation:
CLASSCFG[cpu1] FLAGS=ADVRES:public_res.0.0 QDEF=public_qos
CLASSCFG[cpu2] FLAGS=ADVRES:public_res.0.0 QDEF=public_qos
CLASSCFG[cpu4] FLAGS=ADVRES:public_res.0.0 QDEF=public_qos
CLASSCFG[cpu8] FLAGS=ADVRES:public_res.0.0 QDEF=public_qos
CLASSCFG[cfd] FLAGS=ADVRES:cfd_res.0.0 QDEF=cfd_qos
then set differing throttles for users in the different QoS's:
USERCFG[DEFAULT] QDEF=public_qos MAXPROC[QOS:public_qos]=8
USERCFG[DEFAULT] MAXPROC[QOS:cfd_qos]=32 MAXJOB[QOS:cfd_qos]=32
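To make the intended effect of the per-QoS throttles concrete, here is a minimal Python sketch of the check I have in mind. It is only a model of the policy, not how Maui implements it: `Limits`, `LIMITS` and `can_start` are names made up for illustration, and the public job cap of 4 is taken from the earlier USERCFG[DEFAULT] line rather than from the per-QoS lines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_proc: int  # cap on processors used by one user's running jobs
    max_job: int   # cap on the number of one user's running jobs


# Per-QoS caps (illustrative model only).
LIMITS = {
    "public_qos": Limits(max_proc=8, max_job=4),
    "cfd_qos": Limits(max_proc=32, max_job=32),
}


def can_start(qos, running_procs, new_job_procs):
    """Decide whether a new job fits under the user's caps for this QoS.

    running_procs lists the processor counts of the user's jobs that are
    already running in the same QoS.
    """
    limits = LIMITS[qos]
    within_jobs = len(running_procs) + 1 <= limits.max_job
    within_procs = sum(running_procs) + new_job_procs <= limits.max_proc
    return within_jobs and within_procs


print(can_start("public_qos", [4], 4))       # one 4-CPU job running, add another
print(can_start("public_qos", [4, 4], 1))    # already at 8 processors
print(can_start("cfd_qos", [8, 8, 8], 8))    # 24 processors running, add 8
```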
and then finally allow the users of the CFD group access to their
resources by either:
GROUPCFG[cfd] QDEF=cfd_qos QLIST=public_qos,cfd_qos
or lots of:
USERCFG[username] QDEF=cfd_qos QLIST=public_qos,cfd_qos
The GROUPCFG form is preferred, as there are a lot of users.
The end result I wanted was to automatically:
- Set any jobs in queues cpu[1-8] to use the first set of nodes
- Set any jobs in queue cfd to use the second set of nodes
- Allow any user to submit to queues cpu[1-8] with the MAXJOB, MAXPROC
values being more restrictive
- Allow the cfd group to submit to the cpu[1-8] queues and the cfd
queue, with the MAXJOB, MAXPROC values being less restrictive.
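The routing and access rules in the list above can be written down as a small Python model. This is a sketch of the behaviour I am after, not of any Maui mechanism: `place_job` and `allowed_qos` are hypothetical helpers, the node names assume node1 to node22, and "staff" stands in for any group other than cfd.

```python
PUBLIC_NODES = [f"node{n}" for n in range(1, 9)]
CFD_NODES = [f"node{n}" for n in range(9, 23)]

# Default QoS per queue and the node set each QoS is confined to.
QUEUE_QOS = {
    "cpu1": "public_qos",
    "cpu2": "public_qos",
    "cpu4": "public_qos",
    "cpu8": "public_qos",
    "cfd": "cfd_qos",
}
QOS_NODES = {"public_qos": PUBLIC_NODES, "cfd_qos": CFD_NODES}


def allowed_qos(groups):
    """Everyone may use public_qos; members of group cfd also get cfd_qos."""
    qos = {"public_qos"}
    if "cfd" in groups:
        qos.add("cfd_qos")
    return qos


def place_job(queue, groups):
    """Return the nodes a job may run on, or None if access is denied."""
    qos = QUEUE_QOS[queue]
    if qos not in allowed_qos(groups):
        return None
    return QOS_NODES[qos]


print(place_job("cpu4", {"staff"}))  # public nodes
print(place_job("cfd", {"staff"}))   # denied
print(place_job("cfd", {"cfd"}))     # cfd nodes
```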
Any help would be appreciated as I am coming up blank.
Dr Jason W Jones
School of Engineering
University of Wales Swansea
SA2 8PP UK
j.w.jones at swansea.ac.uk