
A.5  Case Study: Multi-Queue Cluster with QOS and Charge Rates

Overview:

    A 160-node uniprocessor Linux cluster is to be used to support various organizations within an enterprise.  Users must be able to obtain improved job turnaround time in exchange for a higher charge rate.  A portion of the system must be reserved for small development jobs at all times.

Resources:

    Compute Nodes:       128 800 MHz uniprocessor nodes w/ 512 MB RAM each, running Linux 2.4
                          32 1.2 GHz uniprocessor nodes w/ 2 GB RAM each, running Linux 2.4

    Resource Manager:    OpenPBS 2.3
    Network:             100 Mbit Ethernet

Workload:

    Job Size:                jobs range from 1 to 80 processors.

    Job Length:           jobs range in length from 15 minutes to 24 hours.

    Job Owners:         various

Constraints: (Must do)

    The management desires the following queue structure:

QueueName      Nodes        MaxWallTime   Priority     ChargeRate
-----------------------------------------------------------------
Test            <=16           00:30:00        100             1x
Serial             1            2:00:00         10             1x
Serial-Long        1           24:00:00         10             2x
Short           2-16            4:00:00         10             1x
Short-Long      2-16           24:00:00         10             2x
Med            17-64            8:00:00         20             1x
Med-Long       17-64           24:00:00         20             2x
Large          65-80           24:00:00         50             2x
LargeMem           1            8:00:00         10             4x

    For charging, management has decided to charge by job walltime since the nodes will not be shared.  Management has also dictated that 16 of the uniprocessor nodes be dedicated to running small jobs requiring 16 or fewer nodes, and that only serial jobs may run on the large memory nodes, with these jobs charged at a rate of 4x.  There are no constraints on the remaining nodes.
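
    For example, under this policy a Med-Long job running on 32 nodes for 10 hours of walltime would be debited 32 nodes x 10 hours x 2 (charge rate) = 640 node-hours.  (This is only an illustration of the intended policy; the actual charge units depend on how QBank is configured.)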

Goals: (Should do)

    This site's goals are focused more on supplying a straightforward queue environment to the end users than on maximizing the scheduling performance of the system.  The Maui configuration has the primary purpose of faithfully reproducing the queue constraints above while maintaining reasonable scheduling performance in the process.

Analysis:

    Since we are using PBS as the resource manager, this should be a pretty straightforward process.  It will involve setting up an allocations manager (to handle charging), configuring queue priorities, and creating two system reservations: one to manage the 16 nodes dedicated to small jobs, and another to manage the large memory nodes.

Configuration:

    This site has a lot going on.  There are several aspects to the configuration; however, none of them is particularly difficult on its own.

    First, the queue structure.  The best place to handle this is via the PBS configuration.  Fire up 'qmgr' and set up the nine queues described above.  PBS supports the node and walltime constraints as well as the queue priorities.  (Maui will pick up and honor queue priorities configured within PBS.  Alternatively, you can specify these priorities directly within the Maui 'fs.cfg' file for resource managers which do not support this capability.)  We will be using QBank to handle all allocations and so will want to configure the 'per class charge rates' there.  (Note:  QBank 2.9 or higher is required for per class charge rate support.)  A sketch of the queue definitions is shown below.
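
    The sketch below shows how two of the queues (Test and Med) might be defined via 'qmgr'; the remaining queues follow the same pattern using the limits and priorities from the table above.  Attribute names such as 'resources_max.nodect' and 'priority' are standard PBS queue attributes, but the exact names should be verified against the PBS release in use; this is an illustration rather than a definitive configuration.

        create queue Test
        set queue Test queue_type = Execution
        set queue Test resources_max.nodect = 16
        set queue Test resources_max.walltime = 00:30:00
        set queue Test priority = 100
        set queue Test enabled = True
        set queue Test started = True

        create queue Med
        set queue Med queue_type = Execution
        set queue Med resources_min.nodect = 17
        set queue Med resources_max.nodect = 64
        set queue Med resources_max.walltime = 08:00:00
        set queue Med priority = 20
        set queue Med enabled = True
        set queue Med started = True

    On the Maui side, the allocation manager interface only needs to be pointed at the QBank server.  A minimal maui.cfg sketch, assuming a Maui release that supports the AMCFG parameter (older releases use the BANKTYPE/BANKHOST parameters instead) and using a placeholder hostname:

        # Placeholder hostname -- substitute the site's QBank server
        AMCFG[bank] TYPE=QBANK HOST=qbank.cluster.local

    The per class charge rates themselves (1x, 2x, and 4x) are defined within QBank, not within maui.cfg.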

    Now, two standing reservations are needed.  The first covers 16 of the small memory nodes and should only allow access to jobs requesting 16 or fewer processors.  In this environment, this is probably most easily accomplished with a reservation class ACL containing the queues which allow 1 to 16 node jobs.  The second reservation covers the 32 large memory nodes; giving it a class ACL containing only the LargeMem queue restricts those nodes to the serial, large memory jobs described above.  A sketch of both reservations is shown below.
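
    A minimal maui.cfg sketch of the two standing reservations follows.  It assumes Maui's SRCFG standing reservation syntax; the class lists must match the queue names created in PBS, and the hostnames used for the large memory reservation are placeholders for the site's actual node names.  If the small-job reservation must land on specific nodes, a HOSTLIST can be added to it in the same way.

        # Dedicate 16 nodes to jobs from the small-job queues at all times
        SRCFG[smalljob] TASKCOUNT=16 PERIOD=INFINITY
        SRCFG[smalljob] CLASSLIST=Test,Serial,Serial-Long,Short,Short-Long

        # Reserve the large memory nodes for the LargeMem queue
        # (placeholder hostnames -- list all 32 large memory nodes)
        SRCFG[bigmem] HOSTLIST=bigmem01,bigmem02,bigmem03
        SRCFG[bigmem] CLASSLIST=LargeMem
        SRCFG[bigmem] PERIOD=INFINITY

    Note that a reservation class ACL grants the listed queues access to the reserved nodes but does not confine their jobs to those nodes, so small jobs remain free to run elsewhere on the cluster as well.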

Monitoring:

Conclusions:
