Moab can be used as an external scheduler for the PBS resource management system. In this configuration, PBS manages the job queue and the compute resources while Moab queries the PBS Server and the PBS MOMs to obtain up-to-date job and node information. Using this information, Moab directs PBS to manage jobs in accordance with specified Moab policies, priorities, and reservations.
2.0 Integration Steps
Moab manages PBS via the PBS scheduling API. The steps below describe the process for enabling Moab scheduling using this API.
2.1 Install PBS/TORQUE
Keep track of the PBS target directory, $PBSTARGDIR.
2.2 Install Moab
Untar the Moab distribution file.
cd into the moab-<X> directory.
Run ./configure.
Specify the PBS target directory ($PBSTARGDIR from step 2.1) when queried by configure.
Moab interfaces to PBS by utilizing a few PBS libraries and include files. If you have a non-standard PBS installation, you may need to modify the Makefile and change the PBSIP and PBSLP values and references as necessary for your local site configuration.
The configure script will automatically set up Moab so that the user running configure becomes the default Primary Moab Administrator, $MOABADMIN. This can be changed by modifying the 'ADMINCFG[1] USERS= <USERNAME>' line in the moab.cfg file. The primary administrator is the first user listed in the USERS attribute and is the ID under which the Moab daemon will run.
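As an illustration, the administrator line in moab.cfg might look like the sketch below. The user name moabadmin is hypothetical; substitute the account that should own the Moab daemon at your site.

```
# moab.cfg (sketch; the user name is hypothetical)
# The first user listed is the primary administrator and the ID
# under which the Moab daemon runs.
ADMINCFG[1] USERS=moabadmin
```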
Some Tru64 and IRIX systems have a local libnet library which conflicts with PBS's libnet library. To resolve this, try setting PBSLIB to '${PBSLIBDIR}/libnet.a -lpbs' in the Moab Makefile.
Moab is 64-bit compatible. If PBS/TORQUE is running in 64-bit mode, Moab must also be built in 64-bit mode in order to utilize the PBS scheduling API (e.g., for IRIX compilers, add '-64' to the OSCCFLAGS and OSLDFLAGS variables in the Makefile).
2.3 General Config For All Versions of TORQUE/PBS
Make $MOABADMIN a PBS admin.
By default, Moab communicates only with the pbs_server daemon, so $MOABADMIN should be authorized to talk to this daemon.
(OPTIONAL) Set default PBS queue, nodecount, and walltime attributes.
(OPTIONAL - TORQUE Only) Configure TORQUE to report completed job information by setting the keep_completed parameter with qmgr.
PBS nodes can be configured as time shared or space shared according to local needs. In almost all cases, space shared nodes will provide the desired behavior.
PBS/TORQUE supports the concept of virtual nodes. Using this feature, Moab can individually schedule processors on SMP nodes. The online TORQUE documentation describes how to set up the '$PBS_HOME/server_priv/nodes' file to enable this capability (e.g., <NODENAME> np=<VIRTUAL NODE COUNT>).
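The PBS-side settings in this section might be entered as qmgr directives roughly as follows. This is a sketch rather than a recommended configuration: the user moabadmin, the host headnode.example.com, the queue name batch, and every value shown are assumptions to adapt locally, and the attribute names should be checked against the installed PBS/TORQUE version.

```
# Authorize the Moab administrator as a PBS manager (hypothetical user and host)
set server managers += moabadmin@headnode.example.com

# Optional defaults for jobs that do not request them (hypothetical queue and values)
set server default_queue = batch
set queue batch resources_default.nodect = 1
set queue batch resources_default.walltime = 01:00:00

# TORQUE only: retain completed job information for 300 seconds (assumed value)
set server keep_completed = 300
```

Saved to a file, directives like these can be fed to qmgr on standard input by a user who already has PBS manager privileges.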
2.3.1 Version-Specific Config For TORQUE, OpenPBS or PBSPro 6.x or earlier
Do not start the pbs_sched daemon. pbs_sched is the default scheduler for PBS/TORQUE, and Moab will provide this service instead.
Moab utilizes PBS's scheduling port to obtain real-time event information from PBS regarding job and node transitions. Leaving the default qmgr setting of 'set server scheduling=True' will allow Moab to receive and process this real-time information.
2.3.2 Version-Specific Config For PBSPro 7.1 and higher
PBSPro 7.x, 8.x, and higher require that the pbs_sched daemon be running for proper operation, but PBS must be configured to take no independent action that would conflict with Moab. With these PBSPro releases, sites should allow pbs_sched to run after putting the following PBS configuration in place:
2.4 Configure Moab
By default, Moab will automatically be set up to interface with TORQUE/PBS when it is installed. Consequently, in most cases, the following steps are not required.
Specify PBS as the primary resource manager by setting RMCFG[base] TYPE=PBS in moab.cfg.
If a non-standard PBS installation/configuration is being used, additional Moab parameters may be required to enable the Moab/PBS interface, as in the line RMCFG[base] HOST=$PBSSERVERHOST PORT=$PBSSERVERPORT. (See the Resource Manager Overview for more information.)
Moab's user interface port is set using the parameter SCHEDCFG and is used for user-scheduler communication. This port must be different from the PBS scheduler port used for resource manager-scheduler communication.
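Taken together, the settings above might appear in moab.cfg as in the sketch below. The host names are hypothetical, and the port numbers are assumptions (42559 and 15001 are commonly the Moab and pbs_server defaults, respectively); the SERVER attribute of SCHEDCFG is assumed to take a host:port value, so confirm this against the parameter reference for your Moab version.

```
# moab.cfg (sketch; host names and ports are assumptions)

# Moab user interface port, used for user-scheduler communication
SCHEDCFG[Moab] SERVER=moabhost.example.com:42559

# PBS as the primary resource manager
RMCFG[base] TYPE=PBS

# Only needed for a non-standard PBS installation
RMCFG[base] HOST=pbshost.example.com PORT=15001
```

Whatever values are chosen, the port given to SCHEDCFG must not be the same as the PBS scheduler port.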
3.0 Current Limitations
PBS Features Not Supported by Moab
Moab supports basic scheduling of all PBS node specifications.
Moab is by default very liberal in its interpretation of <NODECOUNT>:PPN=<X>. In its standard configuration, Moab interprets this as 'give the job <NODECOUNT>*<X> tasks with AT LEAST <X> tasks per node'. Set the JOBNODEMATCHPOLICY parameter to EXACTNODE to have Moab support PBS's default allocation behavior of <NODECOUNT> nodes with exactly <X> tasks per node.
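The difference between the two interpretations can be made concrete with a small check. The Python sketch below is only a reading of the sentences above, not Moab's implementation; the function name satisfies_request and its arguments are hypothetical.

```python
def satisfies_request(tasks_per_node, nodecount, ppn, exact_node=False):
    """Check a proposed allocation against a <NODECOUNT>:PPN=<X> request.

    tasks_per_node: list with the number of tasks placed on each allocated node.
    exact_node: True models JOBNODEMATCHPOLICY EXACTNODE as described in the text.
    """
    if exact_node:
        # Exactly <NODECOUNT> nodes, each carrying exactly <X> tasks.
        return (len(tasks_per_node) == nodecount
                and all(t == ppn for t in tasks_per_node))
    # Liberal reading: <NODECOUNT>*<X> tasks in total,
    # with at least <X> tasks on every node that is used.
    return (sum(tasks_per_node) == nodecount * ppn
            and all(t >= ppn for t in tasks_per_node))


# A request of 4:PPN=2 (8 tasks) placed on two nodes with 4 tasks each
print(satisfies_request([4, 4], nodecount=4, ppn=2))                   # True
print(satisfies_request([4, 4], nodecount=4, ppn=2, exact_node=True))  # False
```

Under the liberal reading, packing the eight tasks onto two nodes is acceptable; with EXACTNODE, only four nodes with two tasks each would satisfy the same request.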
Moab Features Not Supported by PBS
PBS does not support the concept of a job QOS or other extended scheduling features by default. This can be handled using the techniques described here. See the RM Extensions Overview for more information.
Some versions of PBS do not maintain job completion information.
With these versions, an external scheduler cannot determine whether a job completed successfully or whether internal PBS problems prevented the job from being properly updated. This problem does not affect proper scheduling of jobs but may affect scheduler statistics. If your site is prone to frequent PBS hangs, you may want to set the Moab JOBPURGETIME parameter so that Moab holds job information in memory for a period of time until PBS recovers. (NOTE: setting JOBPURGETIME to more than 2:00 is not recommended.)
4.0 Troubleshooting
Common Problems:
On Tru64 systems, the PBS 'libpbs' library does not properly export a number of symbols required by Moab. This can be worked around by modifying the Moab Makefile to link the PBS 'rm.o' object file directly into Moab.