13.2.1 Defining and Configuring Resource Manager Interfaces
Moab's resource manager interfaces are defined using the RMCFG parameter, which allows specification of key aspects of the interface. In most cases, only the TYPE attribute needs to be specified; Moab determines the defaults required to activate and use the selected interface. In the following example, an interface to a LoadLeveler resource manager is defined.
RMCFG[orion] TYPE=LL
Note that the resource manager is given a label of orion. This label can be any arbitrary site-selected string and is for local usage only. For sites with multiple active resource managers, the labels can be used to distinguish between them for resource manager specific queries and commands.
13.2.1.1 Resource Manager Attributes
The following table lists the possible resource manager attributes that can be configured.
ATTRIBUTE
ADMINEXEC
DESCRIPTION
Normally, when the JOBSUBMITURL is executed, Moab drops to the UID and GID of the user submitting the job. Specifying an ADMINEXEC of jobsubmit causes Moab to use its own UID and GID instead (usually root). This is useful for some native resource managers where the JOBSUBMITURL is not a user command (such as qsub) but a script that interfaces directly with the resource manager.
EXAMPLE
Moab will not drop to the user's UID and GID before executing the JOBSUBMITURL.
ATTRIBUTE
AUTHALIST
FORMAT
comma delimited list of local account names
DEFAULT
ALL (all accounts may use resources)
DESCRIPTION
Specifies which local accounts are allowed to use reported resource manager resources. In the case of multi-resource manager usage, only the authorized account list on the master resource manager is considered. In the case of peer resource managers, jobs are only migrated if allowed by the AUTHALIST parameter and by all policies on the destination cluster.
EXAMPLE
RMCFG[base] AUTHALIST=er342,ex332
Only jobs from accounts er342 and ex332 are considered for execution on resources reported by the resource manager base.
ATTRIBUTE
AUTHCLIST
FORMAT
comma delimited list of local class names
DEFAULT
ALL (all classes may utilize resources)
DESCRIPTION
Specifies which local classes are allowed to use reported resource manager resources. In the case of multi-resource manager usage, only the authorized class list on the master resource manager is considered. In the case of peer resource managers, jobs are only migrated if allowed by the AUTHCLIST parameter and by all policies on the destination cluster.
EXAMPLE
RMCFG[base] AUTHCLIST=fast,special
Only jobs from classes fast and special are considered for execution on resources reported by the resource manager base.
ATTRIBUTE
AUTHGLIST
FORMAT
comma delimited list of local group names
DEFAULT
ALL (all groups may use resources)
DESCRIPTION
Specifies which local groups are allowed to use reported resource manager resources. In the case of multi-resource manager usage, only the authorized group list on the master resource manager is considered. In the case of peer resource managers, jobs are only migrated if allowed by the AUTHGLIST parameter and by all policies on the destination cluster.
EXAMPLE
RMCFG[base] AUTHGLIST=staff,development
Only jobs from groups staff and development are considered for execution on resources reported by the resource manager base.
ATTRIBUTE
AUTHQLIST
FORMAT
comma delimited list of local QoS names
DEFAULT
ALL (all QoSs may use resources)
DESCRIPTION
Specifies which local QoSs are allowed to use reported resource manager resources. In the case of multi-resource manager usage, only the authorized QoS list on the master resource manager is considered. In the case of peer resource managers, jobs are only migrated if allowed by the AUTHQLIST parameter and by all policies on the destination cluster.
EXAMPLE
RMCFG[base] AUTHQLIST=prio2,prio3
Only jobs from QoSs prio2 and prio3 are considered for execution on resources reported by the resource manager base.
ATTRIBUTE
AUTHTYPE
FORMAT
one of CHECKSUM, OTHER, PKI, SECUREPORT, or NONE
DEFAULT
CHECKSUM
DESCRIPTION
Specifies the security protocol to be used in scheduler-resource manager communication.
NOTE: Only valid with WIKI based interfaces.
EXAMPLE
Moab requires a secret key based checksum associated with each resource manager message.
ATTRIBUTE
AUTHULIST
FORMAT
comma delimited list of local user names
DEFAULT
ALL (all users may use resources)
DESCRIPTION
Specifies which local users are allowed to use reported resource manager resources. In the case of multi-resource manager usage, only the authorized user list on the master resource manager is considered. In the case of peer resource managers, jobs are only migrated if allowed by the AUTHULIST parameter and by all policies on the destination cluster.
EXAMPLE
RMCFG[base] AUTHULIST=steve,john
Only jobs from users steve and john are considered for execution on resources reported by the resource manager base.
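The authorization lists (AUTHALIST, AUTHCLIST, AUTHGLIST, AUTHQLIST, and AUTHULIST) can be pictured as membership tests against a job's credentials. The following Python sketch is illustrative only and is not Moab code. The names Job, ALL, and job_may_use are hypothetical, and the sketch assumes that a job must pass every configured list, which the text above does not state explicitly.

```python
from dataclasses import dataclass

ALL = None  # stands in for the default value ALL (no restriction)


@dataclass
class Job:
    user: str
    group: str
    account: str
    job_class: str
    qos: str


def job_may_use(job, authalist=ALL, authclist=ALL, authglist=ALL,
                authqlist=ALL, authulist=ALL):
    """Return True if the job passes every configured authorization list."""
    checks = [
        (job.account, authalist),
        (job.job_class, authclist),
        (job.group, authglist),
        (job.qos, authqlist),
        (job.user, authulist),
    ]
    # A list left at ALL places no restriction on that credential.
    return all(allowed is ALL or value in allowed for value, allowed in checks)


job = Job(user="steve", group="staff", account="er342",
          job_class="fast", qos="prio2")
allowed_by_user = job_may_use(job, authulist={"steve", "john"})
allowed_by_account = job_may_use(job, authalist={"ex332"})
print(allowed_by_user, allowed_by_account)
```

In this sketch the job passes the user list but is rejected by an account list that names only ex332.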
ATTRIBUTE
BANDWIDTH
FORMAT
<FLOAT>[{M|G|T}]
DEFAULT
-1 (unlimited)
DESCRIPTION
Specifies the maximum deliverable bandwidth between the Moab server and the resource manager for staging jobs and data. Bandwidth is specified in units per second and defaults to a unit of MB/s. If a unit modifier is specified, the value is interpreted accordingly (M - megabytes/sec, G - gigabytes/sec, T - terabytes/sec).
EXAMPLE
Moab will reserve up to 340 GB/s of network bandwidth when scheduling job and data staging operations to and from this resource manager.
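The BANDWIDTH format can be made concrete with a small parser. This Python sketch is not taken from Moab; parse_bandwidth is a hypothetical helper, and the decimal multipliers (1 G = 1000 M) are an assumption, since the text does not say whether Moab uses factors of 1000 or 1024.

```python
import re

# Assumption: decimal multipliers. The source does not specify 1000 vs 1024.
_UNIT_TO_MB = {"M": 1.0, "G": 1000.0, "T": 1000.0 * 1000.0}


def parse_bandwidth(value):
    """Convert a <FLOAT>[{M|G|T}] string to MB/s, or None for unlimited (-1)."""
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)([MGT])?", value.strip())
    if match is None:
        raise ValueError(f"not a valid <FLOAT>[{{M|G|T}}] value: {value!r}")
    number = float(match.group(1))
    unit = match.group(2) or "M"  # the unit defaults to MB/s
    if number == -1:
        return None  # -1 means unlimited
    if number < 0:
        raise ValueError(f"bandwidth must be -1 or non-negative: {value!r}")
    return number * _UNIT_TO_MB[unit]


print(parse_bandwidth("340G"))
print(parse_bandwidth("12.5"))
print(parse_bandwidth("-1"))
```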
ATTRIBUTE
CHECKPOINTSIG
FORMAT
one of suspend, <INTEGER>, or SIG<X>
DEFAULT
---
DESCRIPTION
Specifies what signal to send the resource manager when a job is checkpointed. (See Checkpoint Overview.)
EXAMPLE
Moab routes the signal SIGKILL through the resource manager to the job when a job is checkpointed.
ATTRIBUTE
CHECKPOINTTIMEOUT
FORMAT
[[[DD:]HH:]MM:]SS
DEFAULT
0 (no timeout)
DESCRIPTION
Specifies how long Moab waits for a job to checkpoint before canceling it.
If set to 0, Moab does not cancel the job if it fails to checkpoint. (See Checkpoint Overview.)
EXAMPLE
Moab cancels any job that has not exited 5 minutes after receiving a checkpoint request.
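Several attributes in this table share the [[[DD:]HH:]MM:]SS time format. As a rough illustration of how such a value maps to seconds, the following Python sketch right-aligns the colon-separated fields against days, hours, minutes, and seconds. The function name parse_duration is hypothetical, and the sketch makes no claim about how Moab itself validates these values.

```python
def parse_duration(text):
    """Convert a [[[DD:]HH:]MM:]SS string to a number of seconds."""
    fields = text.strip().split(":")
    if not 1 <= len(fields) <= 4 or not all(f.isdigit() for f in fields):
        raise ValueError(f"not a valid [[[DD:]HH:]MM:]SS value: {text!r}")
    # Right-align the fields against (days, hours, minutes, seconds).
    multipliers = (86400, 3600, 60, 1)[-len(fields):]
    return sum(int(f) * m for f, m in zip(fields, multipliers))


print(parse_duration("5:00"))      # five minutes
print(parse_duration("01:00:00"))  # one hour
```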
ATTRIBUTE
CLIENT
FORMAT
<PEER>
DEFAULT
use name of resource manager for peer client lookup
DESCRIPTION
If specified, the resource manager will use the peer value to authenticate remote connections. (See configuring peers.) If not specified, the resource manager will search for a CLIENTCFG entry of RM:<RMNAME> in the moab-private.cfg file.
EXAMPLE
RMCFG[clusterBI] CLIENT=clusterB
Moab will look up and use information for peer clusterB when authenticating the clusterBI resource manager.
ATTRIBUTE
CLUSTERQUERYURL
FORMAT
[file://<path> | http://<address> | <path>]
If file:// is specified, Moab treats the destination as a flat text file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file; if just a path is specified, Moab treats the destination as an executable.
EXAMPLE
RMCFG[base] CLUSTERQUERYURL=file:///tmp/cluster.config
Moab reads /tmp/cluster.config when it queries the base resource manager.
ATTRIBUTE
CONFIGFILE
FORMAT
<STRING>
DEFAULT
N/A
DESCRIPTION
Specifies the resource manager specific configuration file that must be used to enable correct API communication.
NOTE: Only valid with LL- and SLURM-based interfaces.
EXAMPLE
The scheduler uses the specified file when establishing the resource manager/scheduler interface connection.
ATTRIBUTE
DATARM
FORMAT
<RM NAME>
DEFAULT
N/A
DESCRIPTION
If specified, the resource manager uses the given storage resource manager to handle staging data in and out.
EXAMPLE
RMCFG[clusterB] DATARM=clusterB_storage
When data staging is required by jobs starting/completing on clusterB, Moab uses the storage interface defined by clusterB_storage to stage and monitor the data.
ATTRIBUTE
DEFAULTCLASS
FORMAT
<STRING>
DEFAULT
N/A
DESCRIPTION
Specifies the class to use if jobs submitted via this resource manager interface do not have an associated class.
EXAMPLE
RMCFG[base] DEFAULTCLASS=batch
Moab assigns the class batch to all jobs from the base resource manager that do not have a class assigned.
NOTE: If you are using PBS as the resource manager, a job will never come from PBS without a class, and the default will never apply.
ATTRIBUTE
DEFAULT.JOB
FORMAT
<STRING>
DEFAULT
N/A
DESCRIPTION
Specifies the job template to use to set various job attributes that are not specified by the submitter.
EXAMPLE
Moab uses the defjob job template to identify and apply job attribute defaults.
ATTRIBUTE
DEFAULTHIGHSPEEDADAPTER
FORMAT
<STRING>
DEFAULT
sn0
DESCRIPTION
Specifies the default high speed switch adapter to use when starting LoadLeveler jobs (supported in version 4.2.2 and higher of Moab and 3.2 of LoadLeveler).
EXAMPLE
RMCFG[base] DEFAULTHIGHSPEEDADAPTER=sn1
(The scheduler will start jobs requesting a high speed adapter on sn1.)
ATTRIBUTE
DESCRIPTION
FORMAT
<STRING>
DEFAULT
N/A
DESCRIPTION
Specifies the human-readable description for the resource manager interface. If white space is used, the description should be quoted.
EXAMPLE
Moab annotates the ganglia resource manager accordingly.
ATTRIBUTE
ENV
FORMAT
Semicolon-delimited (;) list of <KEY>=<VALUE> pairs
DEFAULT
MOABHOMEDIR=<MOABHOMEDIR>
DESCRIPTION
Specifies a list of environment variables that will be passed to URLs of type 'exec://' for that resource manager.
EXAMPLE
The environment variables HOST and RETRYTIME (with values 'node001' and '50' respectively) are passed to /opt/moab/tools/cluster.query.pl and /opt/moab/tools/workload.query.pl when they are executed.
ATTRIBUTE
FLAGS
FORMAT
comma delimited list of zero or more of the following: asyncstart, autostart, autosync, client, fullcp, executionServer, grid, hostingCenter, ignqueuestate, loadbalance, private, report, shared, slavepeer or static
DEFAULT
N/A
DESCRIPTION
Specifies various attributes of the resource manager. See Flag Details for more information.
EXAMPLE
Moab uses this resource manager to perform a single update of node and job objects reported elsewhere.
ATTRIBUTE
FLOWINTERVAL
FORMAT
[[[DD:]HH:]MM:]SS
DEFAULT
01:00:00 (one hour)
DESCRIPTION
Specifies the duration of the flow control sliding window.
EXAMPLE
(The scheduler limits jobs running on this resource manager to no more than 30 jobs every 30 minutes.)
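A sliding-window limit of the kind FLOWINTERVAL describes can be sketched as follows. This Python example is a generic sliding-window counter, not Moab's implementation. The class FlowWindow and its max_starts argument are hypothetical; the text above does not name the attribute that holds the job count.

```python
from collections import deque


class FlowWindow:
    """Allow at most max_starts job starts within any interval_seconds window."""

    def __init__(self, interval_seconds, max_starts):
        self.interval = interval_seconds
        self.max_starts = max_starts
        self._starts = deque()  # timestamps of recent starts, oldest first

    def try_start(self, now):
        # Drop starts that have slid out of the window.
        while self._starts and now - self._starts[0] >= self.interval:
            self._starts.popleft()
        if len(self._starts) >= self.max_starts:
            return False
        self._starts.append(now)
        return True


# A 30-minute window that admits two starts (a small limit keeps the demo short).
window = FlowWindow(interval_seconds=1800, max_starts=2)
results = [window.try_start(t) for t in (0, 10, 20, 1800, 1805)]
print(results)
```

The third request is refused because two starts already fall inside the window; the fourth succeeds once the first start is 30 minutes old.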
ATTRIBUTE
FNLIST
FORMAT
comma delimited list of zero or more of the following: clusterquery, jobcancel, jobrequeue, jobresume, jobstart, jobsuspend, queuequery, resourcequery or workloadquery
DEFAULT
N/A
DESCRIPTION
By default, a resource manager utilizes all functions supported to query and control batch objects. If this parameter is specified, only the listed functions are used.
EXAMPLE
Moab only uses this resource manager interface to load queue configuration information.
ATTRIBUTE
JOBCANCELURL
FORMAT
<protocol>://[<host>[:<port>]][<path>]
DEFAULT
---
DESCRIPTION
Specifies how Moab cancels jobs via the resource manager. (See URL Notes that follow.)
EXAMPLE
Moab executes /opt/moab/job.cancel.lsf.pl to cancel specific jobs.
ATTRIBUTE
JOBEXTENDDURATION
DESCRIPTION
Specifies the minimum and maximum amount of time that can be added to a job's walltime if it is possible for the job to be extended. (See MINWCLIMIT.) As the job runs longer than its current specified minimum wallclock limit (-l minwclimit, for example), Moab attempts to extend the job's limit by the minimum JOBEXTENDDURATION. This continues until the extension can no longer occur (it is blocked by a reservation or job), the maximum JOBEXTENDDURATION is reached, or the user's specified wallclock limit (-l wallclock) is reached. When a job is extended, it is marked as PREEMPTIBLE unless '!' is appended to the end of the configuration string. If '<' is at the end of the string, however, the job is extended the maximum amount possible.
EXAMPLE
Moab extends a job's walltime by 30 seconds each time the job is about to run out of walltime until it is bound by one hour, a reservation/job, or the job's original "maximum" wallclock limit.
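The extension behavior described for JOBEXTENDDURATION can be approximated by a short loop. The Python sketch below is one reading of the description and is not Moab's algorithm. It assumes the maximum value caps the total time added, that an extension is refused entirely when a full step would collide with a blocking reservation or job, and that all times are seconds from job start. The function and parameter names are hypothetical, and the '!' and '<' modifiers are not modeled.

```python
def final_walltime(min_limit, user_limit, min_ext, max_ext, blocked_at=None):
    """Return the wallclock limit reached after repeated extensions."""
    limit = min_limit
    added = 0
    while True:
        # Never exceed the total extension cap or the user's own limit.
        step = min(min_ext, max_ext - added, user_limit - limit)
        if step <= 0:
            return limit
        # Stop if the extension would run into a reservation or another job.
        if blocked_at is not None and limit + step > blocked_at:
            return limit
        limit += step
        added += step


# 30-second steps, at most one hour added in total.
print(final_walltime(600, 7200, 30, 3600))                  # cap reached
print(final_walltime(600, 7200, 30, 3600, blocked_at=700))  # blocked
print(final_walltime(600, 650, 30, 3600))                   # user limit reached
```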
Moab executes /opt/moab/job.modify.dyn.pl to modify specific jobs.
ATTRIBUTE
JOBPREEMPTURL
FORMAT
<protocol>://[<host>[:<port>]][<path>]
DEFAULT
---
DESCRIPTION
Specifies how Moab preempts jobs via the resource manager. (See URL Notes that follow.)
EXAMPLE
Moab executes /opt/moab/job.preempt.condor.pl to preempt specific jobs.
ATTRIBUTE
JOBRSVRECREATE
FORMAT
Boolean
DEFAULT
TRUE
DESCRIPTION
Specifies whether Moab will re-create a job reservation each time job information is updated by a resource manager. (See Considerations for Large Clusters for more information.)
EXAMPLE
Moab only creates a job reservation once when the job first starts.
ATTRIBUTE
JOBSTAGEMETHOD
FORMAT
one of globus, local, or other
DEFAULT
local
DESCRIPTION
Specifies how Moab stages jobs from the server to the resource manager in both local clusters and peer based grids. (See Configuring a Grid with Globus.)
EXAMPLE
RMCFG[base] JOBSTAGEMETHOD=globus
ATTRIBUTE
JOBSTARTURL
FORMAT
<protocol>://[<host>[:<port>]][<path>]
DEFAULT
---
DESCRIPTION
Specifies how Moab starts jobs via the resource manager. (See URL Notes that follow.)
EXAMPLE
Moab triggers the jobstart.cgi script via http to start specific jobs.
ATTRIBUTE
JOBSUBMITURL
FORMAT
<protocol>://[<host>[:<port>]][<path>]
DEFAULT
---
DESCRIPTION
Specifies how Moab submits jobs to the resource manager. (See URL Notes that follow.)
EXAMPLE
Moab submits jobs directly to the database located on host dbserver.flc.com.
(The scheduler loads up to 200 active jobs from the remote Moab peer cluster1.)
ATTRIBUTE
MINETIME
FORMAT
<INTEGER>
DEFAULT
1
DESCRIPTION
Specifies the minimum time in seconds between processing subsequent scheduling events.
EXAMPLE
RMCFG[base] MINETIME=5
(The scheduler batch-processes scheduling events that occur less than five seconds apart.)
ATTRIBUTE
MIN.JOB
FORMAT
<STRING>
DEFAULT
---
DESCRIPTION
Specifies the job template to use to check various minimum/required job attributes that are specified by the submitter.
EXAMPLE
Moab uses the minjob job template to identify and enforce minimum/required job attributes.
ATTRIBUTE
NMPORT
FORMAT
<INTEGER>
DEFAULT
(any valid port number)
DESCRIPTION
Specifies a non-default resource manager node manager port through which extended node attribute information may be obtained.
EXAMPLE
RMCFG[base] NMPORT=13001
(The scheduler contacts the node manager located on each compute node at port 13001.)
ATTRIBUTE
NODEFAILURERSVPROFILE
FORMAT
<STRING>
DEFAULT
N/A
DESCRIPTION
Specifies the rsv template to use when placing a reservation onto failed nodes. (See also NODEFAILURERESERVETIME.)
EXAMPLE
The scheduler will use the long rsv profile when creating reservations over failed nodes belonging to base.
ATTRIBUTE
OMAP
FORMAT
<protocol>://[<host>[:<port>]][<path>]
DEFAULT
---
DESCRIPTION
Specifies an object map file that is used to map credentials and other objects when using this resource manager peer. (See Grid Credential Management for full details.)
EXAMPLE
When communicating with the resource manager peer1, objects are mapped according to the rules defined in the /opt/moab/omap.dat file.
ATTRIBUTE
POLLINTERVAL
FORMAT
[[[DD:]HH:]MM:]SS
DEFAULT
30
DESCRIPTION
Specifies how often the scheduler will poll the resource manager for information.
EXAMPLE
Moab contacts resource manager base every minute for updates.
ATTRIBUTE
POLLTIMEISRIGID
FORMAT
{TRUE|FALSE}
DEFAULT
FALSE
DESCRIPTION
Determines whether the POLLINTERVAL parameter is interpreted as an interval or a set time for contacting.
EXAMPLE
Moab polls the resource manager at startup and on the hour.
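The difference between an interval and a set time can be shown with a little arithmetic. This Python sketch is an interpretation, not Moab's scheduling code. It assumes that a rigid poll time means polls are aligned to clock boundaries that are multiples of POLLINTERVAL, which matches the "on the hour" example for a one-hour interval. The function name and the seconds-since-midnight time base are hypothetical.

```python
def next_poll_time(last_poll, interval, rigid=False):
    """Return the time of the next poll, in seconds since midnight."""
    if rigid:
        # Next clock boundary that is a whole multiple of the interval.
        return (last_poll // interval + 1) * interval
    # Otherwise simply wait one full interval after the previous poll.
    return last_poll + interval


startup = 3700  # 01:01:40
print(next_poll_time(startup, 3600))              # one hour after startup
print(next_poll_time(startup, 3600, rigid=True))  # 02:00:00, on the hour
```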
ATTRIBUTE
PORT
FORMAT
<INTEGER>
DEFAULT
0
DESCRIPTION
Specifies the port on which the scheduler should contact the associated resource manager. The value '0' specifies that the resource manager default port should be used.
EXAMPLE
Moab attempts to contact the PBS server daemon on host cws, port 20001.
ATTRIBUTE
RESOURCETYPE
FORMAT
{COMPUTE|FS|LICENSE|NETWORK}
DEFAULT
---
DESCRIPTION
Specifies which type of resource this resource manager is configured to control. See Native Resource Managers for more information.
EXAMPLE
Resource manager base will function as a NATIVE resource manager and control file systems.
ATTRIBUTE
RMSTARTURL
FORMAT
[exec://<path> | http://<address> | <path>]
If exec:// is specified, Moab treats the destination as an executable file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file.
DEFAULT
---
DESCRIPTION
Specifies how Moab starts the resource manager.
EXAMPLE
Moab executes /tmp/nat.start.pl to start the resource manager base.
ATTRIBUTE
RMSTOPURL
FORMAT
[exec://<path> | http://<address> | <path>]
If exec:// is specified, Moab treats the destination as an executable file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file.
DEFAULT
---
DESCRIPTION
Specifies how Moab stops the resource manager.
EXAMPLE
Moab executes /tmp/nat.stop.pl to stop the resource manager base.
ATTRIBUTE
SBINDIR
FORMAT
<PATH>
DEFAULT
N/A
DESCRIPTION
For use with TORQUE; specifies the location of the TORQUE system binaries (supported in TORQUE 1.2.0p4 and higher).
EXAMPLE
Moab tells TORQUE that its system binaries are located in /usr/local/torque/sbin.
ATTRIBUTE
SERVER
FORMAT
<URL>
DEFAULT
N/A
DESCRIPTION
Specifies the resource management service to use. If not specified, the scheduler locates the resource manager via built-in defaults or, if available, with an information service.
EXAMPLE
Moab attempts to use the LoadLeveler scheduling API at the specified location.
ATTRIBUTE
SET.JOB
FORMAT
<STRING>
DEFAULT
N/A
DESCRIPTION
Specifies the job template to use to force various job attributes regardless of whether or not they are specified by the submitter.
EXAMPLE
Moab uses the setjob job template to identify and enforce mandatory job attributes.
ATTRIBUTE
SLURMFLAGS
FORMAT
<STRING>
DEFAULT
N/A
DESCRIPTION
Specifies characteristics of the SLURM resource manager interface.
EXAMPLE
Moab uses the specified flag to determine interface characteristics with SLURM. The COMPRESSOUTPUT flag instructs Moab to use the compact hostlist format for job submissions to SLURM. The flag NODEDELTAQUERY instructs Moab to request delta node updates when it queries SLURM for node configuration.
ATTRIBUTE
SOFTTERMSIG
FORMAT
<INTEGER> or SIG<X>
DEFAULT
---
DESCRIPTION
Specifies what signal to send the resource manager when a job reaches its soft wallclock limit. (See JOBMAXOVERRUN.)
EXAMPLE
Moab routes the signal SIGUSR1 through the resource manager to the job when a job reaches its soft wallclock limit.
ATTRIBUTE
STAGETHRESHOLD
FORMAT
[[[DD:]HH:]MM:]SS
DEFAULT
N/A
DESCRIPTION
Specifies the maximum time a job waits to start locally before considering being migrated to a remote peer. In other words, if a job's start time on a remote cluster is less than the start time on the local cluster, but the difference between the two is less than STAGETHRESHOLD, then the job is scheduled locally. The aim is to avoid job/data staging overhead if the difference in start times is minimal. NOTE: If this attribute is used, backfill is disabled for the associated resource manager.
EXAMPLE
Moab only migrates jobs to remote_cluster if the jobs can start five minutes sooner on the remote cluster than they could on the local cluster.
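The STAGETHRESHOLD rule reduces to a single comparison. The following Python sketch restates the description above and is not Moab's code; should_migrate and its parameters are hypothetical names, and all values are in seconds.

```python
def should_migrate(local_start, remote_start, stage_threshold):
    """Migrate only if the remote cluster starts the job sooner by at least
    stage_threshold seconds; otherwise schedule the job locally."""
    return local_start - remote_start >= stage_threshold


FIVE_MINUTES = 300
print(should_migrate(1000, 800, FIVE_MINUTES))   # 200 s sooner: stay local
print(should_migrate(1000, 700, FIVE_MINUTES))   # 300 s sooner: migrate
print(should_migrate(1000, 1100, FIVE_MINUTES))  # remote is later: stay local
```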
ATTRIBUTE
STARTCMD
FORMAT
<STRING>
DEFAULT
N/A
DESCRIPTION
Specifies the full path to the resource manager job start client. If the resource manager API fails, Moab executes the specified start command in a second attempt to start the job.
NOTE: Moab calls the start command with the format '<CMD> <JOBID> -H <HOSTLIST>' unless the environment variable 'MOABNOHOSTLIST' is set, in which case Moab passes only the job ID.
EXAMPLE
Moab uses the specified start command if API failures occur when launching jobs.
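The calling convention for STARTCMD can be illustrated by building the argument list. This Python sketch only mirrors the format given in the note above. The helper name, the placeholder path /path/to/start_client, and the comma used to join host names are assumptions; the source does not specify the <HOSTLIST> delimiter.

```python
import os


def build_start_command(start_cmd, job_id, hosts, env=None):
    """Build the argument list '<CMD> <JOBID> -H <HOSTLIST>'."""
    env = os.environ if env is None else env
    argv = [start_cmd, job_id]
    if "MOABNOHOSTLIST" not in env:
        # Assumption: hosts are joined with commas.
        argv += ["-H", ",".join(hosts)]
    return argv


with_hosts = build_start_command(
    "/path/to/start_client", "1234", ["node001", "node002"], env={})
job_id_only = build_start_command(
    "/path/to/start_client", "1234", ["node001", "node002"],
    env={"MOABNOHOSTLIST": "1"})
print(with_hosts)
print(job_id_only)
```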
ATTRIBUTE
SUBMITCMD
FORMAT
<STRING>
DEFAULT
N/A
DESCRIPTION
Specifies the full path to the resource manager job submission client.
EXAMPLE
Moab uses the specified submit command when migrating jobs.
ATTRIBUTE
SUBMITPOLICY
FORMAT
one of NODECENTRIC or PROCCENTRIC
DEFAULT
PROCCENTRIC
DESCRIPTION
If set to NODECENTRIC, each specified node requested by the job is interpreted as a true compute host, not as a task or processor.
EXAMPLE
Moab uses the specified submit policy when migrating jobs.
ATTRIBUTE
SUSPENDSIG
FORMAT
<INTEGER> (valid UNIX signal between 1 and 64)
DEFAULT
--- (resource manager specific default)
DESCRIPTION
If set, Moab sends the specified signal to a job when a job suspend request is issued.
EXAMPLE
Moab uses the specified suspend signal when suspending jobs within the base resource manager.
NOTE: SUSPENDSIG should not be used with TORQUE or other PBS-based resource managers.
ATTRIBUTE
SYNCJOBID
FORMAT
<BOOLEAN>
DEFAULT
---
DESCRIPTION
Specifies that Moab should migrate jobs to the local resource manager queue with a job ID matching the job's Moab-assigned job ID (only available with SLURM).
EXAMPLE
Moab migrates jobs to the SLURM queue with a jobid matching the Moab-assigned job ID.
ATTRIBUTE
SYSTEMMODIFYURL
FORMAT
[exec://<path> | http://<address> | <path>]
If exec:// is specified, Moab treats the destination as an executable file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file.
DEFAULT
---
DESCRIPTION
Specifies how Moab modifies attributes of the system. This interface is used in Data Staging.
EXAMPLE
Moab executes /tmp/system.modify.pl when it modifies system attributes in conjunction with the resource manager base.
ATTRIBUTE
SYSTEMQUERYURL
FORMAT
[file://<path> | http://<address> | <path>]
If file:// is specified, Moab treats the destination as a flat text file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file; if just a path is specified, Moab treats the destination as an executable.
DEFAULT
---
DESCRIPTION
Specifies how Moab queries attributes of the system. This interface is used in Data Staging.
EXAMPLE
Moab reads /tmp/system.query when it queries the system in conjunction with base resource manager.
ATTRIBUTE
TARGETUSAGE
FORMAT
<INTEGER>[%]
DEFAULT
90%
DESCRIPTION
Amount of resource manager resources to explicitly use. In the case of a storage resource manager, indicates the target usage of data storage resources to dedicate to active data migration requests. If the specified value contains a percent sign (%), the target value is a percent of the configured value. Otherwise, the target value is considered to be an absolute value measured in megabytes (MB).
EXAMPLE
Moab schedules data migration requests to never exceed 80% usage of the storage resource manager's disk cache and network resources.
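The two interpretations of a TARGETUSAGE value (percentage or absolute megabytes) can be shown in a few lines. This Python sketch follows the description above; target_usage_mb is a hypothetical helper, and the 5000 MB configured capacity is an invented figure used only for the demonstration.

```python
def target_usage_mb(spec, configured_mb):
    """Interpret an <INTEGER>[%] value as a usage target in megabytes."""
    spec = spec.strip()
    if spec.endswith("%"):
        # A percent sign means a share of the configured value.
        return configured_mb * int(spec[:-1]) / 100
    # Otherwise the value is an absolute number of megabytes.
    return int(spec)


configured = 5000  # hypothetical configured capacity in MB
print(target_usage_mb("80%", configured))
print(target_usage_mb("2048", configured))
```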
ATTRIBUTE
TIMEOUT
FORMAT
<INTEGER>
DEFAULT
30
DESCRIPTION
Time (in seconds) the scheduler waits for a response from the resource manager.
EXAMPLE
Moab waits 40 seconds to receive a response from the resource manager before timing out and giving up. Moab tries again on the next iteration.
ATTRIBUTE
TRANSLATIONSCRIPT
FORMAT
<STRING>
DEFAULT
---
DESCRIPTION
Script to be inserted into user command file if migration involves translation from one resource manager type to another. The script takes two arguments where the first is the source resource manager type and the second is the destination resource manager type. Types can be any of the following: PBS, SLURM, LSF, SGE, CONDOR, LOADLEVELER, or BPROC.
EXAMPLE
Moab inserts a line that will source the specified script into the start of each translated job command file.
ATTRIBUTE
TRIGGER
FORMAT
<TRIG_SPEC>
DEFAULT
---
DESCRIPTION
A trigger specification indicating behaviors to enforce in the event of certain events associated with the resource manager, including resource manager start, stop, and failure.
ATTRIBUTE
TYPE
FORMAT
<RMTYPE>[:<RMSUBTYPE>] where <RMTYPE> is one of LL, LSF, NATIVE, PBS, RMS, SGE, SSS, or WIKI, and the optional <RMSUBTYPE> value is RMS.
DEFAULT
PBS
DESCRIPTION
Specifies type of resource manager to be contacted by the scheduler.
NOTE: For TYPE=WIKI, AUTHTYPE must be set to CHECKSUM. The <RMSUBTYPE> option is currently only used to support Compaq's RMS resource manager in conjunction with PBS. In this case, the value PBS:RMS should be specified.
EXAMPLE
Moab interfaces to two different PBS resource managers, one located on server clusterA at port 15003 and one located on server clusterB at port 15005.
The resource manager allocates two processors at a time for a period of at least 30 minutes if the cluster maintains a 20-minute backlog for more than three minutes.
ATTRIBUTE
UCALLOCSIZE
FORMAT
<INTEGER> (processors)
DEFAULT
1
DESCRIPTION
Specifies the number of additional nodes to allocate each time a dynamic utility computing threshold is reached. This feature is used primarily with utility computing resources.
NOTE: Either UCMAXSIZE or UCALLOCSIZE must be specified to enable performance- or threshold-based automatic utility computing usage.
EXAMPLE
The resource manager allocates four additional nodes each time the utility computing threshold is reached.
ATTRIBUTE
UCMAXSIZE
FORMAT
<INTEGER> (processors)
DEFAULT
1
DESCRIPTION
Specifies the maximum number of nodes the local cluster can allocate in response to utility computing thresholds. This feature is used primarily with utility computing resources.
NOTE: Either UCMAXSIZE or UCALLOCSIZE must be specified to enable performance- or threshold-based automatic utility computing usage.
EXAMPLE
Moab may not allocate more than a total of 256 processors from the utility computing resource even if the utility computing threshold is in violation.
ATTRIBUTE
UCTHRESHOLD
FORMAT
[[[DD:]HH:]MM:]SS
DEFAULT
--- (no activation threshold)
DESCRIPTION
Specifies the cluster backlog duration required before the resource manager automatically activates. This feature is used primarily to activate utility computing resources.
NOTE: This parameter is required to enable performance or threshold based automatic utility computing usage.
EXAMPLE
The resource manager should be activated if the cluster obtains a 20-minute backlog.
ATTRIBUTE
UCTHRESHOLDDURATION
FORMAT
[[[DD:]HH:]MM:]SS
DEFAULT
--- (no threshold duration)
DESCRIPTION
Specifies how long the resource manager's UCTHRESHOLD must be satisfied before resource manager activation is allowed. This parameter prevents statistical spikes from causing unnecessary utility computing allocations. This feature is used primarily to activate utility computing resources.
EXAMPLE
Utility computing resources should be allocated if the cluster maintains a 20-minute backlog for more than three minutes.
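The interaction between UCTHRESHOLD and UCTHRESHOLDDURATION amounts to requiring that the backlog stay at or above the threshold for a sustained period. The Python sketch below illustrates that idea with periodic backlog samples. It is not Moab's implementation; the function name, the sample format, and the choice to reset the timer whenever the backlog dips below the threshold are assumptions.

```python
def should_activate(backlog_samples, threshold, duration):
    """Return True once the backlog has met the threshold for `duration` seconds.

    backlog_samples is a time-ordered list of (time, backlog_seconds) pairs.
    """
    satisfied_since = None
    for time, backlog in backlog_samples:
        if backlog >= threshold:
            if satisfied_since is None:
                satisfied_since = time
            if time - satisfied_since >= duration:
                return True
        else:
            satisfied_since = None  # a dip below the threshold resets the timer
    return False


TWENTY_MINUTES, THREE_MINUTES = 1200, 180
steady = [(0, 1300), (60, 1250), (120, 1400), (180, 1500)]
spike = [(0, 1300), (60, 900), (120, 1400), (180, 1500)]
print(should_activate(steady, TWENTY_MINUTES, THREE_MINUTES))
print(should_activate(spike, TWENTY_MINUTES, THREE_MINUTES))
```

The second series contains a dip at 60 seconds, so the backlog has been sustained for only one minute by the last sample and no activation occurs.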
ATTRIBUTE
VARIABLE
FORMAT
<VAR>=<VAL>[,<VAR>=<VAL>]
DEFAULT
---
DESCRIPTION
Opaque resource manager variables.
EXAMPLE
Moab associates the variable SCHEDDHOST with the value head1 on resource manager base.
ATTRIBUTE
VERSION
FORMAT
<STRING>
DEFAULT
SLURM: 10200 (i.e., 1.2.0)
DESCRIPTION
Resource manager-specific version string.
EXAMPLE
Moab assumes that resource manager base has a version number of 1.1.24.
ATTRIBUTE
WORKLOADQUERYURL
FORMAT
[file://<path> | http://<address> | <path>]
If file:// is specified, Moab treats the destination as a flat text file; if http:// is specified, Moab treats the destination as a hypertext transfer protocol file; if just a path is specified, Moab treats the destination as an executable.
EXAMPLE
Moab executes /opt/moab/tools/job.query.dyn.pl to obtain updated workload information from resource manager dynamic_jobs.
NOTE: For the protocol file, Moab loads the data directly from the text file pointed to by path. For the protocol exec, Moab executes the file pointed to by path and loads the output written to STDOUT. For the protocol http, Moab loads the data from the web based HTTP (hypertext transfer protocol) destination. For the protocol sql, Moab loads the data from the specified database.
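The protocol handling described in the note above can be sketched as a small dispatcher. This Python example is illustrative and is not Moab's loader. The function names are hypothetical, bare paths are treated as executables as stated for the query URLs, and the http and sql cases are left unimplemented because the source gives no detail about them.

```python
import subprocess


def classify_url(url):
    """Return (protocol, target) for a resource manager URL value."""
    for protocol in ("file", "exec", "http", "sql"):
        prefix = protocol + "://"
        if url.startswith(prefix):
            # Keep the full URL for http; strip the scheme for the others.
            return protocol, (url if protocol == "http" else url[len(prefix):])
    # A bare path is treated as an executable.
    return "exec", url


def load(url):
    """Load data for the file and exec protocols."""
    protocol, target = classify_url(url)
    if protocol == "file":
        with open(target) as handle:  # read the flat text file directly
            return handle.read()
    if protocol == "exec":
        # Run the program and use whatever it writes to STDOUT.
        result = subprocess.run([target], capture_output=True,
                                text=True, check=True)
        return result.stdout
    raise NotImplementedError(f"{protocol} is not covered by this sketch")


print(classify_url("file:///tmp/cluster.config"))
print(classify_url("/opt/moab/tools/job.query.dyn.pl"))
```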
13.2.2 Resource Manager Configuration Details
As with all scheduler parameters, RMCFG follows the syntax described within the Parameters Overview.
13.2.2.1 Resource Manager Types
The RMCFG parameter allows the scheduler to interface to multiple types of resource managers using the TYPE or SERVER attributes. Specifying these attributes enables support for any of the resource managers listed below. To further assist in configuration, Integration Guides are provided for PBS, SGE, and LoadLeveler systems.
TYPE
Resource Managers
Details
LL
LoadLeveler version 2.x and 3.x
N/A
LSF
Platform's Load Sharing Facility, version 5.1 and higher
N/A
Moab
Moab Workload Manager
Use the Moab peer-to-peer (grid) capabilities to enable grids and other configurations. (See Grid Configuration.)
SSS
Scalable Systems Software Project version 2.0 and higher
N/A
WIKI
Wiki interface specification version 1.0 and higher
Used for LRM, YRM, ClubMASK, BProc, and others.
13.2.2.2 Resource Manager Name
Moab can support more than one resource manager simultaneously. Consequently, the RMCFG parameter takes an index value such as RMCFG[clusterA] TYPE=PBS. This index value essentially names the resource manager (as done by the deprecated parameter RMNAME). The resource manager name is used by the scheduler in diagnostic displays, logging, and in reporting resource consumption to the allocation manager. For most environments, the selection of the resource manager name can be arbitrary.
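The indexed form of the parameter is easy to take apart mechanically, which can be handy when auditing a configuration file. The Python sketch below is a hypothetical helper, not part of Moab. It assumes that attributes on a line are whitespace-separated <ATTR>=<VALUE> pairs and that quoted values stay together; these assumptions are based on the examples in this section rather than on a formal grammar.

```python
import re
import shlex

_RMCFG_RE = re.compile(r"RMCFG\[(?P<name>[^\]]+)\]\s+(?P<rest>.*)")


def parse_rmcfg(line):
    """Split an RMCFG line into (resource manager name, attribute dict)."""
    match = _RMCFG_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not an RMCFG line: {line!r}")
    attrs = {}
    # shlex keeps quoted values (such as a description with spaces) together.
    for token in shlex.split(match.group("rest")):
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"expected <ATTR>=<VALUE>, got {token!r}")
        attrs[key.upper()] = value
    return match.group("name"), attrs


print(parse_rmcfg("RMCFG[clusterA] TYPE=PBS"))
print(parse_rmcfg("RMCFG[base] NMPORT=13001"))
```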
13.2.2.3 Resource Manager Location
The HOST, PORT, and SERVER attributes can be used to specify how the resource manager should be contacted. For many resource managers (such as OpenPBS, PBSPro, LoadLeveler, SGE, and LSF), the interface correctly establishes contact using default values. These attributes need to be specified only for interfaces that do not include defaults (such as WIKI) or for resource managers that can be configured to run at non-standard locations (such as PBS). In all other cases, the resource manager is automatically located.
13.2.2.4 Resource Manager Flags
The FLAGS attribute can be used to modify many aspects of a resource manager's behavior.
Flag
Description
asyncstart
Jobs started on this resource manager start asynchronously. In this case, the scheduler does not wait for confirmation that the job correctly starts before proceeding. (See Large Cluster Tuning for more information.)
autostart
Jobs staged to this resource manager do not need to be explicitly started by the scheduler. The resource manager itself handles job launch.
autosync
Resource manager starts and stops together with Moab.
NOTE: This requires either that the resource manager support a start and stop API or that the RMSTARTURL and RMSTOPURL attributes be set.
client
A client resource manager object loads no data and provides no services. It is created for diagnostic and statistical purposes only. A client resource manager is created to represent an external entity that is consuming server resources or services and allows a local administrator to track this usage.
dynamicCred
The resource manager creates credentials within the cluster as needed to support workload. (See Identity Manager Overview.)
executionServer
The resource manager is capable of launching and executing batch workload.
fullcp
Always checkpoint full job information (useful with Native resource managers).
grid
The resource manager masks details about local workload and resources and presents only information relevant to the remote server.
hostingCenter
The resource manager interface is used to negotiate an adjustment in dynamic resource access.
ignQueueState
The queue state reported by the resource manager should be ignored. May be used if queues must be disabled inside of a particular resource manager to allow an external scheduler to properly operate.
loadBalance
N/A
localRsvExport
N/A
noautores
By default, if the resource manager does not report CPU usage to Moab because CPU usage is at 0%, Moab assumes full CPU usage. When this flag is set, Moab interprets the missing report as 0% usage.
private
The resources and workload reported by the resource manager are not reported to non-administrator users.
report
N/A
rootSubmit
Moab submits jobs to the resource manager using root's credentials and environment.
shared
Resources of this resource manager may be scheduled by multiple independent sources and may not be assumed to be owned by any single source.
slavepeer
Information from this resource manager may not be used to identify new jobs or nodes. Instead, this information may only be used to update jobs and nodes discovered and loaded from other non-slave resource managers.
static
This resource manager only provides partial object information and this information does not change over time. Consequently, this resource manager may only be called once per object to modify job and node information.
Example
13.2.2.5 Other Attributes
The maximum amount of time Moab waits on a resource manager call is controlled by the TIMEOUT attribute, which defaults to 30 seconds. This attribute rarely needs to be changed. The AUTHTYPE attribute specifies how security over the scheduler/resource manager interface is handled. Currently, only the WIKI interface is affected by this attribute.
Another RMCFG attribute is CONFIGFILE, which specifies the location of the resource manager's primary configuration file and is used when detailed resource manager information not available via the scheduling interface is required. It is currently only used with the LoadLeveler interface and may be specified when using Moab grid-scheduling capabilities.
Finally, the NMPORT attribute allows specification of the resource manager's node manager port and is only required when this port has been set to a non-default value. It is currently only used within PBS to allow MOM specific information to be gathered and utilized by Moab.
13.2.3 Scheduler/Resource Manager Interactions
In the simplest configuration, Moab interacts with the resource manager using the following four primary functions:
GETJOBINFO
Collect detailed state and requirement information about idle, running, and recently completed jobs.
GETNODEINFO
Collect detailed state information about idle, busy, and defined nodes.
STARTJOB
Immediately start a specific job on a particular set of nodes.
CANCELJOB
Immediately cancel a specific job regardless of job state.
Using these four simple commands, Moab enables nearly its entire suite of scheduling functions. More detailed information about resource manager-specific requirements and semantics for each of these commands can be found in the specific resource manager (LL, PBS, or WIKI) overviews.
In addition to these base commands, other commands are required to support advanced features such as dynamic job support, suspend/resume, gang scheduling, and scheduler-initiated checkpoint/restart.
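To make the division of labor between scheduler and resource manager concrete, the four primary functions (GETJOBINFO, GETNODEINFO, STARTJOB, and CANCELJOB) can be modeled as a minimal interface. The Python sketch below is purely illustrative. The class and method names, the in-memory FakeResourceManager, and the first-fit scheduling_pass are all hypothetical and do not represent Moab's API or its scheduling policy.

```python
from typing import Protocol


class ResourceManager(Protocol):
    """The four primary functions, expressed as an interface."""
    def get_job_info(self) -> list: ...
    def get_node_info(self) -> list: ...
    def start_job(self, job_id: str, nodes: list) -> None: ...
    def cancel_job(self, job_id: str) -> None: ...


class FakeResourceManager:
    """In-memory stand-in used only to exercise the interface."""

    def __init__(self, jobs, nodes):
        self.jobs = dict(jobs)    # job id -> {"state": ..., "nodes_needed": n}
        self.nodes = dict(nodes)  # node name -> "idle" or "busy"

    def get_job_info(self):
        return [dict(info, id=job_id) for job_id, info in self.jobs.items()]

    def get_node_info(self):
        return [{"name": n, "state": s} for n, s in self.nodes.items()]

    def start_job(self, job_id, nodes):
        self.jobs[job_id]["state"] = "running"
        for name in nodes:
            self.nodes[name] = "busy"

    def cancel_job(self, job_id):
        self.jobs[job_id]["state"] = "cancelled"


def scheduling_pass(rm):
    """Start idle jobs on idle nodes, first come, first served."""
    idle = [n["name"] for n in rm.get_node_info() if n["state"] == "idle"]
    started = []
    for job in rm.get_job_info():
        need = job["nodes_needed"]
        if job["state"] == "idle" and need <= len(idle):
            chosen, idle = idle[:need], idle[need:]
            rm.start_job(job["id"], chosen)
            started.append(job["id"])
    return started


rm = FakeResourceManager(
    jobs={"j1": {"state": "idle", "nodes_needed": 2},
          "j2": {"state": "idle", "nodes_needed": 2}},
    nodes={"n1": "idle", "n2": "idle", "n3": "idle"})
started = scheduling_pass(rm)
print(started, rm.nodes)
```

Only j1 starts in this run, because a single idle node remains after it is placed and j2 needs two.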