Source-Destination Grid Management
Moab Workload Manager® for Grids

17.4 Source-Destination Grid Management

17.4.1 Configuring a Peer Server (Source)

Peer relationships are enabled by creating and configuring a resource manager interface using the RMCFG parameter. This interface defines how a given Moab will load resource and workload information and enforce its scheduling decisions. In non-peer cases, the RMCFG parameter points to a resource manager such as TORQUE, LSF, or SGE. However, if the TYPE attribute is set to Moab, the RMCFG parameter can be used to configure and manage a peer relationship.

17.4.1.1 Simple Master-Slave Grid

The first step in creating a new peer relationship is to configure an interface to a destination Moab server. In the following example, cluster C1 is configured to see and use resources from two other clusters, C2 and C3.

moab.cfg (cluster C1)
SCHEDCFG[C1] MODE=NORMAL SERVER=head.C1.xyz.com:41111 
RMCFG[C2]    TYPE=moab   SERVER=head.C2.xyz.com:40559 
RMCFG[C3]    TYPE=moab   SERVER=head.C3.xyz.com:40559
...  

In this example, the C1 Moab will contact the Moab servers running on the C2 and C3 clusters to obtain current resource and workload information and to manage any jobs staged to these clusters from the C1 Moab. In this default configuration, the destination Moabs running on C2 and C3 will behave exactly like a resource manager to the C1 Moab. They will report node status and support job start, cancel, preempt, and submit commands. As far as the C1 Moab is concerned, it is scheduling a standard local cluster and simply using a new interface type called moab. All scheduling tools, policies, and optimizations are available to manage this system. Site web portals and administrator tools continue to operate as if all resources were located locally. The C2 and C3 Moab servers in effect tunnel the information and commands for their local resource managers to the Moab running on C1.
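As a rough illustration of which entries define peer interfaces, the following Python sketch scans moab.cfg text for RMCFG lines whose TYPE attribute is moab. The function name list_moab_peers and the simple whitespace-separated ATTR=value parsing are assumptions made for this example; they are not part of Moab and do not cover every form the configuration syntax may take.

```python
import re

# Matches lines such as: RMCFG[C2] TYPE=moab SERVER=head.C2.xyz.com:40559
RMCFG_LINE = re.compile(r"^\s*RMCFG\[(?P<name>[^\]]+)\]\s+(?P<attrs>.*)$")


def list_moab_peers(cfg_text):
    """Return {interface name: server} for every RMCFG entry with TYPE=moab.

    Hypothetical helper: assumes attributes are whitespace-separated
    ATTR=value pairs on a single line.
    """
    peers = {}
    for line in cfg_text.splitlines():
        match = RMCFG_LINE.match(line)
        if not match:
            continue  # SCHEDCFG, blank lines, "..." and so on
        attrs = {}
        for pair in match.group("attrs").split():
            name, sep, value = pair.partition("=")
            if sep:
                attrs[name.upper()] = value
        # Only TYPE=moab entries describe a peer relationship.
        if attrs.get("TYPE", "").lower() == "moab":
            peers[match.group("name")] = attrs.get("SERVER")
    return peers


# The moab.cfg for cluster C1 from the example above.
C1_CFG = """\
SCHEDCFG[C1] MODE=NORMAL SERVER=head.C1.xyz.com:41111
RMCFG[C2]    TYPE=moab   SERVER=head.C2.xyz.com:40559
RMCFG[C3]    TYPE=moab   SERVER=head.C3.xyz.com:40559
"""

if __name__ == "__main__":
    for name, server in list_moab_peers(C1_CFG).items():
        print(name, server)
```

Run against the C1 configuration, the sketch reports C2 and C3 as peers; an RMCFG entry pointing at an ordinary resource manager would be skipped.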

In this case, one RMCFG parameter is all that is required to configure each peer relationship, provided that standard secret-key-based authentication is used and a shared default secret key exists between the source and destination Moabs. However, if peer relationships with multiple clusters are to be established and a per-peer secret key is to be used (highly recommended), then a CLIENTCFG parameter must be specified to establish the authentication mechanism. Because the secret key must be kept secure, it must be specified in the moab-private.cfg file. For the current example, a per-peer secret key could be set up by creating the following moab-private.cfg file on the C1 cluster.

moab-private.cfg (C1)
CLIENTCFG[RM:C2] KEY=fastclu3t3r  
CLIENTCFG[RM:C3] KEY=14436aaa 

Note: The key specified can be any alphanumeric value and can be locally generated or made up. The only critical aspect is that the keys specified on each end of the peer relationship match.
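Because the key only needs to be alphanumeric and identical on both ends, it can be generated locally. The following Python sketch is one possible way to do so; the helper names make_peer_key and clientcfg_line and the default length of 32 characters are assumptions chosen for illustration, not values required by Moab.

```python
import secrets
import string

# Letters and digits only, matching the "any alphanumeric value" note above.
ALPHABET = string.ascii_letters + string.digits


def make_peer_key(length=32):
    """Return a random alphanumeric key (the length is an arbitrary choice)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def clientcfg_line(peer, key):
    """Format a moab-private.cfg entry in the style used in this section."""
    return f"CLIENTCFG[RM:{peer}] KEY={key}"


if __name__ == "__main__":
    # Generate one key per peer; the same key must then be copied to the
    # matching CLIENTCFG entry on the other side of the relationship.
    for peer in ("C2", "C3"):
        print(clientcfg_line(peer, make_peer_key()))
```

The secrets module is used rather than random so that the generated keys are not predictable.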

Additional information on the CLIENTCFG parameter can be found in the security appendix, including how to use other authentication mechanisms such as X.509 certificates. Also, the Grid Security section provides detailed information on designing, configuring, and troubleshooting peer security.

Continuing with the example, the initial source-side configuration is now complete. On the destination clusters, C2 and C3, the first step is to configure authentication. If a shared default secret key exists between all three clusters, then configuration is complete and the clusters are ready to communicate. If per-peer secret keys are used (recommended), then it will be necessary to create matching moab-private.cfg files on each of the destination clusters. For this example, the following files would be required on C2 and C3 respectively:

moab-private.cfg (C2)
CLIENTCFG[RM:C1] KEY=fastclu3t3r AUTH=admin1

moab-private.cfg (C3)
CLIENTCFG[RM:C1] KEY=14436aaa AUTH=admin1
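Since a mismatched key is an easy mistake to make, a quick consistency check can help before starting the servers. The Python sketch below compares the KEY that the source holds for each destination with the KEY that the destination holds for the source. The helpers peer_keys and mismatched_peers are hypothetical names for this example, and the parser assumes one CLIENTCFG[RM:...] entry per line with whitespace-separated ATTR=value pairs.

```python
import re

# Matches lines such as: CLIENTCFG[RM:C1] KEY=fastclu3t3r AUTH=admin1
CLIENTCFG_LINE = re.compile(r"^\s*CLIENTCFG\[RM:(?P<peer>[^\]]+)\]\s+(?P<attrs>.*)$")


def peer_keys(private_cfg_text):
    """Return {peer name: key} from the text of a moab-private.cfg file."""
    keys = {}
    for line in private_cfg_text.splitlines():
        match = CLIENTCFG_LINE.match(line)
        if not match:
            continue
        for pair in match.group("attrs").split():
            name, sep, value = pair.partition("=")
            if sep and name.upper() == "KEY":
                keys[match.group("peer")] = value
    return keys


def mismatched_peers(source_name, source_cfg, destination_cfgs):
    """Return destinations whose key is missing or differs from the source's.

    destination_cfgs maps each destination name to its moab-private.cfg text.
    """
    source_keys = peer_keys(source_cfg)
    bad = []
    for dest, cfg_text in destination_cfgs.items():
        key_on_source = source_keys.get(dest)
        key_on_dest = peer_keys(cfg_text).get(source_name)
        if key_on_source is None or key_on_source != key_on_dest:
            bad.append(dest)
    return sorted(bad)


# The three moab-private.cfg files from this example.
C1_PRIVATE = "CLIENTCFG[RM:C2] KEY=fastclu3t3r\nCLIENTCFG[RM:C3] KEY=14436aaa\n"
C2_PRIVATE = "CLIENTCFG[RM:C1] KEY=fastclu3t3r AUTH=admin1\n"
C3_PRIVATE = "CLIENTCFG[RM:C1] KEY=14436aaa AUTH=admin1\n"

if __name__ == "__main__":
    print(mismatched_peers("C1", C1_PRIVATE, {"C2": C2_PRIVATE, "C3": C3_PRIVATE}))
```

With the files shown in this section, no mismatches are reported. Note that such a check requires reading every cluster's secret keys in one place, so it should only be run where that is acceptable.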

Once peer security is established, a final, optional step is to configure scheduling behavior on the destination clusters. By default, each destination cluster accepts jobs from each trusted peer. However, each destination cluster also remains fully autonomous, accepting and scheduling locally submitted jobs and enforcing its own local policies and optimizations. If this is the desired behavior, then configuration is complete.

In the current example, with no destination-side scheduling configuration, jobs submitted to cluster C1 can run locally, on cluster C2, or on cluster C3. However, the established configuration does not necessarily enforce a strict master-slave relationship because each destination cluster (C2 and C3) has complete autonomy over how, when, and where it schedules both local and remote jobs. Each cluster can potentially receive jobs that are locally submitted and can also receive jobs from other source Moab servers. See master-slave for more information on setting up a master-slave grid.

Further, each destination cluster will accept any and all jobs migrated to it from a trusted peer without limitations on who can run, when and where they can run, or how many resources they can use. If this behavior is either too restrictive or not restrictive enough, then destination-side configuration will be required.