17.13 Grid Data Management
17.13.1 Grid Data Management Overview

Moab provides a highly generalized data manager interface that allows both simple and advanced data management services to be used to migrate data among peer clusters. Using a flexible script interface, services such as scp, NFS, and gridftp can be used to address data staging needs. This section describes data management in a peer-to-peer environment, but it uses the same data staging features that are available in a single cluster configuration.

17.13.2 Peer-to-Peer Initial Data Configuration

As with cluster data staging, there are several models that can be used separately or in concert to manage data within a peer-based grid. These models include global file systems, replicated data servers, and need-based direct input and output data migration. When managing data in peer-to-peer systems, the same configuration semantics are used as for single cluster systems.

At a high level, configuring data staging across a peer-to-peer relationship consists of configuring one or more storage managers, associating them with the appropriate peer resource managers, and then specifying data requirements at the local level when the job is submitted.

17.13.3 Peer-to-Peer SCP Key Authentication

To use scp as the data staging protocol, SSH keys must be created that allow users to copy files between the two peers without passwords. For example, if UserA is present on the source peer and the counterpart on the destination peer is UserB, then UserA must create an SSH key and configure UserB's account to allow password-less copying. This enables UserA to copy files to and from the destination peer using Moab's data staging capabilities.

Another common scenario is that several users on the source peer are mapped to a single user on the destination peer. In this case, each user on the source peer must create a key and register it with the user at the destination peer.
Below are steps that can be used to set up SSH keys between two (or more) peers.

NOTE: These directions were written using OpenSSH version 3.6 and may not transfer correctly to older versions.

Generate SSH Key on Source Peer

As the user who will be submitting jobs on the source peer, run the following command:

You will be prompted for an optional passphrase. Just press Return to skip this and accept the other default settings. When finished, this command will have created two files, id_rsa and id_rsa.pub, inside the user's ~/.ssh/ directory.

Copy the Public SSH Key to the Destination Peer

Transfer the newly created public key (id_rsa.pub) to the destination peer:

Disable Strict SSH Checking on Source Peer (Optional)

By appending the following to your ~/.ssh/config file, you can disable the SSH prompts that ask whether to add new hosts to the "known hosts" file. (These prompts can often cause problems with data staging functionality.) Note that ${DESTPEERHOST} should be the name of the host machine running the destination peer:

Configure Destination Peer User

Now, log in to the destination peer as the destination user and set up the newly created public key to be trusted:

If multiple source users map to a single destination user, repeat the above commands for each source user's SSH public key.

Configure SSH Daemon on Destination Peer

Some configuration of the SSH daemon may be required on the destination peer. Typically, this is done by editing the /etc/ssh/sshd_config file.
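The user-level steps above can be sketched with a few small shell helpers. This is only an illustrative sketch, not the commands from the original directions: the function names, the DESTPEERUSER variable, and the choice of PubkeyAuthentication as the daemon option to check are assumptions, and the helpers rely only on standard OpenSSH file conventions (~/.ssh/config, ~/.ssh/authorized_keys, /etc/ssh/sshd_config).

```shell
#!/bin/sh
# Hypothetical helpers sketching the SSH key setup steps; not part of Moab.

# Source peer: append a per-host stanza that suppresses the
# "add to known hosts" prompt for the destination peer.
# $1 = destination peer host, $2 = config file (default ~/.ssh/config)
disable_strict_checking() {
    dest_host=$1
    config_file=${2:-"$HOME/.ssh/config"}
    mkdir -p "$(dirname "$config_file")"
    # Do nothing if a stanza for this host already exists.
    if [ -f "$config_file" ] && grep -qxF "Host $dest_host" "$config_file"; then
        return 0
    fi
    printf 'Host %s\n    StrictHostKeyChecking no\n' "$dest_host" >> "$config_file"
    chmod 600 "$config_file"
}

# Destination peer: trust a public key copied over from a source user.
# $1 = public key file, $2 = .ssh directory (default ~/.ssh)
trust_public_key() {
    key=$(cat "$1")
    ssh_dir=${2:-"$HOME/.ssh"}
    auth_file="$ssh_dir/authorized_keys"
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    # Append only if this exact key line is not already present.
    if ! grep -qxF "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}

# Destination peer: succeed only if an option is set and not commented out.
# $1 = option name, $2 = expected value, $3 = file (default /etc/ssh/sshd_config)
sshd_option_is_set() {
    grep -Eiq "^[[:space:]]*$1[[:space:]]+$2[[:space:]]*$" "${3:-/etc/ssh/sshd_config}"
}

# Possible usage (DESTPEERUSER is a placeholder for the destination user):
#   ssh-keygen -t rsa                        # source peer; press Return at each prompt
#   scp ~/.ssh/id_rsa.pub "${DESTPEERUSER}@${DESTPEERHOST}:"
#   disable_strict_checking "${DESTPEERHOST}"
#   trust_public_key ~/id_rsa.pub            # run on the destination peer
#   sshd_option_is_set PubkeyAuthentication yes
```

Both file-editing helpers are written to be idempotent, so repeating them for each source user that maps to the same destination user does not duplicate entries.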
To verify the SSH daemon configuration, check that the following attributes are set (not commented out):

If configuration changes were required, the SSH daemon will need to be restarted:

Validate Correct SSH Configuration

If everything is properly configured, issuing the following command on the source peer should succeed without requiring a password:

17.13.4 Peer-to-Peer SCP Data Staging Setup

After SSH key authentication is set up between users on the source and destination peers, Moab can be configured to use SCP-based data staging. A single configuration file in the $TOOLSDIR ($PREFIX/tools) directory must be modified to enable data staging: config.dstage.pl. Modify the $removeExec, $remoteCopy, and $dataSpaceUser parameters to match your system's requirements.

Next, follow the example below to create a storage resource manager using other helper scripts included in Moab 4.5.0 or higher. After making these changes, restart Moab for them to take effect:
17.13.5 Other Peer-to-Peer Data Staging Examples

Below are other examples of how different data staging methods can be configured for different destination peers. Note that copies of the existing scripts have been modified so that they read different config.dstage.pl files: one for SCP and one for GridFTP.

Example: Two Destination Peers with SCP Server and One with GridFTP
As seen in the two examples above, data staging management involves very site-specific configuration. Moab's data staging capabilities provide the flexibility to meet almost any site's needs. Sample storage manager interface scripts are provided with Moab Workload Manager and may be customized as needed to support other protocols or methods. For more information about these interfaces, refer to Interface Scripts for a Storage Resource Manager.
© 2001-2008 Cluster Resources, Incorporated