Moab Cluster Builder™ Installation Procedures
The following information is intended to instruct the new user on how to set up a cluster using Moab Cluster Builder™. Some basic networking knowledge is a prerequisite, but nothing beyond average user skills is needed to perform a standard cluster install.
Before documenting the install procedure in depth, a brief overview of tested and supported cluster architectures is in order.
The network structure, or topology, of a cluster is probably its most important design decision. There are two supported and tested options: (1) a single network and (2) a dual network interface head node.
Single Network
This is the simplest network layout you can plan for a cluster. The head node has only one network interface, and all the compute nodes are on the same network.
As the head node serves DHCP on the network, you must disable DHCP on all routers and switches.
Dual Network Interface Head Node
In this setup, the head node has two network interfaces and is attached to two separate switched networks: (1) an open, or publicly available, network and (2) a closed network.
Users connect to a publicly available network through the head node. Thus, you do not need to disable DHCP on this network.
A closed network is one on which all the compute nodes are located. On a closed network, the head node acts as a DNS/DHCP server, and it also serves as a gateway. Thus, you need to know which interface is connected to the compute network.
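To identify which interface is attached to the compute network, you can inspect the interface addresses from a terminal. The sketch below filters output in the format of `ip -o -4 addr show` for a given compute-network prefix; the sample line, the interface name eth1, and the 10.1. prefix are illustrative assumptions, not values mandated by Moab Cluster Builder.

```shell
# Pick the interface whose address falls in the compute-network range.
# The sample line mimics one line of `ip -o -4 addr show` output; in
# practice, pipe the real command output into the same awk filter.
sample='3: eth1    inet 10.1.1.1/24 brd 10.1.1.255 scope global eth1'
echo "$sample" | awk '$4 ~ /^10\.1\./ {print $2}'   # prints eth1
```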
The Head Node Install
Configure the head node to boot from the DVD drive. Then, insert the SLES HPC DVD and boot from it. An install selection screen opens. If your monitor does not support the default resolution, you can press F3 to change the graphic resolution.
- Use the arrow keys to select Installation on the SLES Boot Options menu and press Enter.
- On the Language screen, specify your language preference and click Next.
- On the License Agreement screen, if you agree to the license agreement terms, select the Yes radio button and click Next.
- On the Installation Mode screen, click Next to accept the default configuration.
- Specify the appropriate region and time zone in the respective Region and Time Zone fields and ensure the time is correct. Then, click Next.
- Review your installation options and packages and click Accept.
- On the Confirm Installation window, click Install. The installer prepares a basic system with your selections and then reboots the machine.
- Ensure that the machine boots from the hard disk this time. Boot from Hard Disk is the default selection on the SLES Boot Options menu; if you do nothing and allow the countdown to reach zero, the machine boots from it automatically. Otherwise, use the arrow keys to select Boot from Hard Disk and press Enter.
- Type a password in the Password for root User field. Then, confirm the password and click Next.
- Clear the Change Hostname via DHCP box. Also, type a hostname in the Hostname field—headnode, for example. Then, type a domain name in the Domain Name field—cluster, for example.
- On the Network Configuration screen, click Network Interfaces to open the Network Card Configuration Overview screen.
- By default, the installer assumes that interfaces use DHCP; you may need to change this setting manually. Select the network card that will communicate with the compute nodes and click Edit.
- Select the Static Address Setup radio button. Then, specify an IP address.
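Behind the scenes, SLES stores this choice in an ifcfg file under /etc/sysconfig/network/. The fragment below is an illustrative sketch only; the file name and the addresses are assumptions for a closed compute network, not required values.

```
# /etc/sysconfig/network/ifcfg-eth1  (name and values are illustrative)
BOOTPROTO='static'
STARTMODE='auto'
IPADDR='10.1.1.1'
NETMASK='255.255.255.0'
```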
- Click Next to return to the Network Card Configuration Overview screen. Then, click Next to return to the Network Configuration screen.
- Click Next again.
- The installer will proceed to configure your network settings.
- The installer can test that your external connection settings are correct. It is okay to skip this test if you prefer. If you do test the network interface, be certain that the interface you are testing is the one connected to the outside network, not the interface you configured for the internal cluster. Specify your preference and click Next.
- If you run the test, the Test Result field indicates connection status. When connection is successful, click Next.
- If you skip the test, review settings and click Next.
If (and only if) you run the Internet connection test and it succeeds, the Novell Customer Center Configuration screen appears. If it does, select Configure Later and click Next.
- For cluster-wide users to work, select LDAP and click Next.
- All LDAP settings are automatically configured. Review the settings and then click Next.
- You must add at least one user to the LDAP database. SSH keys will be created across the whole cluster automatically. After typing the required information, click Next.
- You will now be presented with the SUSE Linux Enterprise Server 10 Release Notes. Click Next when you are ready.
- Review the hardware configuration for the machine. Ensure the graphics settings are correct and click Next.
- The SUSE install is now complete. Click Finish to boot the head node.
Running the Moab Cluster Builder Deployment Wizard
After the machine has finished booting, log in to the head node as root, using the password that you specified during installation. The Moab Cluster Builder Deployment Wizard starts automatically.
If you log in as any user other than root, Moab Cluster Builder will not run; you must log in as root.
The Required Tab
- The initial setup screen of the Moab Cluster Builder Deployment Wizard is the Required tab. From the drop-down list at the top of the screen, select the network card the head node will use to communicate with the compute nodes. If only one interface is correctly configured, it is selected by default.
- In the next field, specify the number of racks in the cluster.
- For each rack specified, two corresponding fields indicate the default numbers of compute nodes (31) and non-compute nodes (1). Click each field and type the appropriate number of nodes for that rack. Then, click Next.
You must select the correct network card or the installation process will fail.
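As a sanity check on the counts above, the total node count is simply racks × (compute + non-compute nodes per rack). A quick shell sketch, where the rack count of 4 is an illustrative assumption:

```shell
# Default per-rack counts from the Required tab: 31 compute + 1 non-compute.
racks=4                      # illustrative value
per_rack=$(( 31 + 1 ))
echo $(( racks * per_rack )) # prints 128
```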
The Advanced Tab
The Advanced section offers options that go beyond the standard cluster configuration. Advanced options appear on the left side of the screen and include Network, Security, Layout, Contact, Packages, and Components. When you click the Advanced tab, the Network screen appears by default.
Check the box if you want the head node to act as a routing device. This is necessary if you chose to use dual interfaces on the head node.
On the Security screen, you can configure the default root password for the compute nodes and a list of administrator users. Click the plus (+) button to add administrator users; click the minus (-) button to remove them.
Leaving the root password empty causes the compute nodes to adopt the same root password as the head node.
To access the layout editor, you must have completed all the required sections.
The layout editor allows the administrator to review and modify the actual setup of the cluster. DNS names, IP assignments, MAC addresses, rack positions, and node profiles can be modified on this screen. You can also import an xCAT-compatible MAC address list.
To ease management, the hostname for a node is made of two separate parts: a node prefix (default: "node") and a specific name (default: a sequential number). The final hostname is formed by concatenating these two values (e.g., "node32" is composed of the prefix "node" and the specific name "32"). The node prefix can be empty.
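The concatenation rule above can be sketched as a tiny shell helper; the function name make_hostname is hypothetical, purely for illustration.

```shell
# Compose a node hostname from a prefix and a specific name.
make_hostname() {
  printf '%s%s\n' "$1" "$2"
}

make_hostname node 32   # prints node32
make_hostname "" 32     # empty prefix is allowed; prints 32
```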
If you have an xCAT-compatible MAC address list, click Import MAC addresses and supply its path. The MAC address list must be an ASCII file containing the intended node specific name and the MAC address, one pair per line, in any order.
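To catch formatting mistakes before importing, you can lint the list against a regular expression. The sketch below assumes a "name MAC" layout per line; the file name maclist.txt and its contents are illustrative, and your list may use a different field order.

```shell
# Create a sample list, then flag any line that is not "<name> <MAC>".
cat > maclist.txt <<'EOF'
node1 00:11:22:33:44:55
node2 00:11:22:33:44:66
EOF

if grep -Eqv '^[A-Za-z0-9_-]+[[:space:]]+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$' maclist.txt
then
  echo "malformed lines found"
else
  echo "list OK"   # this sample passes
fi
```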
The Contact screen allows you to submit the contact information for email alerts from monitoring tools (e.g., Nagios).
On the Packages screen, you can modify the "compute" profile via the AutoYaST editor after the head node configuration is completed.
Do not delete or change the included script (setupcluster.pl), and do not change the PXE or GPG options, or the install might fail.
On the Components screen, you can choose to replace MPICH with another MPI implementation. If you do, you must install the software manually before you run the wizard, and you must install and configure it manually on the compute nodes after the install.
After all the configuration options have been selected, the wizard will set up the head node accordingly. After this final step is completed, the head node is ready to manage the new cluster.
Installing Compute Nodes
After the head node configuration is completed, you must start the node booting process. At the beginning of this procedure (unless you have specified MAC addresses for every node in the advanced configuration panel), all nodes must be off, including those non-compute nodes that might request a DHCP lease (e.g., managed switches).
The wizard will prompt you to boot all nodes in a specific order to allow Moab Cluster Builder to store configuration information and ensure a correct OS install on those nodes that require it. (Booting the machine intended to be node2 before the machine intended to be node1 will transpose the node numbers and IP assignments, displaying the nodes in incorrect order on the Visual Cluster within Moab Cluster Manager.)
If you used the MAC address layout model on the Advanced tab, you can boot up nodes in any order as quickly as desired. Of course, booting a large number of nodes simultaneously may cause network congestion.
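When the boot order matters and your nodes expose IPMI management controllers, the sequential power-on can be scripted. The sketch below only echoes the commands (a dry run); the BMC host names node<N>-bmc, the ADMIN credentials, and the use of IPMI at all are assumptions about your hardware, not part of Moab Cluster Builder.

```shell
# Dry run: print the power-on command for each node in numeric order.
# Remove the leading echo to actually issue the commands, and use a much
# longer delay so each node obtains its DHCP lease before the next boots.
for i in $(seq 1 3); do
  echo ipmitool -H "node${i}-bmc" -U ADMIN -P ADMIN chassis power on
  sleep 1
done
```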
Viewing the Install Screen on the Remote Node with VNC
To connect to an installing node and view the graphical SLES installation, use VNC from the head node. In a terminal:
- Type vncviewer <nodename>:1
- When prompted for a password, type install
Moab Cluster Builder will run a set of predetermined tests on the newly installed hardware and try to run some test jobs. You will be notified of any errors while this procedure takes place.
When the procedure terminates correctly, you will be congratulated on your newly working cluster, and the wizard will terminate.
If It Fails