13.10 Intelligent Platform Management Interface
13.10.1 IPMI Overview

The Intelligent Platform Management Interface (IPMI) specification defines a set of common interfaces that system administrators can use to monitor system health and manage systems. IPMI can report temperature and other sensor readings, query platform status, and power compute nodes on and off. Because IPMI operates independently of the node's operating system, a node can be queried and controlled even when it is powered down. Moab can use IPMI to monitor temperature information, check power status, and power up, power down, and reboot compute nodes.

13.10.2 Node IPMI Configuration

IPMI must be enabled on each node in the compute cluster. This is usually done either through the node's BIOS or by using a boot CD containing IPMI utilities provided by the manufacturer. When configuring IPMI on the nodes, enable IPMI-over-LAN and set a common login and password on all nodes. You must also set a unique IP address for each node's Baseboard Management Controller (BMC). Take note of these addresses; you will need them for the Creating the IPMI BMC-Node Map File section.

13.10.3 Installing IPMItool

IPMItool is an open-source tool used to retrieve sensor information from the IPMI BMC or to send remote chassis power control commands. The IPMItool developer provides Fedora Core binary packages as well as a source tarball on the IPMItool download page. Download and install IPMItool on the Moab head node and make sure the ipmitool binary is in the current shell PATH.

Proper IPMI setup and IPMItool configuration can be confirmed by issuing the following command on the Moab head node.
The output of this command should be similar to the following.
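As an illustration only, a check of this kind could be scripted as in the sketch below. It assumes the confirmation is an ipmitool chassis status query sent over the LAN interface, and that the reply contains a "System Power" line; the function names, the timeout, and the host and credential values passed in are hypothetical and not part of the Moab tools.

```python
import re
import subprocess


def build_chassis_status_cmd(bmc_host, user, password):
    """Return the argv list for an ipmitool chassis status query over LAN.

    Assumption: the BMC accepts the 'lan' interface with the common
    IPMI-over-LAN login configured on the nodes.
    """
    return ["ipmitool", "-I", "lan", "-H", bmc_host,
            "-U", user, "-P", password, "chassis", "status"]


def parse_power_state(output):
    """Return the value of the 'System Power' line in lowercase, or None."""
    match = re.search(r"^System Power\s*:\s*(\w+)", output, re.MULTILINE)
    return match.group(1).lower() if match else None


def query_power_state(bmc_host, user, password, timeout=10):
    """Run ipmitool against one BMC and return its reported power state.

    Raises subprocess.CalledProcessError if ipmitool exits non-zero,
    for example when the BMC is unreachable or the login is rejected.
    """
    result = subprocess.run(
        build_chassis_status_cmd(bmc_host, user, password),
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return parse_power_state(result.stdout)
```

Keeping the command construction and the parsing separate from the call that actually runs ipmitool makes both parts easy to check on a machine that has no BMC attached.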
13.10.4 Creating the IPMI BMC-Node Map File [OPTIONAL]

Because the BMC can be controlled over the LAN, it can have its own IP address, separate from the IP address of the node. Moab therefore needs a simple mapping file to know each node's BMC address. The file is a flat text file and should be stored in the Moab home directory. If a mapping file is needed, specify its name in the config.ipmi.pl configuration file in the tools/ directory. The following is an example of the mapping file:

Note that only the nodes specified in this file are queried for IPMI information. Also note that the mapping file is disabled by default; in that case, the nodes that Moab returns from mdiag -n are the ones queried for IPMI sensor data.

13.10.5 Configuring the Moab IPMI Tools

The tools/ subdirectory of the install directory already contains the Perl scripts needed to interface with IPMI. The following is a list of the Perl scripts that should be in the tools/ directory; confirm these are present and executable.

Next, adjust a few settings in the config.ipmi.pl file. Set the IPMI-over-LAN username and password to the values chosen in the Node IPMI Configuration section. The IPMI query daemon's polling interval can also be changed by adjusting $pollInterval, which specifies how often the IPMI-enabled nodes are queried for sensor data.

13.10.6 Configuring Moab

To allow Moab to use the IPMI tools, configure a native resource manager by adding the following lines to moab.cfg:

Next, the following lines can be added to allow Moab to issue IPMI power commands.

Moab can be configured to perform actions based on sensor data. For example, Moab can shut down a compute node if its CPU temperature exceeds 100 degrees Celsius, or it can power down idle compute nodes if workload is low.
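To make the idea of acting on sensor data concrete, the sketch below shows the kind of comparison involved. It is not how Moab implements the rule; Moab does this through its own configuration. The sketch assumes pipe-delimited sensor rows of the form "name | value | units | status", and the sensor names, node names, and readings are invented sample data.

```python
# Invented sample rows in an assumed "name | value | units | status" layout.
SAMPLE = """\
CPU0 Temp        | 101.000    | degrees C  | cr
CPU1 Temp        | na         | degrees C  | na
"""


def parse_sensor_table(text):
    """Parse pipe-delimited sensor rows into {sensor name: float reading}.

    Rows whose reading is not a number (for example "na") are skipped.
    """
    readings = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 2:
            continue
        try:
            readings[fields[0]] = float(fields[1])
        except ValueError:
            continue  # no usable numeric reading for this sensor
    return readings


def nodes_over_threshold(readings_by_node, sensor, limit):
    """Return the sorted names of nodes whose sensor reading exceeds limit.

    Nodes with no reading for the sensor are never reported.
    """
    return sorted(
        node
        for node, readings in readings_by_node.items()
        if readings.get(sensor, float("-inf")) > limit
    )
```

With readings of 101.0 for a hypothetical node01 and 55.0 for node02, a call such as nodes_over_threshold(readings, "CPU0 Temp", 100.0) selects only node01, which is the node a power-off action would target.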
Generic event thresholds tell Moab to take a given action when a given condition is met. The following example shows how to have Moab power off a compute node if its CPU0 temperature exceeds 100 degrees Celsius.

13.10.7 Ensuring Proper Setup

Once the preceding steps have been taken, start Moab as normal. The IPMI monitoring daemon should start automatically, which can be confirmed with the following:

After a few minutes, IPMI data should be retrieved and cached. This can be confirmed with the following command:

Finally, issue the following to ensure Moab is retrieving the IPMI data. Temperature data should be present in the Generic Metrics row.
© 2001-2008 Cluster Resources, Incorporated