1.2 Basic Configuration
By default, make install installs all files in /usr/local/bin, /usr/local/lib, /usr/local/sbin, /usr/local/include, and /usr/local/man. You can specify an installation prefix other than /usr/local by passing --prefix as an argument to ./configure.
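A minimal sketch of a prefixed build configuration follows. The prefix /opt/torque is an assumed example location, chosen so that libraries land in /opt/torque/lib; the guard only avoids an error when the script is run outside the source tree.

```shell
# PREFIX is an example location, not a required path; pick any
# directory you are allowed to install into.
PREFIX=/opt/torque

# Run this from the top of the unpacked TORQUE source tree.
if [ -x ./configure ]; then
    ./configure --prefix="$PREFIX"
else
    echo "configure script not found; change to the TORQUE source directory first" >&2
fi
```

After a successful configure run, make and make install place files under the chosen prefix instead of /usr/local.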
Verify you have environment variables configured so your system can find the shared libraries and binary files for TORQUE.
To set the library path, add the directory where the TORQUE libraries will be installed. For example, if your TORQUE libraries are installed in /opt/torque/lib, execute the following:
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/torque/lib
> ldconfig
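If the library path is set from a login script, it can help to avoid adding the same directory more than once. The sketch below assumes a Bourne-compatible shell; append_lib_path is a hypothetical helper name, and /opt/torque/lib is the example location used above.

```shell
# Example install location from the text; adjust to your own prefix.
TORQUE_LIB=/opt/torque/lib

# Hypothetical helper: append a directory to LD_LIBRARY_PATH unless
# it is already listed, without leaving a stray leading colon.
append_lib_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH}:" in
        *":${dir}:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${dir}" ;;
    esac
    export LD_LIBRARY_PATH
}

append_lib_path "$TORQUE_LIB"
echo "$LD_LIBRARY_PATH"
```

Calling append_lib_path a second time with the same directory leaves the variable unchanged.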
$TORQUEHOME/server_priv/ contains configuration and other information needed for pbs_server. One of the files in this directory is serverdb. serverdb contains configuration parameters for pbs_server and its queues. In order for pbs_server to run, serverdb has to be initialized.
serverdb can be initialized in two ways: by running pbs_server -t create, or by running the torque.setup script.
Restart pbs_server after initializing serverdb.
> qterm
> pbs_server
pbs_server -t create
The '-t create' option tells pbs_server to create the serverdb file and initialize it with a minimum configuration to run pbs_server. To see the configuration, use qmgr:
> pbs_server -t create
> qmgr -c 'p s'
#
# Set server attributes.
#
set server acl_hosts = kmn
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
A single queue named 'batch' and a few needed server attributes are created.
The torque.setup script uses pbs_server -t create to initialize serverdb, then adds a user as a manager and operator of TORQUE and sets other commonly used attributes. For example, to set up the user ken:
> ./torque.setup ken
> qmgr -c 'p s'
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = kmn
set server managers = ken@kmn
set server operators = ken@kmn
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server mom_job_sync = True
set server keep_completed = 300
The environment variable $TORQUEHOME is where configuration files are stored. For TORQUE 2.1 and later, $TORQUEHOME is /var/spool/torque/. For earlier versions, $TORQUEHOME is /usr/spool/PBS/.
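Before starting pbs_server, it can be useful to check whether serverdb already exists. The sketch below is only an illustration: serverdb_initialized is a hypothetical helper, and treating a non-empty serverdb file as "initialized" is an assumption. Reading server_priv usually requires root privileges.

```shell
# Hypothetical helper: succeed if serverdb exists and is non-empty
# under the given TORQUE home directory.
serverdb_initialized() {
    [ -s "$1/server_priv/serverdb" ]
}

# Default from the text for TORQUE 2.1 and later; an existing
# TORQUEHOME value in the environment takes precedence.
TORQUEHOME=${TORQUEHOME:-/var/spool/torque}

if serverdb_initialized "$TORQUEHOME"; then
    echo "serverdb found in $TORQUEHOME/server_priv"
else
    echo "serverdb not found; initialize it with pbs_server -t create or torque.setup"
fi
```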
The pbs_server needs to know which systems on the network are its compute nodes. Each node must be specified on a line in the server's nodes file. This file is located at $TORQUEHOME/server_priv/nodes. In most cases, it is sufficient to specify just the names of the nodes on individual lines; however, various properties can be applied to each node.
Syntax of nodes file:
node-name[:ts] [np=] [gpus=] [properties]
The [:ts] option marks the node as timeshared. Timeshared nodes are listed by the server in the node status report, but the server does not allocate jobs to them.
The [np=] option specifies the number of virtual processors for a given node. The value can be less than, equal to, or greater than the number of physical processors on any given node.
The [gpus=] option specifies the number of GPUs for a given node. The value can be less than, equal to, or greater than the number of physical GPUs on any given node.
The node processor count can be automatically detected by the TORQUE server if auto_node_np is set to TRUE. This can be set using the command qmgr -c "set server auto_node_np = True". Setting auto_node_np to TRUE overwrites the value of np set in $TORQUEHOME/server_priv/nodes.
The [properties] option allows you to specify arbitrary strings to identify the node. Property strings are alphanumeric characters only and must begin with an alphabetic character.
Comment lines are allowed in the nodes file if the first non-white space character is the pound sign (#).
The example below shows a possible node file listing.
$TORQUEHOME/server_priv/nodes:
# Nodes 001 and 003-005 are cluster nodes
#
node001 np=2 cluster01 rackNumber22
#
# node002 will be replaced soon
node002:ts waitingToBeReplaced
# node002 will be replaced soon
#
node003 np=4 cluster01 rackNumber24
node004 cluster01 rackNumber25
node005 np=2 cluster01 rackNumber26 RAM16GB
node006
node007 np=2
node008:ts np=4
...
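The nodes file rules described above (the optional :ts suffix, np=, gpus=, alphanumeric properties, and # comment lines) can be checked with a short script before restarting pbs_server. This is a sketch, not a TORQUE tool: check_nodes_file is a hypothetical helper, it assumes np= and gpus= take integer values, and it prints "-" where a value is not given rather than guessing a default.

```shell
# Hypothetical helper: summarize each node line and flag tokens that
# do not fit the syntax node-name[:ts] [np=] [gpus=] [properties].
check_nodes_file() {
    awk '
        BEGIN { bad = 0 }
        /^[ \t]*#/ || /^[ \t]*$/ { next }      # skip comment and blank lines
        {
            name = $1; kind = "cluster"
            if (sub(/:ts$/, "", name)) kind = "timeshared"
            np = "-"; gpus = "-"; props = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^np=[0-9]+$/)                 np = substr($i, 4)
                else if ($i ~ /^gpus=[0-9]+$/)          gpus = substr($i, 6)
                else if ($i ~ /^[A-Za-z][A-Za-z0-9]*$/) props = props (props == "" ? "" : ",") $i
                else { printf "error: line %d: unexpected token %s\n", NR, $i; bad = 1 }
            }
            printf "%s type=%s np=%s gpus=%s props=%s\n", name, kind, np, gpus, props
        }
        END { exit bad }
    ' "$1"
}

# Usage example against a small sample file in a temporary location.
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
# sample nodes file
node001 np=2 cluster01 rackNumber22
node002:ts waitingToBeReplaced
node006
EOF
check_nodes_file "$SAMPLE"
rm -f "$SAMPLE"
```

The function exits with a non-zero status if any line contains a token it cannot classify, so it can be used as a guard in a deployment script.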
If you are using TORQUE self-extracting packages with the default compute node configuration, no additional steps are required and you can skip this section.
If installing manually, or advanced compute node configuration is needed, edit the $TORQUEHOME/mom_priv/config file on each node. The recommended settings are below.
$TORQUEHOME/mom_priv/config:
$pbsserver headnode # note: hostname running pbs_server
$logevent 255 # bitmap of which events to log
This file is identical for all compute nodes and can be created on the head node and distributed in parallel to all systems.
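Because the file is the same on every compute node, it can be generated once on the head node and then copied out. The sketch below writes the recommended settings into a staging directory; write_mom_config, SERVER_HOST, and STAGING_DIR are hypothetical names used for illustration, and the copy step to each node's $TORQUEHOME/mom_priv/config is left to whatever distribution tool the site uses.

```shell
# Placeholder values for illustration only.
SERVER_HOST=headnode
STAGING_DIR=$(mktemp -d)

# Hypothetical helper: write the recommended MOM settings.
#   $1 = host name running pbs_server
#   $2 = output file
write_mom_config() {
    cat > "$2" <<EOF
\$pbsserver $1 # note: hostname running pbs_server
\$logevent 255 # bitmap of which events to log
EOF
}

write_mom_config "$SERVER_HOST" "$STAGING_DIR/config"
cat "$STAGING_DIR/config"
```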
After serverdb and the server_priv/nodes file are configured, and MOM has a minimal configuration, restart pbs_server on the server node and pbs_mom on the compute nodes.
Compute Nodes:
> pbs_mom
Server Node:
> qterm -t quick
> pbs_server
After waiting several seconds, the pbsnodes -a command should list all nodes in state free.
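On larger clusters it is easier to list only the nodes that are not free. The sketch below assumes that pbsnodes -a prints each node name at the start of a line, followed by indented "attribute = value" lines including a state line; nodes_not_free is a hypothetical helper, and the pbsnodes call is skipped when the command is not installed.

```shell
# Hypothetical helper: read pbsnodes -a style output on stdin and
# print "name state" for every node whose state is not exactly "free".
nodes_not_free() {
    awk '
        /^[^ \t]/ { node = $1 }    # unindented line: start of a node record
        $1 == "state" && $2 == "=" && $3 != "free" { print node, $3 }
    '
}

# Only query the server when the TORQUE client tools are available.
if command -v pbsnodes >/dev/null 2>&1; then
    pbsnodes -a | nodes_not_free
fi
```

An empty result means every reported node is in state free.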
© 2001-2010 Adaptive Computing Enterprises, Inc.