4.3 Server High Availability

TORQUE can run in a redundant, high availability mode. Multiple instances of the server can be running, each waiting to take over processing if the currently active server fails.

Multiple server host machines

Two server host machines can run pbs_server at the same time. Both servers must have their torque/server_priv directory mounted on a shared NFS file system. Each pbs_server must be started with the --ha command line option, which allows two servers to run at the same time. Only the first server to start completes the full startup. The second server blocks very early in its startup, when it tries to lock the file torque/server_priv/server.lock. When it cannot obtain the lock, it spins in a loop, checking the lock file once per second until the lock clears.

The servers need not run on independent hardware: multiple instances of pbs_server can also run on the same machine. This was not possible previously, because the second instance to start would always write an error and quit when it could not obtain the lock.

How commands select the correct server host

The commands that send messages to pbs_server usually accept a server name on the command line; if none is specified, they use the default server name. The default server name comes either from the environment variable PBS_DEFAULT or from the file torque/server_name. The contents of torque/server_name may now be a comma-separated list of server names. When a command is executed and no explicit server is given, it first attempts to connect to the first server name in the list. If that fails, it tries the second server name. If both servers are unreachable, an error is returned and the command fails.

Note that for a period of time after the current server fails, while the new server is starting up, the new server is unable to process commands.
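
The server selection order described above can be illustrated with a minimal Python sketch. It is not TORQUE's actual client code. The function names are hypothetical, and two details are assumptions rather than statements from this section: that PBS_DEFAULT is consulted before the server_name file, and that both may hold a comma-separated list.

```python
import os
from typing import Callable, List, Mapping, Optional


def parse_server_list(spec: str) -> List[str]:
    """Split a comma-separated server specification into host names."""
    return [name.strip() for name in spec.split(",") if name.strip()]


def default_server_list(server_name_file: str,
                        environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the default server names for a command.

    Assumption: PBS_DEFAULT takes precedence over the server_name file,
    and either source may contain a comma-separated list.
    """
    env = os.environ if environ is None else environ
    spec = env.get("PBS_DEFAULT", "")
    if not spec.strip():
        # Fall back to the file, e.g. torque/server_name.
        with open(server_name_file) as fh:
            spec = fh.read()
    return parse_server_list(spec)


def pick_server(servers: List[str], can_connect: Callable[[str], bool]) -> str:
    """Return the first server, in list order, that accepts a connection."""
    for name in servers:
        if can_connect(name):
            return name
    # Every listed server was unreachable, so the command fails.
    raise ConnectionError("no pbs_server reachable: " + ", ".join(servers))
```

Here can_connect is a caller-supplied probe, for example a function that tries to open a TCP connection to the server's port with a short timeout. Keeping the probe separate from the ordering logic makes the fallback behaviour easy to exercise without a running pbs_server.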
After a failover, the new server must read the existing configuration and job information from disk, so the length of this period depends on the size of the state information stored on disk. Commands issued during this period might fail because their timeouts expire.

Job names

This enhancement also affects how job names are constructed. Job names normally contain the name of the host machine where pbs_server is running. With a server list, only the first name from the server specification list is used in building the job name.

Persistence of the pbs_server process

The system administrator must ensure that pbs_server continues to run on the server nodes. This could be as simple as a cron job that counts the pbs_server processes in the process table and starts more if needed.

High availability of the NFS server

This implementation depends on the NFS file system also being redundant. NFS can be set up as a redundant service. See the following.
There are also other ways to set up a shared file system. See the following.

Example configuration

This section describes the test setup used to verify the operation of this feature. Three machines were used: my desktop machine, jakaa, and two lab machines, node12 and node13, where the pbs_server instances were resident. Commands were submitted from jakaa, which also ran pbs_mom and pbs_sched and was the NFS server for the shared server_priv directory.

The NFS setup on jakaa is shown below. It allows the entire torque directory structure to be shared, although the server machines only mounted the server_priv part of the share.

Next is shown the fstab file that describes how the mounts were done on node12 and node13. Nodes 12 and 13 each had their own /var/spool/torque directories; the NFS mount just replaced the server_priv subdirectory with the shared one on jakaa.

As far as the local configuration on nodes 12 and 13, the only thing done was to set up the server_name file.

The following shows the setup of the batch queue. Note that in this setup, we had to explicitly add node12 and node13 to the acl_hosts list. This requirement was later removed by having the code automatically add the names in the server_name file to the acl_hosts list.

Next are shown the processes running on all of the machines.
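
Separately from the test setup, the cron-driven check mentioned under "Persistence of the pbs_server process" could look roughly like the Python sketch below. It is an illustration under stated assumptions, not a supported tool: the function names are hypothetical, it assumes the pgrep utility is installed, that pbs_server is on the PATH, and that pbs_server detaches into the background when started.

```python
import subprocess
from typing import Sequence


def count_running(process_name: str = "pbs_server") -> int:
    """Count processes whose name matches exactly, using pgrep."""
    result = subprocess.run(["pgrep", "-x", process_name],
                            capture_output=True, text=True)
    # pgrep prints one PID per line; no output means no matching process.
    return len(result.stdout.split())


def servers_to_start(running: int, wanted: int) -> int:
    """Return how many additional instances are needed (never negative)."""
    return max(0, wanted - running)


def ensure_running(wanted: int = 1,
                   command: Sequence[str] = ("pbs_server", "--ha")) -> int:
    """Start enough pbs_server instances to reach `wanted`.

    Assumption: the command returns promptly because the server
    daemonizes itself; `wanted` is a site-specific choice.
    """
    missing = servers_to_start(count_running(), wanted)
    for _ in range(missing):
        subprocess.run(list(command), check=True)
    return missing
```

A cron entry on each server node would then call ensure_running() at a regular interval. Because a server started with --ha simply waits on the lock while another instance is active, starting a replacement is harmless when the primary is still healthy.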
© 2001-2008 Cluster Resources, Incorporated