[torqueusers] Torque-HA resource manager integration

Michael Sternberg sternberg at anl.gov
Fri Jul 25 09:38:54 MDT 2008

How do I tell Moab or Maui how to schedule for an HA-Torque server pair?

I use the following versions:
	moab server version 5.2.1 (revision 9490)

Here's where I am:  I've set up a torque-HA pair with server_priv/  
shared on an HA-NFS mount, following:


Both pbs_server processes run, both have the shared lock file open,  
but only one at a time has TCP ports open, as it should:

	lsof -p `ps -ef | awk '/[p]bs_server/ {print $2}'`

Killing either server makes the other reliably take over the ports and
provide service.  On a client, "qstat -a" shows the currently active
server in the header, but always with s01 in the job spec, as
designed.  So far, so good.
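
For reference, the same "which one is active" check can be made from any
client without lsof.  The sketch below is only an illustration: it assumes
pbs_server listens on the TORQUE default port 15001, and the function name
active_servers is made up here.  It reports which of the given host/port
pairs accept a TCP connection; for a healthy HA pair that should be exactly
one.

```python
import socket


def active_servers(endpoints, timeout=2.0):
    """Return the (host, port) pairs that accept a TCP connection."""
    active = []
    for host, port in endpoints:
        try:
            # A successful connect means this server currently holds the port.
            with socket.create_connection((host, port), timeout=timeout):
                active.append((host, port))
        except OSError:
            # Refused, timed out, or unresolvable: treat as standby or down.
            pass
    return active


# Example call (15001 is an assumed default, adjust to your setup):
#   active_servers([("s01", 15001), ("s02", 15001)])
```

An empty result would mean neither server is reachable; two entries would
mean the lock-file arbitration is not working.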

Now, what do I have to do on the moab/maui side?  I am thinking along  
the lines of using the same or different names in "RMCFG[name]", and  
likewise, for EPORT.  Is this the right place to look?  I've tried in  

(1) Same RM name, same port:

	RMCFG[baseha]	TYPE=PBS HOST=s01 EPORT=15008
	RMCFG[baseha]	TYPE=PBS HOST=s02 EPORT=15008

Fails - no scheduling; every 30s in moab/stats/events.* :

	09:36:55 1216996615 rm       baseha       RMDOWN       cannot connect to RM

I take it the second line overrides the first, and s02 happened to be  
the standby at the time, so no go.

(2) One entry only, joined host names:

	RMCFG[baseha]	TYPE=PBS HOST=s01,s02 EPORT=15008

Fails - no scheduling; RMDOWN events every 30s.

	09:39:59 1216996799 rm       baseha       RMDOWN       cannot connect to RM

OK, I guess that's a syntax error then.  Same with "+" and ":" as a  
separator.  (Inspired by lustre.)

(3) Same RM name, different ports:

	RMCFG[baseha]	TYPE=PBS HOST=s01 EPORT=15008
	RMCFG[baseha]	TYPE=PBS HOST=s02 EPORT=15009

Scheduling works only when s02 is active.  Does not work when s01  
takes over:

	09:50:34 1216997434 rm       baseha       RMDOWN       cannot connect to RM

Again, this probably means the last definition of "RMCFG[baseha]" wins.

(4) *Different* RM names, same port:

	RMCFG[baseha1]	TYPE=PBS HOST=s01 EPORT=15008
	RMCFG[baseha2]	TYPE=PBS HOST=s02 EPORT=15008

Scheduling works when either s01 or s02 is active.  However, the  
standby RM is always reported as down.

	09:54:00 1216997640 rm       baseha2      RMDOWN       cannot connect to RM
	09:59:59 1216997999 rm       baseha1      RMDOWN       cannot connect to RM

So, number (4) seems to work, but:

(a) Is it safe?

(b) Is it robust (e.g. during RM server failovers)?

(c) The "RMDOWN" events for the standby RM will drown out critical  
failures, such as for notification:


     Can I squelch exactly the log entries relating to the standby RM?
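
     Failing a Moab-side setting, one workaround would be to post-filter
     the events file before it reaches any notification hook.  The sketch
     below is an assumption-laden illustration, not a Moab feature: it
     assumes each event sits on one line in the layout shown above (time,
     epoch, object type, object name, event type, message), and the
     function names and the 30 s window are made up to match the polling
     interval seen here.  It drops RMDOWN for one member of the HA pair
     unless the other member also reported RMDOWN within the window, i.e.
     it only lets through the case where both are down.

```python
import re

# time, epoch, object type, object name, event type, free-form message
EVENT_RE = re.compile(r"^\s*(\S+)\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s*(.*)$")


def parse_event(line):
    """Split one events-file line into fields, or return None if it does not fit."""
    m = EVENT_RE.match(line)
    if m is None:
        return None
    _clock, epoch, otype, name, etype, msg = m.groups()
    return {"epoch": int(epoch), "otype": otype, "name": name,
            "etype": etype, "msg": msg.strip()}


def critical_events(lines, ha_pair=("baseha1", "baseha2"), window=30):
    """Yield events, hiding RMDOWN for an HA pair member while its peer is up."""
    last_down = {}  # RM name -> epoch of its most recent RMDOWN
    for line in lines:
        ev = parse_event(line)
        if ev is None:
            continue  # not an event line (e.g. a wrapped fragment)
        pair_down = (ev["otype"] == "rm" and ev["etype"] == "RMDOWN"
                     and ev["name"] in ha_pair)
        if not pair_down:
            yield line.rstrip("\n")  # unrelated events pass through
            continue
        last_down[ev["name"]] = ev["epoch"]
        peers = [n for n in ha_pair if n != ev["name"]]
        # Report only if every peer was also seen down within the window.
        if all(n in last_down and ev["epoch"] - last_down[n] <= window
               for n in peers):
            yield line.rstrip("\n")
```

     With the two RMDOWN lines from case (4) as input, nothing is
     reported, since they are about six minutes apart and never overlap.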

     I looked at the following docs, but found nothing specific:


I also have Linux-HA working on the torque-HA node pair, and could
provide a shared IP for the scheduler to talk to.  However, as
Linux-HA and pbs_server use different time constants and mechanisms to
trigger failover, this can only lead to a mess when the service
locations are incoherent.

Regards, Michael

