[torqueusers] Torque-HA resource manager integration

Michael Sternberg sternberg at anl.gov
Fri Jul 25 09:38:54 MDT 2008


How do I tell Moab or Maui how to schedule for an HA-Torque server pair?


I use the following versions:
	torque-2.3.2-snap.200807092141
	moab server version 5.2.1 (revision 9490)


Here's where I am:  I've set up a torque-HA pair with server_priv/  
shared on an HA-NFS mount, following:

	http://www.clusterresources.com/wiki/doku.php?id=torque:4.3_server_high_availability


Both pbs_server processes run, both have the shared lock file open,  
but only one at a time has TCP ports open, as it should:

	lsof -p `ps -ef | awk '/[p]bs_server/ {print $2}'`

Killing either server makes the other reliably take over the ports and
provide service.  On a client, "qstat -a" shows the currently active
server in the header, but always with s01 in the job spec, as
designed.  So far, so good.
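
The same "who is active right now" check can also be done from a client, without shell access to the servers. The following is only a minimal sketch: the helper name accepting_endpoints is made up for illustration, and it assumes that the active pbs_server is the only one of the pair accepting TCP connections on its service port.

```python
import socket

def accepting_endpoints(endpoints, timeout=2.0):
    """Return the (host, port) pairs that accept a TCP connection.

    Hypothetical helper for illustration; not part of Torque or Moab.
    """
    active = []
    for host, port in endpoints:
        try:
            # create_connection raises OSError (refused, timed out,
            # unresolvable host) when nothing is listening there.
            with socket.create_connection((host, port), timeout=timeout):
                active.append((host, port))
        except OSError:
            pass
    return active
```

Calling accepting_endpoints([("s01", 15001), ("s02", 15001)]) should then list only the active server. Port 15001 is an assumption here (the usual pbs_server default); substitute whatever port the servers were actually configured with.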


Now, what do I have to do on the moab/maui side?  I am thinking along
the lines of using the same or different names in "RMCFG[name]", and
likewise the same or different values for EPORT.  Is this the right
place to look?  I've tried in moab.cfg:


(1) Same RM names, same port:

	RMCFG[baseha]	TYPE=PBS HOST=s01 EPORT=15008
	RMCFG[baseha]	TYPE=PBS HOST=s02 EPORT=15008

Fails - no scheduling; every 30s in moab/stats/events.* :

	09:36:55 1216996615 rm       baseha       RMDOWN       cannot connect to RM

I take it the second line overrides the first, and s02 happened to be  
the standby at the time, so no go.


(2) One entry only, joined host names:

	RMCFG[baseha]	TYPE=PBS HOST=s01,s02 EPORT=15008

Fails - no scheduling; RMDOWN events every 30s.

	09:39:59 1216996799 rm       baseha       RMDOWN       cannot connect to RM

OK, I guess that's a syntax error then.  The same happens with "+" and
":" as separators.  (Inspired by Lustre.)


(3) Same RM name, different ports:

	RMCFG[baseha]	TYPE=PBS HOST=s01 EPORT=15008
	RMCFG[baseha]	TYPE=PBS HOST=s02 EPORT=15009

Scheduling works only when s02 is active.  Does not work when s01  
takes over:

	09:50:34 1216997434 rm       baseha       RMDOWN       cannot connect to RM

Again, this probably means the last definition for a given "RMCFG[]"
name wins.


(4) *Different* RM names, same port:

	RMCFG[baseha1]	TYPE=PBS HOST=s01 EPORT=15008
	RMCFG[baseha2]	TYPE=PBS HOST=s02 EPORT=15008

Scheduling works when either s01 or s02 is active.  However, the  
standby RM is always reported as down.

	09:54:00 1216997640 rm       baseha2      RMDOWN       cannot connect to RM
	...
	09:59:59 1216997999 rm       baseha1      RMDOWN       cannot connect to RM


So, number (4) seems to work, but:

(a) Is it safe?

(b) Is it robust (e.g. during RM server failovers)?

(c) The "RMDOWN" events for the standby RM will drown out critical
failures, for example in event notification:

	http://www.clusterresources.com/products/mwm/moabdocs/14.4eventmgmt.shtml

     Can I squelch exactly the log entries relating to the standby  
server?

     I looked at the following docs, but found nothing specific:

	http://www.clusterresources.com/products/mwm/moabdocs/14.2logging.shtml#eventformat
	http://www.clusterresources.com/products/mwm/moabdocs/a.fparameters.shtml#recordeventlist
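
Failing a Moab-side setting, one workaround for (c) would be to post-filter the events file and only raise an alarm when both RMs of the pair are down at about the same time. The sketch below is not a Moab feature: the function names are made up, the field layout is taken from the event lines quoted above, and the 60s window (twice the 30s interval seen here) is an assumption. It also assumes the events arrive in chronological order.

```python
def parse_rmdown(lines):
    """Yield (epoch_seconds, rm_name) for each RMDOWN record."""
    for line in lines:
        fields = line.split()
        # Layout assumed from the excerpts in this post:
        #   HH:MM:SS  epoch  rm  <name>  RMDOWN  <message...>
        if len(fields) >= 5 and fields[2] == "rm" and fields[4] == "RMDOWN":
            yield int(fields[1]), fields[3]

def pair_outages(lines, pair, window=60):
    """Return epoch times at which every RM in `pair` had reported
    RMDOWN within the last `window` seconds (hypothetical helper)."""
    last_down = {}
    outages = []
    for when, name in parse_rmdown(lines):
        if name not in pair:
            continue
        last_down[name] = when
        if all(n in last_down and when - last_down[n] <= window for n in pair):
            outages.append(when)
    return outages
```

With pair=("baseha1", "baseha2"), a lone RMDOWN from the standby produces no outage; only overlapping RMDOWN records from both names do.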


I also have Linux-HA working on the torque-HA node pair, and could
provide a shared IP for the scheduler to talk to.  However, Linux-HA
and pbs_server use different time constants and mechanisms to trigger
failover, so this can only lead to a mess whenever the two disagree
about where the service is running.


Regards, Michael

