[torqueusers] ha-torque using DRBD

Stewart.Samuels at sanofi-aventis.com
Tue Oct 14 14:00:21 MDT 2008

If you have a NAS device, you can install the TORQUE software tree on
it and then NFS-mount that tree on both masters rather than use DRBD.  I
read a recent thread in which one of the TORQUE users reported getting
this working, but again, I haven't had the time recently to work on this.


From: Daniel Bourque [mailto:dbourque at weatherdata.com] 
Sent: Tuesday, October 14, 2008 3:08 PM
To: Samuels, Stewart R&D/US
Cc: prakash.velayutham at cchmc.org; torqueusers at supercluster.org
Subject: Re: [torqueusers] ha-torque using DRBD

Thanks, Prakash.


That's how my nodes are set up.

[root at ictc01n05 torque]# cat server_name

Right now, I just have a cron job that runs every 5 minutes to rsync
/var/spool/torque/server_priv, and failover is done cold. Obviously
this carries a high risk of inconsistent TORQUE data when a failover
occurs, which is why I want to throw DRBD into the mix.
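
For readers following along, the cold-standby sync described above could be sketched roughly as below. This is only an illustration under stated assumptions: the standby host name `standby-head`, the choice of rsync flags, and the function names are hypothetical and not taken from the actual cron job.

```python
import shlex

SERVER_PRIV = "/var/spool/torque/server_priv"  # path from the message above
STANDBY_HOST = "standby-head"                  # hypothetical standby head node


def build_rsync_command(src=SERVER_PRIV, host=STANDBY_HOST):
    """Return the rsync argv that mirrors src onto the standby host."""
    # -a preserves ownership, permissions and timestamps; --delete removes
    # files on the standby that no longer exist on the active server.
    # The trailing slashes make rsync copy the directory contents in place.
    return ["rsync", "-a", "--delete", f"{src}/", f"{host}:{src}/"]


def crontab_line(minutes=5):
    """Return a crontab entry that runs the sync every `minutes` minutes."""
    return f"*/{minutes} * * * * " + shlex.join(build_rsync_command())


if __name__ == "__main__":
    print(crontab_line())
```

Anything pbs_server writes between two runs exists only on the active node, so a cold failover can lose up to five minutes of state. That is the window a replicated block device is meant to close.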

Daniel Bourque
Sr. Systems Engineer
WeatherData Service Inc
An Accuweather Company

Stewart.Samuels at sanofi-aventis.com wrote: 

	Daniel, Prakash, 
	This may indeed work now with the HA version of TORQUE.  I
haven't had the time or opportunity to work on it lately but have the
same idea as you, Daniel.  The problem with TORQUE in the past is as
described by Prakash.  But now that TORQUE, using the --HA option of
2.3.0+ versions, allows you to have multiple entries in the
"server_name" file, TORQUE may no longer suffer from the same issues. 
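
	A minimal sketch of how a helper script might read a "server_name" file with more than one entry and pick a server to talk to. The exact syntax TORQUE expects for multiple entries is not shown in this thread, so the sketch assumes entries separated by commas or whitespace; the host names and the `is_alive` callback are hypothetical.

```python
import re


def parse_server_name(text):
    """Split server_name contents into a list of host names.

    Assumption: entries are separated by commas and/or whitespace.
    """
    return [h for h in re.split(r"[,\s]+", text.strip()) if h]


def pick_server(hosts, is_alive):
    """Return the first host for which is_alive(host) is true, else None."""
    for host in hosts:
        if is_alive(host):
            return host
    return None


if __name__ == "__main__":
    hosts = parse_server_name("head1,head2\n")  # made-up host names
    # Pretend only the second head node answers.
    print(pick_server(hosts, lambda h: h == "head2"))
```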

	From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Prakash
	Sent: Tuesday, October 14, 2008 1:45 PM
	To: Daniel Bourque
	Cc: torqueusers at supercluster.org
	Subject: Re: [torqueusers] ha-torque using DRBD

	I tried this some months ago for TORQUE, but failed. It looks like,
as mentioned by Stewart Samuels in an earlier thread, TORQUE uses the
gethostbyname routine to identify the host and therefore, when
failing over, the failover node must eventually be identified as
the original system, which is not the ideal thing to do in a heartbeat


	On Oct 14, 2008, at 1:35 PM, Daniel Bourque wrote:

		  I don't have shared storage available on my 2
headnodes. I would like to put /var/spool/torque/server_priv on a drbd
volume, and use heartbeat to stop/start torque during failover.
		  Moab runs on both headnodes in a HA config, since it
does not require shared storage.
		  Is anyone using DRBD & heartbeat in such a way? How
well does it work?
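
		The failover sequence being asked about could be sketched as the ordered steps below. This is a dry-run illustration only, not a tested heartbeat resource script: the DRBD resource name `r0` and device `/dev/drbd0` are hypothetical, and a real heartbeat setup would normally delegate the DRBD and mount steps to its own resource agents.

```python
import subprocess

DRBD_RESOURCE = "r0"                            # hypothetical DRBD resource name
DRBD_DEVICE = "/dev/drbd0"                      # hypothetical DRBD block device
MOUNT_POINT = "/var/spool/torque/server_priv"   # path from the message above


def takeover_commands():
    """Commands for the node that becomes active, in order."""
    return [
        ["drbdadm", "primary", DRBD_RESOURCE],    # promote the local replica
        ["mount", DRBD_DEVICE, MOUNT_POINT],      # mount it where TORQUE expects it
        ["pbs_server"],                           # start the TORQUE server
    ]


def release_commands():
    """Commands for the node that gives up the service, in reverse order."""
    return [
        ["qterm", "-t", "quick"],                 # stop pbs_server, leave jobs running
        ["umount", MOUNT_POINT],                  # release the replicated volume
        ["drbdadm", "secondary", DRBD_RESOURCE],  # demote so the peer can promote
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(takeover_commands())  # dry run: prints the steps without executing them
```

		The ordering is the point of the sketch: the volume must be promoted and mounted before pbs_server starts, and pbs_server must be stopped before the volume is unmounted and demoted.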
		Daniel Bourque
		Sr. Systems Engineer
		WeatherData Service Inc
		An Accuweather Company
