[torqueusers] ha-torque using DRBD
Daniel Bourque
dbourque at weatherdata.com
Tue Oct 14 13:07:52 MDT 2008
Thanks, Prakash.
Stewart,
That's how my nodes are set up:
[root@ictc01n05 torque]# cat server_name
ictc01n01,ictc01n02
Right now, I just have a cron job that runs every 5 minutes to rsync
/var/spool/torque/server_priv, and failover is done cold. Obviously
this carries a high risk of inconsistent TORQUE data when a failover
occurs, which is why I want to add DRBD to the mix.
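For reference, the interim cron-plus-rsync approach can be sketched roughly as follows. This is only an illustration under assumptions: the standby host name (ictc01n02, taken from the server_name file), the rsync flags, and the helper function names are hypothetical and not necessarily what is actually deployed.

```shell
#!/bin/sh
# Sketch of a cold-standby sync for TORQUE's server_priv directory.
# ASSUMPTIONS: peer host name, rsync flags, and function names are
# illustrative only.
SERVER_PRIV=/var/spool/torque/server_priv

# Print the rsync command that mirrors server_priv to the given peer.
# -a preserves ownership, permissions, and timestamps; --delete removes
# files on the standby that no longer exist on the active head node.
build_sync_cmd() {
    peer=$1
    printf 'rsync -a --delete %s/ %s:%s/\n' "$SERVER_PRIV" "$peer" "$SERVER_PRIV"
}

# Print a crontab line that runs the sync every 5 minutes.
build_cron_line() {
    printf '*/5 * * * * %s\n' "$(build_sync_cmd "$1")"
}

build_cron_line ictc01n02
```

Anything pbs_server writes between two runs is missing on the standby after a cold failover, which is the consistency window DRBD's block-level replication is meant to close.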
Daniel Bourque
Sr. Systems Engineer
WeatherData Service Inc
An Accuweather Company
Stewart.Samuels at sanofi-aventis.com wrote:
> Daniel, Prakash,
>
> This may indeed work now with the HA version of TORQUE. I haven't had
> the time or opportunity to work on it lately but have the same idea as
> you, Daniel. The problem with TORQUE in the past is as described by
> Prakash. But now that TORQUE, using the --HA option of 2.3.0+
> versions, allows you to have multiple entries in the "server_name"
> file, TORQUE may no longer suffer from the same issues.
>
> Stewart
>
> *From:* torqueusers-bounces at supercluster.org
> [mailto:torqueusers-bounces at supercluster.org] *On Behalf Of *Prakash
> Velayutham
> *Sent:* Tuesday, October 14, 2008 1:45 PM
> *To:* Daniel Bourque
> *Cc:* torqueusers at supercluster.org
> *Subject:* Re: [torqueusers] ha-torque using DRBD
>
> Hi,
>
> I tried this some months ago for Torque, but failed. Looks like, as
> mentioned by Stewart Samuels in an earlier thread, TORQUE uses the
> gethostbyname routine to identify the host and therefore, when
> failing over, the failover node must eventually be identified
> as the original system, which is not ideal in a
> heartbeat environment.
>
> Prakash
>
> On Oct 14, 2008, at 1:35 PM, Daniel Bourque wrote:
>
>> Hi,
>>
>> I don't have shared storage available on my 2 headnodes. I would
>> like to put /var/spool/torque/server_priv on a drbd volume, and use
>> heartbeat to stop/start torque during failover.
>>
>> Moab runs on both headnodes in a HA config, since it does not
>> require shared storage.
>>
>> Is anyone using DRBD & heartbeat in such a way? How well does it
>> work?
>>
>> Thanks
>>
>> --
>> Daniel Bourque
>> Sr. Systems Engineer
>> WeatherData Service Inc
>> An Accuweather Company