[torqueusers] ha torque
Daniel Bourque
dbourque at weatherdata.com
Wed Apr 9 12:30:41 MDT 2008
Thanks.
How much disk space does /var/spool/torque/server_priv typically use?
How about the Maui scheduler? Should it be running on both headnodes,
trying to communicate with localhost?
I'm a little confused by the example, where the scheduler runs on the
same hosts as pbs_mom and not pbs_server. Is the intent to also
failover the scheduler along with the shared file system?
Thanks again.
Daniel Bourque
Sr. Systems Engineer
WeatherData Service Inc
An Accuweather Company
Steve Snelgrove wrote:
> The 2.3 release of Torque has support for HA by allowing two head node
> servers to access the server_priv files on a shared file system. See
> http://www.clusterresources.com/torquedocs21/4.3high-availability.shtml
> for more details.
>
>
> Daniel Bourque wrote:
>
>> Hi,
>>
>> We're planning on setting up a torque/Maui cluster. I'm planning
>> on making the head node also be a worker node, and for a 2nd worker
>> node to be a failover headnode.
>>
>> My intent is to use heartbeat to control the state of torque, Maui
>> and a service IP.
>>
>> Is this possible ?
>>
>> what files need to be kept in sync ?
>>
>> if the headnode fails, what happens to running jobs ?
>>
>> if the headnode fails, when Maui starts on the new headnode, will it
>> query the pbs_mom daemons on the worker nodes to get usage info ?
>>
>> Thanks
>>
>