[torqueusers] Questions about pbs_server --ha
vgregorio at penguincomputing.com
Fri Apr 10 14:54:56 MDT 2009
Hey folks :)
I've been lurking about for a bit and finally had a question to post.
I am running pbs_server --ha on two systems that share an NFS mount
for /var/spool/torque/server_priv. To test failover, I bring down the
primary server by pulling the power plug. Unfortunately, the secondary
server does not take over and become the primary pbs_server.
Is this because /var/spool/torque/server_priv/server.lock is not removed
when the primary server has a critical failure?
I tried removing the server.lock file, but the secondary pbs_server
--ha instance still never takes over as primary. What is the trigger
that activates a passive pbs_server --ha instance?
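
For anyone who wants to reproduce this, here is a minimal sketch of how one
might check, from the secondary, whether a lock is still held on the file.
It assumes pbs_server --ha relies on POSIX advisory locks (fcntl) on
server.lock, which I have not confirmed; lock_is_free is a hypothetical
helper name, not part of TORQUE.

```python
import fcntl
import os


def lock_is_free(path):
    """Return True if a non-blocking exclusive POSIX lock can be taken on path.

    Assumption: the HA mechanism uses fcntl-style advisory locks on this
    file. If it uses something else, this check says nothing useful.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # Try to take the lock without waiting for the current holder.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Another process (or a lock the NFS server never released)
            # still holds the file.
            return False
        # We got the lock, so nobody else held it; give it back right away.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)
```

Calling lock_is_free("/var/spool/torque/server_priv/server.lock") on the
secondary after the primary loses power would show whether the lock is
still considered held over NFS. Note that it briefly takes the lock
itself, so it is only suitable for a test setup like this one.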
Any advice is appreciated.