[torqueusers] Torque HA in a virtual environment
prakash.velayutham at cchmc.org
Sat Dec 13 07:07:32 MST 2008
I posted this yesterday, but for some reason it got attached to a
different thread, so here it is again.
Has anyone here tested Torque with "--ha" in a VM (VMware based)?
I tried the following:
2 VM Torque nodes running OpenSUSE 10.3, Torque-2.3.5
PBS Mom systems (physical hosts, not VMs) running Torque-2.3.5.
In this case, everything seems to run OK until I submit a large batch of
jobs, and then I start getting errors like:
pbs_iff: cannot read reply from pbs_server
Cannot connect to specified server host 'bmiclustersvc2-int'.
qsub: cannot connect to server bmiclustersvc2-int (errno=111)
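For reference, errno=111 on Linux is ECONNREFUSED, i.e. the TCP connection to pbs_server was actively refused rather than timing out. Below is a minimal Python sketch that could be run on the submit host while the jobs go in, to see whether the server port stops accepting connections. It assumes pbs_server listens on the default port 15001; the function name probe is hypothetical and not part of Torque.

```python
import errno
import socket

# Assumption: pbs_server uses its default port 15001; change if the site overrides it.
PBS_SERVER_PORT = 15001


def probe(host, port=PBS_SERVER_PORT, timeout=2.0):
    """Attempt one TCP connection and return a short status string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # Same condition qsub reports as errno=111 (ECONNREFUSED) on Linux.
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Name-resolution failures and other socket errors end up here.
        return "error: " + errno.errorcode.get(exc.errno, str(exc))


if __name__ == "__main__":
    # Host name taken from the error message above.
    print(probe("bmiclustersvc2-int"))
```

Running this in a loop during a bulk submit would show whether "refused" appears only under load or also when the server is idle.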
Anyone seen this before? Any ideas what could be going wrong?
Thanks in advance,