[torqueusers] PBS_SERVER Error with cluster service

Jerry Smith jdsmit at sandia.gov
Fri Feb 9 10:10:00 MST 2007



Hi there,


We have Torque 2.1.6 running on 2 Linux machines that work as a failover
cluster.

The problem is that when we start the PBS_SERVER, it starts OK on the cluster
virtual service "cluservice" with the IP 192.168.1.1. But when we start the
pbs_mom on a slave node, it connects to "cluservice", and the reply comes from
the server's local IP address and local name, not the cluster name.

Because of that, the nodes show as down when we run pbsnodes -a.


Any ideas?



Torque is very picky about the order in which addresses are set up in
/etc/hosts.

How are the internal and external addresses set up for your node? I.e., which
is first in /etc/hosts?
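
One quick way to see which entry comes first is to scan the hosts file in
order. The following is only a rough sketch, not how Torque itself resolves
names: the helper names (parse_hosts, first_address_for), the node name
"node1", and the address 192.168.1.10 are made up for illustration. Only
"cluservice" and 192.168.1.1 come from the original post.

```python
def parse_hosts(text):
    """Return [(ip, [names, ...]), ...] in file order, skipping comments and blanks."""
    entries = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no names is not a usable entry
        entries.append((fields[0], fields[1:]))
    return entries


def first_address_for(text, name):
    """Return the address on the first line that lists `name`, or None."""
    for ip, names in parse_hosts(text):
        if name in names:
            return ip
    return None


# Hypothetical hosts content; only the cluservice line reflects the post.
SAMPLE = """\
127.0.0.1     localhost
192.168.1.10  node1        # hypothetical physical address of the server
192.168.1.1   cluservice   # cluster virtual service address
"""

if __name__ == "__main__":
    # To check a real machine, read /etc/hosts instead of SAMPLE:
    #   with open("/etc/hosts") as f: text = f.read()
    for host in ("cluservice", "node1"):
        print(host, "->", first_address_for(SAMPLE, host))
```

If a name appears on more than one line, this sketch reports the earliest
line, which is the ordering question being asked above.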



Jerry

