[torqueusers] Multi-homed execution hosts

Garrick Staples garrick at clusterresources.com
Fri Apr 13 11:18:30 MDT 2007


On Fri, Apr 13, 2007 at 09:19:33AM +0100, Gavin Rees alleged:
> Hello All
> 
> I'm a little confused with the set-up of torque on execution hosts
> which are multi-homed. I am using torque-2.1.2 and trying to set up a
> dual network cluster consisting of a server with 10 attached
> nodes. All 11 machines are attached to the same switch; in addition,
> the last 5 execution hosts are attached to a separate network
> hidden from the pbs server and the first five nodes. The goal is to
> separate the jobs' inter-node communication from the pbs and nfs
> communication, on the last 5 nodes.
> 
> To put it another way, nfs/pbs communication should be done via the IP
> addresses 192.168.0.*, and the jobs' inter-node communication on the
> last 5 nodes should be done via the IP addresses 10.0.0.*.  The server
> (and the first 5 nodes) have no knowledge of the 10.0.0.* network.
> 
> Currently, all communication is done via the 192.168.0.* network and
> all attempts to add the second network have failed. Changing the
> /etc/hosts file on the nodes to point to the execution hosts' 10.0.0.*
> network addresses gives the error:
> 
> pbs_mom;Svr;im_request;connect from 10.0.0.6:1023
> 04/12/2007 12:16:28;0001;   pbs_mom;Svr;pbs_mom;im_request, bad connect from 10.0.0.6:1023 - unauthorized (okclients: 192.168.0.6,192.168.0.5,192.168.0.4,192.168.0.3,192.168.0.2,192.168.0.1,192.168.0.10,192.168.0.9,192.168.0.8,192.168.0.99,192.168.0.7,127.0.0.1)
> 
> My question is, how do I set up torque on the nodes so that pbs and job
> communication are done via separate networks?  
> 
> Thanks in advance for any help you can provide.

I can't think of a way to do what you are trying.  I don't believe it is
possible at this point.
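
To illustrate why the log above reports "unauthorized": the message
suggests that pbs_mom compares the source address of an incoming
connection against its okclients list, which here holds only
192.168.0.* addresses and 127.0.0.1. The following is a minimal Python
sketch of that kind of allow-list check, not TORQUE's actual code. The
function name is_authorized and the "ip:port" parsing are assumptions
made for illustration; the addresses are copied from the log line.

```python
import ipaddress

# Addresses copied from the okclients list in the quoted log line.
OKCLIENTS = {
    ipaddress.ip_address(a)
    for a in (
        "192.168.0.6,192.168.0.5,192.168.0.4,192.168.0.3,192.168.0.2,"
        "192.168.0.1,192.168.0.10,192.168.0.9,192.168.0.8,192.168.0.99,"
        "192.168.0.7,127.0.0.1"
    ).split(",")
}


def is_authorized(peer: str, okclients=OKCLIENTS) -> bool:
    """Hypothetical helper: True if the peer's source IP is allow-listed.

    `peer` is an "ip:port" string as printed in the log; the port is
    ignored and only the address is compared.
    """
    host = peer.rsplit(":", 1)[0]
    return ipaddress.ip_address(host) in okclients


if __name__ == "__main__":
    # A node connecting from its 10.0.0.* interface is not in the list.
    print(is_authorized("10.0.0.6:1023"))
    # The same node connecting from its 192.168.0.* interface is.
    print(is_authorized("192.168.0.6:1023"))
```

Under this reading, pointing /etc/hosts at the 10.0.0.* addresses makes
the nodes connect from an interface that the receiving pbs_mom has no
entry for, so the request is refused before any job traffic starts.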


