[torqueusers] How does qsub determine what interface the user is on when multihomed

Lorin Hochstein lorin at isi.edu
Thu Aug 26 21:24:18 MDT 2010


I'd like to know how torque decides which network interface a command (e.g., qsub, qmgr) was issued from when running on a multihomed system, and whether there's any way to control this.

I keep running into a problem on my multihomed system (frontend.cluster.isi.edu, an Ubuntu 10.04 machine) where torque gets confused about which interface is being used to issue a command.
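
For anyone trying to reproduce this, here is a minimal sketch for checking which local address the kernel picks when connecting to a given host. It rests on an assumption I haven't confirmed in the torque source: that the server identifies the client by the source address of the incoming connection. The function name and the default port (15001, which I believe is the usual pbs_server port) are only illustrative.

```python
import socket
import sys


def source_address_for(dest_host, dest_port=15001):
    """Return the local IPv4 address the kernel would use to reach dest_host.

    dest_port defaults to 15001 (assumed pbs_server port); the value does not
    affect which source address is chosen.
    """
    # connect() on a UDP socket sends no packets; it only makes the kernel
    # choose a route and, with it, a local source address.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((dest_host, dest_port))
        return s.getsockname()[0]


if __name__ == "__main__":
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    print(host, "->", source_address_for(host))
```

Running it with the name in server_name as the argument shows the address a client on this machine would appear to come from, which can then be compared against the addresses on each interface.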

In the past, if torque thought that job submission requests were coming from the "wrong" interface, I just disabled that interface, restarted the server, and then authorized that interface.

However, my latest problem is that torque thinks that commands are being issued from this (virtual) interface:

virbr0    Link encap:Ethernet  HWaddr 36:66:a5:b1:2e:92  
          inet addr:  Bcast:  Mask:
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3899 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:212202 (212.2 KB)

This is used by KVM for doing NAT addressing. Ideally, I'd just turn it off, but for technical reasons I'm not able to do that. 

This made torque very slow in responding to commands like qsub, apparently because torque takes longer to resolve numerical IP addresses. So I added the following line to /etc/hosts: virbr

And now it responds more quickly, but I get incorrect job ownership like this, which causes jobs to remain in the queue forever:

$ qstat -f 65
Job Id: 65.frontend.cluster.isi.edu
    Job_Name = test.sh
    Job_Owner = lorin@virbr
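
The host part of Job_Owner looks like the result of a reverse lookup on the client address, though that is my guess rather than something I've verified. A small sketch for checking what name an address maps back to, and how long the lookup takes (the function name and the injectable resolver argument are just for illustration):

```python
import socket
import sys
import time


def timed_reverse_lookup(addr, resolver=socket.gethostbyaddr):
    """Return (primary_name_or_None, seconds_taken) for a reverse lookup."""
    start = time.monotonic()
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        name = resolver(addr)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses
        name = None
    return name, time.monotonic() - start


if __name__ == "__main__":
    address = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    name, elapsed = timed_reverse_lookup(address)
    print("%s -> %s (%.3f s)" % (address, name, elapsed))
```

A slow or failing lookup for the virbr0 address before the /etc/hosts change, and a fast one returning "virbr" afterwards, would be consistent with what I'm seeing.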

I'd like to issue the following commands to qmgr:

set server acl_hosts += virbr
set server managers += root@virbr
set server operators += root@virbr
set server submit_hosts += virbr

Unfortunately, I can't do this because I'm not authorized to connect from virbr:

$ sudo qmgr -c "set server submit_hosts += virbr"
qmgr obj= svr=default: Unauthorized Request 

I've done everything I can to tell torque to use frontend.cluster.isi.edu as the interface, including:

lorin at island100:~$ cat /var/lib/torque/server_name

$ cat /var/lib/torque/torque.cfg

In /etc/init.d/torque-server:
DAEMON_SERVER_OPTS="-H frontend.cluster.isi.edu"

If I can't somehow tell torque to assume that all "qsub" and "qmgr" commands should be considered from "frontend" (or "localhost"), is there any other way I can change the queue permissions to allow access from


