[torqueusers] qsub -VI only working from pbs_server

Philippe Weill philippe.Weill at aero.jussieu.fr
Wed Apr 15 22:37:53 MDT 2009


Hi,
I'm having a problem with qsub -VI.
It only works from the pbs_server host (but pbs_server isn't open to users):

[weill at pbsciclad ~]$ qsub -VI
qsub: waiting for job 65783.pbsciclad.ipslnet to start
qsub: job 65783.pbsciclad.ipslnet ready

Disk quotas for user weill :
      Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
           /home 1250288  5000000 5100000           16547       0       0
[weill at ciclad2 ~]$


I'm using TORQUE 2.3.6 with Maui 3.2.6p21 on CentOS 5.2:
Linux ciclad2.ipslnet 2.6.18-53.1.14.el5_lustre.1.6.5.1smp #1 SMP Wed Jun 18 19:45:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux


When I try from another node:

[weill at ciclad1 ~]$ qsub -VI
qsub: waiting for job 65784.pbsciclad.ipslnet to start
qsub: job 65784.pbsciclad.ipslnet apparently deleted

And I get this message on the node allocated by TORQUE:

         ciclad2 pbs_mom: Connection refused (111) in TMomFinalizeChild, cannot open interactive qsub socket to host pbsciclad.ipslnet :50164 - 'cannot bind to port 1023 in client_to_svr - connection refused' - check routing tables/multi-homed host issues
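
The message says that pbs_mom on ciclad2 got "connection refused" when opening a TCP connection to the host and port it names. As a rough way to repeat that kind of check by hand from the allocated node, here is a minimal Python sketch. It is not part of TORQUE, and the function name can_connect is made up for illustration. It only tests whether a TCP connection to a given host and port can be opened. It does not bind a privileged source port as pbs_mom does, and the port in the message (50164) belongs to one particular qsub session, so a meaningful test needs a listener that is still running on the target host.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try to open a TCP connection to host:port.

    Returns (True, None) on success, or (False, exc) where exc is the
    OSError raised (for example ConnectionRefusedError, errno 111 on Linux).
    """
    try:
        # create_connection resolves the host name and connects; the
        # context manager closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        return False, exc
```

For example, calling can_connect("pbsciclad.ipslnet", 50164) from ciclad2 while a listener is active on that port would show whether the connection is refused independently of pbs_mom.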

The nodes are not multihomed, and pbs_server is on the same subnet.
The compute nodes and the head node use bonding in 802.3ad mode.

[root at ciclad2 ~]# ip route
172.19.176.128/25 dev bond0  proto kernel  scope link  src 172.19.176.252
default via 172.19.176.254 dev bond0
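
The same-subnet statement can be double-checked against the route shown above. This is only a sketch using Python's standard ipaddress module. The helper name same_subnet is hypothetical, and the address of pbs_server is not shown in this post, so it would have to be filled in by hand.

```python
import ipaddress


def same_subnet(network, *addresses):
    """Return True if every address lies inside the given network."""
    net = ipaddress.ip_network(network)
    return all(ipaddress.ip_address(addr) in net for addr in addresses)


# Values taken from the "ip route" output on ciclad2: the node's source
# address and the default gateway, checked against the bond0 network.
print(same_subnet("172.19.176.128/25", "172.19.176.252", "172.19.176.254"))
```

Adding the pbs_server address as a further argument would confirm whether it falls in 172.19.176.128/25 as well.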

iptables is not in use on any node, and SELinux is disabled everywhere.

ssh works between all hosts (host-based authentication).

If anybody has an idea, please let me know.

Thanks in advance
-- 
  Weill Philippe -  Administrateur Systeme et Reseaux
  CNRS/UPMC/IPSL   LATMOS (UMR 8190)

