[torqueusers] PBS: X11 forwarding init failed from start_exec.. solved

Eva Hocks hocks at sdsc.edu
Tue Feb 25 13:48:37 MST 2014



The issue is IPv6 (socket family 10). On our system IPv6 was disabled
via sysctl, but the IPv6 stack was still present. For pbs_mom to work
with IPv6 disabled, it needs both "options ipv6 disable=1" in
/etc/modprobe.d/ipv6.conf and these settings in sysctl.conf:
  net.ipv6.conf.all.disable_ipv6 = 1
  net.ipv6.conf.default.disable_ipv6 = 1
and a reboot.
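
To see which of the two settings is actually in effect on a node, a small
check like the following may help. It is only a sketch: the helper names
are made up for illustration, and it assumes a Linux node where the sysctl
values appear under /proc/sys/net/ipv6/conf and the module option under
/sys/module/ipv6/parameters/disable. A missing file is reported as None.

```python
from pathlib import Path

# The two sysctl keys named in the message above.
SYSCTL_KEYS = ("all", "default")


def read_flag(path):
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def ipv6_disable_state(proc_root="/proc/sys/net/ipv6/conf",
                       module_param="/sys/module/ipv6/parameters/disable"):
    """Collect the IPv6 disable settings visible on this node.

    Both default paths are assumptions about a typical Linux layout.
    """
    state = {"module disable": read_flag(module_param)}
    for key in SYSCTL_KEYS:
        name = f"net.ipv6.conf.{key}.disable_ipv6"
        state[name] = read_flag(Path(proc_root) / key / "disable_ipv6")
    return state


if __name__ == "__main__":
    for name, value in ipv6_disable_state().items():
        print(f"{name} = {value}")
```

Running it before and after the reboot on a failing node and on a working
node should show where the two differ.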

After that, X11 forwarding works:

$ qsub -I -X

x11_create_display: Socket family 2 is supported
x11_create_display: Socket family 10 *NOT* supported
listening on fd 4
successful x11 init, returning display 50
entering port_forwarder
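
The difference between "supported" and "*NOT* supported" for family 10 can
be probed outside of TORQUE with a few lines of Python. This is a rough
sketch, not a reproduction of what pbs_mom does internally; probe_family
is a hypothetical helper. On Linux, family 2 is AF_INET and family 10 is
AF_INET6. The idea is that a node with the stack present but the address
removed should create the socket fine and then fail on bind with
EADDRNOTAVAIL ("Cannot assign requested address").

```python
import errno
import socket


def probe_family(family, addr, port=0):
    """Try to create and bind a TCP socket; describe what happened.

    Returns "ok", "unsupported", "socket failed: <ERRNO>" or
    "bind failed: <ERRNO>". Port 0 lets the kernel pick a free port.
    """
    try:
        sock = socket.socket(family, socket.SOCK_STREAM)
    except OSError as exc:
        if exc.errno == errno.EAFNOSUPPORT:
            # The address family is not available at all.
            return "unsupported"
        return "socket failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        sock.bind((addr, port))
        return "ok"
    except OSError as exc:
        return "bind failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()


if __name__ == "__main__":
    print("family 2  (AF_INET): ", probe_family(socket.AF_INET, "127.0.0.1"))
    print("family 10 (AF_INET6):", probe_family(socket.AF_INET6, "::1"))
```

Passing a fixed port such as 6050 instead of 0 checks the same port that
shows up in the qsub output.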


-Eva


On Mon, 24 Feb 2014, Eva Hocks wrote:

>
>
> Found a peculiar problem when using X11 forwarding via qsub:
>
> All nodes work right after a fresh install, but after some run time some
> nodes show the error "PBS: X11 forwarding init failed" and the PBS job
> starts without X11 forwarding (the DISPLAY variable is not set).
>
> netstat does not show any of those ports allocated, and a reboot of the
> system does NOT solve it!
>
>
> A working connection looks like:
>
> $ qsub -I -X -l nodes=hpc-0-6
> qsub: waiting for job 359.hpcdev-005.local to start
> qsub: job 359.hpcdev-005.local ready
>
> x11_create_display: Socket family 2 is supported
> x11_create_display: Socket family 10 is supported
> listening on fd 3
> successful x11 init, returning display 50
> entering port_forwarder
>
> and port 6050 is allocated:
> tcp        0      0 127.0.0.1:6010              0.0.0.0:*                   LISTEN
> tcp        0      0 127.0.0.1:6050              0.0.0.0:*                   LISTEN
>
> A non-working connection:
>
> $ qsub -X -I -l nodes=tscc-1-61 -q condo
> qsub: waiting for job 1177312.tscc-mgr.local to start
> qsub: job 1177312.tscc-mgr.local ready
>
>
> x11_create_display: Socket family 2 is supported
> x11_create_display: Socket family 10 is supported
> bind port 6050: Cannot assign requested address
> x11_create_display: Socket family 2 is supported
> x11_create_display: Socket family 10 is supported
> bind port 6051: Cannot assign requested address
> ........
> ........
> x11_create_display: Socket family 2 is supported
> x11_create_display: Socket family 10 is supported
> bind port 6498: Cannot assign requested address
> x11_create_display: Socket family 2 is supported
> x11_create_display: Socket family 10 is supported
> bind port 6499: Cannot assign requested address
> Failed to allocate internet-domain X11 display socket.
> PBS: X11 forwarding init failed
>
> and the port is not allocated on those systems:
> tcp        0      0 localhost.lo:x11-ssh-offset *:*                         LISTEN
>
>
>
>
> Any hint?
> Thanks
> Eva
>
>
>
>
>


