[torqueusers] interactive qsub failure

Gabe Turner gabe at msi.umn.edu
Fri Apr 27 15:55:43 MDT 2012

On Fri, Apr 27, 2012 at 02:28:14PM -0700, Kenneth Yoshimoto wrote:
> I'm seeing an intermittent failure with qsub -I
> The message in /var/log/messages is:
> Apr 27 14:07:27 gcn-17-71 pbs_mom: LOG_ERROR::Operation now in progress (115) in TMomFinalizeChild, cannot open interactive qsub socket to host gordon-ln4.local:50620 - 'cannot connect to port 1023 in client_to_svr - connection refused' - check routing tables/multi-homed host issues
> I think my routing is okay, as I can telnet to the login node
> port from the compute node.  I also see some packet exchange to
> the port with tcpdump.  Could the mom be attempting the connection
> before qsub starts listening?  I would have thought qsub would
> start listening before sending the job to pbs_server.  Any ideas
> on what might cause this?

For an interactive session to work, pbs_mom on the compute node needs to
connect back to the qsub process on the submission host, so make sure that
your firewall rules allow that connection.
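
One way to test that path is to attempt a TCP connection from the compute node
to the submission host while qsub -I is still waiting. The following is only a
minimal sketch, assuming Python is available on the compute node. The function
name can_connect is made up for illustration, and the host and port in the
usage comment are copied from the log message above. The port appears to
differ from job to job, so treat it as a placeholder and check the rules for
the whole range your site allows.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper, not part of TORQUE. A refused connection, a
    timeout, or a name-resolution failure all count as "cannot connect".
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts and DNS errors alike.
        return False


# Example usage on the compute node (values taken from the log line, adjust
# them to the port the waiting qsub process is actually listening on):
#   print(can_connect("gordon-ln4.local", 50620))
```

A result of False while qsub is known to be listening would point at filtering
or routing between the two hosts. A result of True only shows that the path was
open at that moment, so it does not rule out an intermittent problem.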

Gabe Turner                                             gabe at msi.umn.edu
HPC Systems Administrator,
University of Minnesota
Supercomputing Institute                          http://www.msi.umn.edu
