[torqueusers] Intermittent pbs_server connection problems upon upgrading
Nate Coraor
nate at psu.edu
Mon Jul 26 08:43:22 MDT 2010
Hi all,
I recently upgraded TORQUE from 2.1.11 to 2.4.8, and since then I have
been seeing frequent delays in communication with pbs_server.
qstat often takes 5-10 seconds to respond, and sometimes it doesn't
respond at all (apparently when the response time exceeds 10 seconds),
failing with this error:
pbs_iff: cannot connect to torque.example.org:15001 - timeout, errno=146 (Connection refused)
cannot connect to port 1022 in client_to_svr - connection refused
No Permission.
qstat: cannot connect to server torque.example.org (errno=15007) Unauthorized Request
Subsequent invocations of qstat succeed. When this error occurs,
pbs_server logs nothing of interest, even at loglevel 7, and the
connection attempt itself is not logged at all.
I haven't completely ruled out network problems, but at the very
least, packets aren't being dropped or delayed between the submit
host and the server.
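One way to check raw connection time independently of qstat is a small
timing loop like the one below. This is only a sketch: it assumes
Python 3 is available on the submit host, and the function names
(time_connect, probe) are made up for illustration. It measures only the
TCP connect, not pbs_iff authentication or the pbs_server protocol itself.

```python
import socket
import time


def time_connect(host, port, timeout=10.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and connects, honoring the timeout
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors
        return None


def probe(host, port, attempts=20, pause=1.0, timeout=10.0):
    """Time several connection attempts in a row and return the list of results."""
    results = []
    for _ in range(attempts):
        results.append(time_connect(host, port, timeout))
        time.sleep(pause)
    return results
```

Calling probe("torque.example.org", 15001) from the submit host (host and
port as they appear in the error message above) while qstat is
misbehaving would show whether the plain TCP connect is also slow or
failing (None entries), or whether the delay is somewhere after the
connection is established.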
Is there an obvious place to start?
Thanks,
--nate