[torqueusers] pbs_mom connections to 127.0.0.1:15001
siegert at sfu.ca
Thu Jun 28 12:01:31 MDT 2012
not a dumb question at all ...
I was absolutely sure that I had $pbsserver and server_name set up
correctly, but I checked mom_priv/config nevertheless and, yes,
the $pbsserver line is correct. But there was something else:
a $clienthost line.
This line must have been there ever since we started using torque
and was probably put there just because we copied the initial
configuration from somebody else. Anyway, I did not even know
what $clienthost means. Thus I checked the torque-4.0.1 documentation
and voilà: $clienthost is marked as deprecated, use $pbsserver instead.
Apparently in earlier versions specifying both $clienthost and $pbsserver
did not matter much, but in 4.0.x it does.
I just removed all $clienthost lines from mom_priv/config on all
compute nodes and the error messages are gone.
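
For anyone who has to do the same cleanup on many nodes, here is a minimal
sketch of the edit in Python. It only shows the text transformation: the
function name remove_clienthost and the sample config values (headnode,
localhost, the $logevent line) are made up for illustration, and reading and
writing the actual mom_priv/config file on each node is left out.

```python
def remove_clienthost(config_text):
    """Return config_text with every $clienthost line dropped."""
    kept = []
    for line in config_text.splitlines():
        # Compare the first whitespace-separated token so that
        # indented directives are caught as well.
        if line.split()[:1] == ["$clienthost"]:
            continue
        kept.append(line)
    return "".join(kept_line + "\n" for kept_line in kept)


# Hypothetical config contents, not taken from a real node.
sample = "$pbsserver headnode\n$clienthost localhost\n$logevent 255\n"
print(remove_clienthost(sample), end="")
```

All other lines, including $pbsserver, are passed through unchanged.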
On Thu, Jun 28, 2012 at 09:21:17AM -0600, David Beer wrote:
> Dumb question, but have you already checked the server_name file, as
> well as the mom's config file?
> On Wed, Jun 27, 2012 at 7:38 PM, Martin Siegert <siegert at sfu.ca> wrote:
> with torque-4.0.2 (not with 2.5.11) I see a huge number of log messages
> (one every 45s) in /var/log/messages on all compute nodes:
> Jun 27 18:31:15 b414 pbs_mom: LOG_ERROR::Connection refused (111) in
> tcp_connect_sockaddr, Failed when trying to open tcp connection -
> connect() failed [rc = 15096] [addr = 127.0.0.1:15001]
> Jun 27 18:31:15 b414 pbs_mom: LOG_ERROR::mom_server_update_stat,
> Cannot get a valid stream to send update to server 'localhost'
> Why would the mom try to contact a server 'localhost'?
> How can I get rid of this?
> Martin Siegert
> Simon Fraser University
> Burnaby, British Columbia
> David Beer | Software Engineer
> Adaptive Computing