[torqueusers] pbs_mom :: cannot connect to port 1023 in client_to_svr

Adrian Sevcenco Adrian.Sevcenco at cern.ch
Fri Aug 5 04:16:16 MDT 2011


Hi! On one of my newest nodes (24 cores) I am receiving messages like these:
Aug  5 12:18:01 alien-0-37 pbs_mom: LOG_ERROR::Operation now in progress (115) in scan_for_exiting, cannot connect to port 1022 in client_to_svr - connection refused
Aug  5 12:18:05 alien-0-37 pbs_mom: LOG_ERROR::Operation now in progress (115) in post_epilogue, cannot connect to port 1023 in client_to_svr - connection refused
Aug  5 12:18:06 alien-0-37 pbs_mom: LOG_ERROR::Operation now in progress (115) in post_epilogue, cannot connect to port 1023 in client_to_svr - connection refused
Aug  5 12:18:06 alien-0-37 pbs_mom: LOG_ERROR::Operation now in progress (115) in post_epilogue, cannot connect to port 1022 in client_to_svr - connection refused

As far as I can tell, I get no such messages on the other nodes (8 cores),
but I do get them on my other 24-core node.
I should also mention that the network connection from the nodes to the
server is a bottleneck, as the server is also the NFS server and the gateway
for all nodes (their jobs fetch the data to be processed from sources
external to the cluster).
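
To compare how often each node logs these errors, the messages can be
tallied per host, function and port. Below is a minimal Python sketch of
that idea. It assumes each syslog message sits on a single line in the
format quoted above; the names PATTERN and tally_connect_errors are only
illustrative and are not part of TORQUE.

```python
import re
from collections import Counter

# Matches the pbs_mom messages quoted above (assumed format, one message per line).
PATTERN = re.compile(
    r"(?P<host>\S+) pbs_mom: LOG_ERROR::.*? in (?P<func>\w+), "
    r"cannot connect to port (?P<port>\d+) in client_to_svr"
)


def tally_connect_errors(lines):
    """Count 'cannot connect' messages per (host, function, port)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            key = (match.group("host"), match.group("func"), int(match.group("port")))
            counts[key] += 1
    return counts


# The four messages from this post, each joined onto one line.
sample = [
    "Aug  5 12:18:01 alien-0-37 pbs_mom: LOG_ERROR::Operation now in progress (115) in scan_for_exiting, cannot connect to port 1022 in client_to_svr - connection refused",
    "Aug  5 12:18:05 alien-0-37 pbs_mom: LOG_ERROR::Operation now in progress (115) in post_epilogue, cannot connect to port 1023 in client_to_svr - connection refused",
    "Aug  5 12:18:06 alien-0-37 pbs_mom: LOG_ERROR::Operation now in progress (115) in post_epilogue, cannot connect to port 1023 in client_to_svr - connection refused",
    "Aug  5 12:18:06 alien-0-37 pbs_mom: LOG_ERROR::Operation now in progress (115) in post_epilogue, cannot connect to port 1022 in client_to_svr - connection refused",
]

counts = tally_connect_errors(sample)
for (host, func, port), n in sorted(counts.items()):
    print(f"{host} {func} port={port} count={n}")
```

Feeding the function the full syslog of an 8-core node and of a 24-core
node would show whether the errors really appear only on the larger nodes.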

Any idea what is going on? Is this network bottleneck at fault?
Thank you,
Adrian
