[torqueusers] cannot connect to port 1023 connection refused

Brock Palen brockp at umich.edu
Fri Apr 9 10:40:37 MDT 2010


We are using privileged ports.  I did some digging on the mailing list
and found issues that others had hit in the past with NFS connections
failing on the head node.  After backing off job_stat_rate and counting
the number of ports below 1024 in use, I found that we could be running
out of them.
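
For anyone who wants to repeat the port count, here is a minimal sketch,
not the exact commands I ran.  It assumes a Linux host and the usual
/proc/net/tcp and /proc/net/udp layout (second column is the local
address as hex IP:hex port).  The function names are made up for
illustration, and 1023 is used as the ceiling because port 0 cannot be
bound.

```python
PRIVILEGED_LIMIT = 1024  # ports 1..1023 are the privileged range


def privileged_ports_in_use(proc_net_text):
    """Return the set of distinct local ports < 1024 found in text
    laid out like /proc/net/tcp (assumed format, header on line 1)."""
    ports = set()
    for line in proc_net_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        # fields[1] looks like "0100007F:03FF"; the port is hex
        port = int(fields[1].rsplit(":", 1)[1], 16)
        if 0 < port < PRIVILEGED_LIMIT:
            ports.add(port)
    return ports


def privileged_ports_remaining(proc_net_text):
    """Rough upper bound on privileged ports still free."""
    return (PRIVILEGED_LIMIT - 1) - len(privileged_ports_in_use(proc_net_text))


if __name__ == "__main__":
    for path in ("/proc/net/tcp", "/proc/net/udp"):
        try:
            with open(path) as handle:
                text = handle.read()
        except OSError:
            continue  # not Linux, or the file is unreadable
        used = privileged_ports_in_use(text)
        print(path, "privileged ports in use:", len(used))
```

TCP and UDP have separate port spaces, so the two files are counted
separately.  Sockets in TIME_WAIT also show up in /proc/net/tcp, which
matters here because they still hold a privileged port.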

I will be trying some of the settings described in this message:
http://www.clusterresources.com/pipermail/torqueusers/2009-February/008715.html

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985



On Apr 9, 2010, at 12:00 PM, David Beer wrote:

> If you're using privileged ports - which it looks like you are from  
> the error message - then each connection will be on a port < 1024,  
> so you're limited to 1024 concurrent connections minus whatever  
> other ports under 1024 are being used by other programs.
>
> David
>
> ----- "Brock Palen" <brockp at umich.edu> wrote:
>
>> We started seeing a lot of messages like:
>>
>> 04/08/2010 00:00:00;0001;   pbs_mom;Svr;pbs_mom;LOG_ERROR::Operation now in progress (115) in post_epilogue, cannot connect to port 1023 in client_to_svr - connection refused
>> 04/08/2010 00:00:00;0001;   pbs_mom;Svr;pbs_mom;LOG_ERROR::Operation now in progress (115) in post_epilogue, cannot connect to port 1023 in client_to_svr - connection refused
>>
>> In the mom logs, jobs still start and end fine, but we are seeing
>> some strange pbs_server pbs_mom sync issues at times.
>>
>> Any idea what these errors mean?
>>
>> netstat -a   on the server shows:
>>
>> udp        0      0 *:1023                      *:*
>>
>> Is there a limit on how fast connections can be made?  We have been
>> adding a number of nodes recently; should we make setting changes on
>> our server?
>>
>> Thanks!
>>
>> Brock Palen
>> www.umich.edu/~brockp
>> Center for Advanced Computing
>> brockp at umich.edu
>> (734)936-1985
>>
>>
>>
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers
>
> -- 
> David Beer | Senior Software Engineer
> Adaptive Computing
>
>
>


