[torqueusers] Torque and Reserved Ports
Jason Williams
jasonw at jhu.edu
Mon Jan 5 13:02:10 MST 2009
Hello All,
I've spent some time googling for an answer to this and haven't really
found one. I have, however, found several people complaining of the
same issue. The problem is that my pbs_server machine seems to be
running out of available reserved ports (ports < 1024). I've traced the
issue to what look like outgoing connections to the pbs_mom instances
on my compute nodes. It seems that the server uses a reserved port on
the local side of each connection, and then, for some reason, the
connection drops into TIME_WAIT and sits there when I examine netstat.
The cluster has about 120 nodes, so the reserved ports can fill up
quite fast, causing all automounted NFS mounts to basically die.
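For anyone who wants to check for the same symptom, here is a rough sketch of how the netstat output could be tallied. It assumes Linux-style `netstat -tan` output (local address in the fourth column, state in the sixth); the function name `reserved_time_wait` and the grouping by remote port are my own choices for illustration, not anything that ships with Torque.

```python
import subprocess
from collections import Counter

# Ports below 1024 are the "reserved" (privileged) range discussed above.
RESERVED_PORT_LIMIT = 1024


def reserved_time_wait(netstat_text):
    """Count TIME_WAIT sockets whose local port is reserved.

    Returns a Counter keyed by the remote port (as a string), so it is
    easy to see which remote service the connections were made to.
    Assumes Linux-style `netstat -tan` columns:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP socket line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        local_port = local.rsplit(":", 1)[-1]
        remote_port = remote.rsplit(":", 1)[-1]
        if (
            state == "TIME_WAIT"
            and local_port.isdigit()
            and int(local_port) < RESERVED_PORT_LIMIT
        ):
            counts[remote_port] += 1
    return counts


if __name__ == "__main__":
    # Run netstat on the pbs_server host and summarize the result.
    output = subprocess.run(
        ["netstat", "-tan"], capture_output=True, text=True, check=True
    ).stdout
    counts = reserved_time_wait(output)
    print("reserved local ports in TIME_WAIT:", sum(counts.values()))
    for remote_port, n in counts.most_common():
        print(f"  remote port {remote_port}: {n}")
```

If nearly all of the counted sockets point at the port pbs_mom listens on, that would be consistent with the server-to-mom connections being the ones eating the reserved range.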
I've searched this list's archives using the search function on the
mailing list page and didn't really come up with anything, so I am
wondering if anyone else has seen this and has a possible solution. Any
suggestions are welcome, as it's causing my users a significant amount
of grief.
I'm also kind of curious to know if anyone happens to know why what
looks like an outgoing connection is using a reserved port on the local
side. That strikes me as a bit odd, but I'm sure there's a good reason
for it.
Thanks
--
Jason Williams
Systems Administrator
Johns Hopkins University
Physics and Astronomy Dept.