[torqueusers] Torque-1.2.0p5 Server to mom communication error

Garrick Staples garrick at usc.edu
Thu Sep 1 13:03:08 MDT 2005


On Thu, Sep 01, 2005 at 01:36:32PM -0500, Clifton Kirby alleged:
> I never got any response to this post so I thought I would post it again.
> Does anyone else use the --disable-rpp option on larger clusters?  I didn't
> see this problem until I added this option but it was recommended for larger
> clusters and  ours is over 3000 processors.  Thanks in advance..

I don't.  RPP works fine for me with 1700 nodes.  But I know disabling
RPP is popular on other larger clusters, so feel free to ignore me :)

 
> 08/19/2005 13:40:25;0001;   pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
> rm_request, bad attempt to connect - unauthorized (port: 59791)
>         message refused from port 59791 addr 172.16.21.254
> 08/19/2005 13:44:25;0001;   pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
> rm_request, bad attempt to connect - unauthorized (port: 62920)
>         message refused from port 62920 addr 172.16.21.254
> 08/19/2005 13:45:25;0001;   pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
> rm_request, bad attempt to connect - unauthorized (port: 63449)
>         message refused from port 63449 addr 172.16.21.254
> 08/19/2005 13:46:25;0001;   pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
> rm_request, bad attempt to connect - unauthorized (port: 63978)
>         message refused from port 63978 addr 172.16.21.254
> 08/19/2005 13:47:25;0001;   pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
> rm_request, bad attempt to connect - unauthorized (port: 64507)
>         message refused from port 64507 addr 172.16.21.254
> ----------------------------------------------------------------------------

Are these connections coming from pbs_server or the scheduler?  I'd
probably use ktrace to figure that out.

pbs_server should be sending from privileged ports (under 1024) and
should succeed assuming 172.16.21.254 is the first $clienthost in your
MOM config.

It is possible that your scheduler is connecting from unprivileged
ports.  If you are using maui, this is an error.
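
A quick way to see which side of the 1024 boundary the refused
connections fall on is to pull the source ports out of the MOM log.
The following is only a rough sketch: the regular expression is based
on the "message refused from port ... addr ..." lines quoted above,
and the function name refused_sources and the sample text are made up
for illustration, not part of TORQUE.

```python
import re

# Ports below 1024 are privileged: only root can bind them.
PRIVILEGED_LIMIT = 1024

# Matches the "message refused from port N addr A.B.C.D" lines in the MOM log.
REFUSED = re.compile(r"message refused from port (\d+) addr (\S+)")


def refused_sources(log_text):
    """Return (addr, port, is_privileged) for each refused connection."""
    results = []
    for match in REFUSED.finditer(log_text):
        port = int(match.group(1))
        results.append((match.group(2), port, port < PRIVILEGED_LIMIT))
    return results


# Two entries copied from the log excerpt, without the "> " quoting.
sample = """\
rm_request, bad attempt to connect - unauthorized (port: 59791)
        message refused from port 59791 addr 172.16.21.254
rm_request, bad attempt to connect - unauthorized (port: 62920)
        message refused from port 62920 addr 172.16.21.254
"""

for addr, port, privileged in refused_sources(sample):
    kind = "privileged" if privileged else "unprivileged"
    print(f"{addr}:{port} is {kind}")
```

Every port in the excerpt is well above 1024, which fits the
"unauthorized" refusals.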

 
> Seems like mom to server communication is being attempted on a range of
> ports outside the standard 15001-15004.  Should I reserve a range of ports
> in /etc/services?

I don't know what it means to "reserve ports in /etc/services."


-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California