[torqueusers] Torque-1.2.0p5 Server to mom communication error
Clifton Kirby
ckirby3 at colsa.com
Thu Sep 1 12:36:32 MDT 2005
I never got any response to this post, so I thought I would post it again.
Does anyone else use the --disable-rpp option on larger clusters? I didn't
see this problem until I added this option, but it was recommended for larger
clusters and ours has over 3000 processors. Thanks in advance.
----------------------------------------------------------------------------
Running on Mac OS X 10.4.2 using Myrinet.
I used gcc 4.0 to compile torque-1.2.0p5, and the configure line I used is
as follows:
./configure --prefix=/opt/torque --with-scp --enable-server --set-sched=c \
    --enable-docs --enable-mom --enable-clients --enable-syslog \
    --set-server-home=/private/var/spool/torque \
    --set-default-server=mach5c.mach5.roc --disable-filesync \
    --disable-gui --disable-rpp
The following messages are being logged in the mom_logs:
----------------------------------------------------------------------------
08/19/2005 13:40:25;0001; pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
rm_request, bad attempt to connect - unauthorized (port: 59791)
message refused from port 59791 addr 172.16.21.254
08/19/2005 13:44:25;0001; pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
rm_request, bad attempt to connect - unauthorized (port: 62920)
message refused from port 62920 addr 172.16.21.254
08/19/2005 13:45:25;0001; pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
rm_request, bad attempt to connect - unauthorized (port: 63449)
message refused from port 63449 addr 172.16.21.254
08/19/2005 13:46:25;0001; pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
rm_request, bad attempt to connect - unauthorized (port: 63978)
message refused from port 63978 addr 172.16.21.254
08/19/2005 13:47:25;0001; pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
rm_request, bad attempt to connect - unauthorized (port: 64507)
message refused from port 64507 addr 172.16.21.254
----------------------------------------------------------------------------
It seems like communication between the server and the moms is being
attempted on a range of ports outside the standard 15001-15004. Should I
reserve a range of ports in /etc/services?
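
For anyone who wants to check their own mom_logs for the same pattern, here
is a rough Python sketch that pulls out the refused source ports and
addresses and flags whether each port falls in 15001-15004. The regular
expressions are based only on the excerpt above, and the function name
summarize_refusals is made up for illustration; other TORQUE versions may
log a different format.

```python
import re
from datetime import datetime

# Standard TORQUE service ports mentioned in this post (15001-15004).
STANDARD_PORTS = range(15001, 15005)

# Timestamp at the start of a mom log record (format assumed from the excerpt).
STAMP_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2});")
# The "message refused" line that follows each rm_request error.
REFUSED_RE = re.compile(r"message refused from port (\d+) addr (\S+)")


def summarize_refusals(log_text):
    """Return (timestamp, addr, port, in_standard_range) for each refusal."""
    results = []
    stamp = None  # timestamp of the most recent record seen
    for line in log_text.splitlines():
        m = STAMP_RE.match(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S")
        m = REFUSED_RE.search(line)
        if m:
            port = int(m.group(2 - 1))
            results.append((stamp, m.group(2), port, port in STANDARD_PORTS))
    return results


# Two records copied from the log excerpt in this post.
SAMPLE = """\
08/19/2005 13:40:25;0001; pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
rm_request, bad attempt to connect - unauthorized (port: 59791)
message refused from port 59791 addr 172.16.21.254
08/19/2005 13:44:25;0001; pbs_mom;Svr;pbs_mom;Unknown error: 0 (0) in
rm_request, bad attempt to connect - unauthorized (port: 62920)
message refused from port 62920 addr 172.16.21.254
"""

if __name__ == "__main__":
    for stamp, addr, port, standard in summarize_refusals(SAMPLE):
        note = "standard" if standard else "outside 15001-15004"
        print(f"{stamp} {addr} port {port} ({note})")
```

To run it against a real log, read the file and pass its contents to
summarize_refusals() in place of SAMPLE.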
- Cliff