[torqueusers] MOM communication problem
Thomas Vojta
vojtat at umr.edu
Wed Aug 11 13:41:46 MDT 2004
Hi all,
I have encountered a problem with the communication
between pbs_server and pbs_mom (I guess). It has been
discussed here before, but none of the suggestions seem
to work.
- my nodes are never detected, and in pbsnodes -a
they are all marked as "state-unknown,down"
- the logs of pbs_mom only show lines like
08/11/2004 13:34:53;0001; pbs_mom;Svr;pbs_mom;im_eof, End of File from addr 192.168.0.254:15001
08/11/2004 13:43:33;0001; pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 192.168.0.254:15001
08/11/2004 14:16:03;0001; pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 192.168.0.254:15001
or
08/11/2004 14:22:54;0001; pbs_mom;Svr;pbs_mom;im_eof, End of File from addr 192.168.0.254:15001
08/11/2004 14:23:24;0001; pbs_mom;Svr;pbs_mom;im_eof, End of File from addr 192.168.0.254:15001
08/11/2004 14:23:54;0001; pbs_mom;Svr;pbs_mom;im_eof, End of File from addr 192.168.0.254:15001
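To see at a glance which messages repeat and from which address, log lines like the ones above can be tallied with a short script. This is only a minimal sketch: it assumes the semicolon-separated layout visible in the excerpt (six fields, with the message last), and the function name tally_mom_messages is hypothetical, not part of TORQUE.

```python
from collections import Counter


def tally_mom_messages(lines):
    """Count (message text, peer address) pairs in pbs_mom log lines.

    Assumes the layout seen in the excerpt:
    timestamp;code;daemon;object type;object name;message
    """
    counts = Counter()
    for line in lines:
        # Split into at most six fields so semicolons in the message survive.
        fields = line.strip().split(";", 5)
        if len(fields) < 6:
            continue  # not a complete log record
        message = fields[5]
        # "im_eof, End of File from addr 192.168.0.254:15001"
        text, sep, addr = message.rpartition(" from addr ")
        if not sep:
            text, addr = message, ""
        counts[(text.strip(), addr.strip())] += 1
    return counts


# Three of the lines quoted in this message, each joined onto one line.
sample = [
    "08/11/2004 13:34:53;0001; pbs_mom;Svr;pbs_mom;im_eof, End of File from addr 192.168.0.254:15001",
    "08/11/2004 13:43:33;0001; pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 192.168.0.254:15001",
    "08/11/2004 14:16:03;0001; pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 192.168.0.254:15001",
]

for (text, addr), n in tally_mom_messages(sample).items():
    print(n, text, addr)
```

Run against a full log file (for example by passing an open file object as lines), the same function would show whether every error comes from the one address 192.168.0.254:15001 or from several.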
I use TORQUE-1.1.0p0 on a 64-node cluster connected via a private Gigabit
network; the server has 2 NICs. I tried the various suggestions made on this
board (moving the server entry to the first position in the pbs_mom config
file, and using only the host names of the internal network for pbs_server
and in the config files).
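Since the server has two NICs, one thing that can be checked on each node is whether the server host name used in the config files really resolves to an address on the private network. The sketch below is an illustration under stated assumptions: the 192.168.0.0/24 range is a guess based on the 192.168.0.254 address in the logs, and the host name "server-internal" and the function on_private_network are hypothetical.

```python
import ipaddress
import socket


def on_private_network(hostname, network="192.168.0.0/24",
                       resolve=socket.gethostbyname):
    """Return True if hostname resolves to an address inside network.

    network is an assumed range; adjust it to the cluster's real subnet.
    resolve can be replaced to test the logic without a name lookup.
    """
    address = ipaddress.ip_address(resolve(hostname))
    return address in ipaddress.ip_network(network)


# Demonstration with a stub resolver; on a real node, drop the resolve
# argument so the system resolver (/etc/hosts, DNS) is consulted instead.
print(on_private_network("server-internal",
                         resolve=lambda name: "192.168.0.254"))
```

If the call returns False for the name given in the pbs_mom config file, the node is resolving the server to its other interface, which would be consistent with the kind of mismatch discussed in the earlier threads.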
Any other suggestions? Has there been an "official solution" to this issue?
Thanks a lot
Thomas
------------------------------------------------------------------
Thomas Vojta phone: 573-341-4793
Assistant Professor fax: 573-341-4715
Department of Physics
University of Missouri-Rolla mailto:vojtat at umr.edu
1870 Miner Circle mailto:thomas at vojtanet.com
Rolla, MO 65409 http://www.umr.edu/~vojtat