[torqueusers] More pbs_mom communication problems
Hannu Väisänen
hvaisane at joyx.joensuu.fi
Mon Feb 28 04:00:50 MST 2005
In the server log I get:
PBS_Server;Svr;check_nodes;node xxxxx not detected in 1152 seconds, marking node down
In the node log I get:
pbs_mom;Svr;pbs_mom;No child processes (10) in is_update_stat, cannot specify protocol
pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr nnn.nnn.nnn.nnn:15001
(The address nnn.nnn.nnn.nnn in the line above is the server.)
When I run
telnet server 15001
on the node, I get "No route to host".
ssh to and from the node works.
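The telnet check can be repeated for every port that appears in the firewall rules in this post. The following is only a minimal sketch, not part of TORQUE: the function names `probe` and `probe_all` are made up for illustration, and the host name `server` is the same placeholder used in the telnet command. It tries a plain TCP connect and reports whether the port is open, refused, timed out, or unreachable ("no route to host").

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connect and return a short status string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        # EHOSTUNREACH is the errno behind telnet's "No route to host".
        if exc.errno == errno.EHOSTUNREACH:
            return "no route to host"
        return "error: %s" % exc


def probe_all(host, ports, timeout=3.0):
    """Probe several ports on one host; returns {port: status}."""
    return {port: probe(host, port, timeout) for port in ports}
```

Run on the node, `probe_all("server", (15001, 15002, 15003, 15004))` would show which of the listed TCP ports behave like 15001. It does not test the UDP rule for port 15003.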
On both the server and the node, iptables-save says
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 15001 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 15004 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 15003 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m state --state NEW -m udp --dport 15003 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 15002 -j ACCEPT
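Since iptables evaluates the rules of a chain in order and stops at the first match, it can help to look at the full iptables-save output rather than only the ACCEPT lines. The sketch below is an illustration under assumptions, not a TORQUE or iptables tool: `accepted_ports` is a hypothetical helper that reads iptables-save text, lists the ACCEPT rules with a --dport in one chain, and marks each as shadowed if an unconditional REJECT or DROP rule appears earlier in the same chain. The lines quoted above are only an excerpt, so whether such a rule exists on these machines is not known from this post.

```python
def _value_after(opts, flag):
    """Return the token following flag, or None if flag is absent."""
    if flag in opts:
        i = opts.index(flag)
        if i + 1 < len(opts):
            return opts[i + 1]
    return None


def accepted_ports(rules_text, chain="RH-Firewall-1-INPUT"):
    """List (proto, dport, shadowed) for ACCEPT rules with --dport in chain.

    shadowed is True when a rule with no match conditions and target
    REJECT or DROP was seen earlier in the same chain.
    """
    result = []
    blocked = False
    for line in rules_text.splitlines():
        tokens = line.split()
        if len(tokens) < 4 or tokens[0] != "-A" or tokens[1] != chain:
            continue
        opts = tokens[2:]
        # A rule that starts directly with "-j" has no match conditions.
        if opts[0] == "-j" and opts[1] in ("REJECT", "DROP"):
            blocked = True
            continue
        target = _value_after(opts, "-j")
        dport = _value_after(opts, "--dport")
        if target == "ACCEPT" and dport is not None:
            result.append((_value_after(opts, "-p"), dport, blocked))
    return result
```

Feeding it the saved output, for example `accepted_ports(open("rules.txt").read())` where `rules.txt` is a hypothetical file holding the iptables-save output, gives one tuple per ACCEPT rule. Any entry with shadowed set to True would never be reached by incoming packets.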
pbsnodes -a on the server says the node is down.
Any ideas how to continue?