[torqueusers] Re: More pbs_mom communication problems

Chris Samuel csamuel at vpac.org
Tue Mar 1 19:41:28 MST 2005


On Tue, 1 Mar 2005 09:28 pm, Hannu Väisänen wrote:

> It was a stupid firewall error. Probably. I have
>
> -A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
>
> but that was not the last rule in /etc/sysconfig/iptables

Aha - that would do it. Remember that rules are checked first to last in a 
chain and the first matching one wins (unless you do some tricks).
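
As a rough illustration of the ordering point, here is a toy Python model of
first-match evaluation. It is not iptables code; the rule tuples, the
first_match helper and the packet dictionary are all made up for the example,
and it ignores the "tricks" (non-terminating targets, jumps to other chains).

```python
# Toy model of first-match rule evaluation (hypothetical, not real iptables).
# Each rule is (description, predicate, verdict).

def first_match(rules, packet, default="ACCEPT"):
    """Return the verdict of the first rule whose predicate matches."""
    for _description, matches, verdict in rules:
        if matches(packet):
            return verdict
    return default  # no rule matched, fall back to the default policy

reject_all = ("reject everything", lambda pkt: True, "REJECT")
allow_15001 = ("allow tcp/15001", lambda pkt: pkt["dport"] == 15001, "ACCEPT")

packet = {"proto": "tcp", "dport": 15001}

# Catch-all REJECT listed first: the allow rule is never reached.
print(first_match([reject_all, allow_15001], packet))
# Catch-all REJECT listed last: the allow rule gets to match first.
print(first_match([allow_15001, reject_all], packet))
```

The only difference between the two calls is where the catch-all REJECT sits
in the list, which is the same situation as a REJECT rule that is not the
last rule in the file.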

This is why I like Shorewall, I find it simplifies that sort of thing.

> Now, when I run this on the node
>
> telnet server 15001
> Trying nnn.nnn.nnn.nnn...
> Connected to server.
> Escape character is '^]'.

Cool, so TCP connectivity to the server works now, so on to the rest.
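
If telnet is not installed on a node, the same check can be sketched in
Python with the standard socket module. The can_connect name and the 5 second
timeout are my own choices for illustration, nothing Torque-specific.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling can_connect("server", 15001) from the node mirrors the telnet test
above. Like telnet, it only shows that a TCP connection can be opened; it
says nothing about whether the daemon behind the port is behaving.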

> So I think now there is a route to server.
>
> Everything seemed OK for a while. pbsnodes -a listed both nodes as
> free. (I have two nodes, one on the same machine as the server and
> the other on another machine.)
>
> Then this appeared on the other machine's mom log
>
> pbs_mom;Svr;pbs_mom;No child processes (10) in is_update_stat, cannot specify protocol
> pbs_mom;Svr;pbs_mom;im_eof, Premature end of message from addr 193.167.41.152:15001
>
> It comes every 8 or 9 minutes.
>
> Now pbsnodes -a again lists the node as down.

Hmm, which version of Torque is this?

> > Doesn't sound to me like a Torque/PBS issue at all.
>
> It is probably a firewall/networks issue.
>
> So, where can I find Idiot's Guide to Firewalls and Networks? (-:

Networking:
http://www.lesbell.com.au/Home.nsf/0/ac6983ac71a95aebca256f3800170b4f?OpenDocument

Firewalls:
http://www.informit.com/articles/article.asp?p=359423

Google is your friend. :-)

> Thanks, Chris, and everybody else helping me!

Not a problem, we've all been there at some point.

cheers,
Chris
-- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
