[torquedev] problem with RM interface when RPP disabled

Lennart Karlsson Lennart.Karlsson at nsc.liu.se
Fri Dec 9 00:30:09 MST 2005


Garrick,

You wrote:
> It seems that RM client programs (basically, just momctl) aren't cleaning
> up their local priv ports when TORQUE is built with --disable-rpp.
> 
> Try to use momctl 1000 times and it will start failing when
> you run out of priv ports:
> 
> $ for a in `seq 1 1000`;do  momctl -d 0 -h hpcjr0004 >/dev/null;done
> cannot connect to MOM on node 'hpcjr0004', errno=99 (Cannot assign requested address)
> cannot connect to MOM on node 'hpcjr0004', errno=99 (Cannot assign requested address)
> cannot connect to MOM on node 'hpcjr0004', errno=99 (Cannot assign requested address)
> 
> This works fine when using RPP.  Can anyone else duplicate this?


I am not using RPP, and with version 1.2.0p6-snap.1125811484 I get:

# for a in `seq 1 1000`;do  /usr/pbs/sbin/momctl -d 0 -h n2 >/dev/null;done
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
ERROR:    query[0] 'diag0' failed on n2 (errno: 98:98)
[root@moonwatch maui]#

I can repeat this if I wait a few minutes between runs, and each run then
gives only a few (10) ERROR lines like these. But if I start a second such
for loop immediately after the first one, I get exactly 1000 ERROR lines.
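
The recovery after a few minutes would be consistent with the reserved
ports lingering in TIME_WAIT before the kernel releases them, but that is
only an assumption and has not been confirmed in this thread. A rough way
to check it is to count TIME_WAIT sockets whose local port is below 1024
right after a failing loop. The helper below is a hypothetical sketch (the
name count_priv_time_wait is made up for illustration) and assumes the
column layout printed by Linux `netstat -tan`:

```shell
# Hypothetical helper: count TCP sockets in TIME_WAIT whose local port is
# privileged (< 1024). Reads `netstat -tan`-style lines on stdin, assuming
# the columns: proto recv-q send-q local-address foreign-address state
count_priv_time_wait() {
    awk '$6 == "TIME_WAIT" {
             # The local address is host:port; the port is the last ":" field.
             n = split($4, part, ":")
             if (part[n] + 0 < 1024) count++
         }
         END { print count + 0 }'
}

# Usage (assumes the net-tools netstat command is installed):
#   netstat -tan | count_priv_time_wait
```

If the count is close to the number of reserved ports the client is allowed
to try, port exhaustion is a plausible explanation; a count near zero would
point somewhere else.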

-- Lennart Karlsson <Lennart.Karlsson at nsc.liu.se>
   National Supercomputer Centre in Linkoping, Sweden
   http://www.nsc.liu.se



