[torqueusers] momctl spurious errors

Arnau Bria arnaubria at pic.es
Wed Jun 2 03:19:53 MDT 2010


Hi all,

I'm facing this error:

# momctl  -h td478.pic.es -d1
ERROR:    query[0] 'diag1' failed on td478.pic.es (errno=0-Success: 5-Input/output error)

This happens from time to time and does not depend on the worker node (WN).

The command works on the second attempt:

# momctl  -h td478.pic.es -d1

Host: td478.pic.es/td478.pic.es   Version: 2.4.9-snap.201005191035   PID: 3980
Server[0]: pbs02.pic.es (193.109.174.37:15001)
  Init Msgs Received:     0 hellos/1 cluster-addrs
  Init Msgs Sent:         1 hellos
  Last Msg From Server:   1167 seconds (CLUSTER_ADDRS)
  Last Msg To Server:     41 seconds
[...]

Another example:

[root at pbs02 ~]# momctl  -h td045.pic.es -d1
ERROR:    query[0] 'diag1' failed on td045.pic.es (errno=0-Success: 5-Input/output error)
[root at pbs02 ~]# momctl  -h td045.pic.es -d1

Host: td045.pic.es/td045.pic.es   Version: 2.4.9-snap.201005191035   PID: 3756
Server[0]: pbs02.pic.es (193.109.174.37:1023)
  Init Msgs Received:     2 hellos/4 cluster-addrs
  Init Msgs Sent:         2 hellos
  Last Msg From Server:   32 seconds (StatusJob)
  Last Msg To Server:     12 seconds
HomeDirectory:          /var/spool/pbs/mom_priv
[...]
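
Since the second attempt succeeds, scripts that call momctl could retry it
until the problem is understood. A minimal sketch follows. retry_cmd, the
attempt count and the delay are illustrative names and values of my own, not
anything provided by TORQUE. It also assumes momctl exits non-zero when it
prints this error, which I have not verified:

```shell
#!/bin/sh
# retry_cmd ATTEMPTS DELAY COMMAND [ARGS...]
# Hypothetical helper (not part of TORQUE): run COMMAND up to ATTEMPTS times,
# sleeping DELAY seconds between tries, and stop at the first success.
# Assumes COMMAND signals failure through a non-zero exit status.
retry_cmd() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example call, reusing the flags from the commands above:
# retry_cmd 3 2 momctl -h td478.pic.es -d1
```

This only hides the symptom; it does not explain the Input/output error.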


Is this behaviour expected?

TIA,
Arnau

