[torqueusers] nodes unable to contact to server

Manoj Kumar Singh manoks at cat.ernet.in
Thu Oct 28 01:03:11 MDT 2004


Dear All

When I checked the node status with the momctl command, I got the
following output:

[root@node1 root]# momctl -d 2

Host: node1.cluster.lmd.cat.ernet.in/node1.cluster.lmd.cat.ernet.in  
Server: brahma   Version: torque_1.1.0p4
HomeDirectory:          /var/spool/PBS/mom_priv
MOM active:             8253 seconds
WARNING:  no messages received from server
Last Msg To Server:     14 seconds
WARNING:  no hello/cluster-addrs messages received from server
Init Msgs Sent:         288 hellos
LOGLEVEL:               0 (use SIGUSR1/SIGUSR2 to adjust)
JobList:                NONE

diagnostics complete


From the output above, it seems that pbs_mom on the nodes is not able to
contact the server, although the server brahma is accessible from all
nodes. The server brahma is also running pbs_server, pbs_mom and
pbs_scheduler.
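
For anyone who wants to repeat the reachability check at the TCP level
rather than with ping, here is a minimal sketch in Python. It assumes the
default TORQUE ports (15001 for pbs_server, 15002 for pbs_mom), which may
differ on a given installation, and the function name can_connect is only
illustrative.

```python
import socket

# Assumed default TORQUE ports; verify them against your own installation.
PBS_SERVER_PORT = 15001  # pbs_server
PBS_MOM_PORT = 15002     # pbs_mom


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens the socket;
        # the with-block closes it again as soon as the check is done.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling can_connect("brahma", PBS_SERVER_PORT) on node1 and
can_connect("node1", PBS_MOM_PORT) on brahma tests both directions. A
successful connection only shows that the port accepts connections; it
does not exercise the TORQUE protocol itself.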


Please help.

Thanks,

Manoj

Manoj Kumar Singh said:
>
> Dear All
>
> I have encountered a problem when I tried to get information about the
> MOM on node1, node2, ...
>
> $ momctl -h node1,node2 -d 2
> It then prints the following error:
> simpleget: End of File
> ERROR:    query[0] 'diag' failed on node1 (errno: 0:5)
> startcom: diswsi error Protocol failure in commit.
>
> How can I solve this problem?
> Any help would be appreciated.
>
> Regards
>
> Manoj
>
> M. K. Singh
> Centre for Advanced Tech
> India
>
>





