[torqueusers] nodes unable to contact to server
Manoj Kumar Singh
manoks at cat.ernet.in
Thu Oct 28 01:03:11 MDT 2004
Dear All,
When I checked the node status with the momctl command, I got the following output:
[root@node1 root]# momctl -d 2
Host: node1.cluster.lmd.cat.ernet.in/node1.cluster.lmd.cat.ernet.in
Server: brahma Version: torque_1.1.0p4
HomeDirectory: /var/spool/PBS/mom_priv
MOM active: 8253 seconds
WARNING: no messages received from server
Last Msg To Server: 14 seconds
WARNING: no hello/cluster-addrs messages received from server
Init Msgs Sent: 288 hellos
LOGLEVEL: 0 (use SIGUSR1/SIGUSR2 to adjust)
JobList: NONE
diagnostics complete
From the above output, it seems that pbs_mom on the nodes cannot contact the
server, although server brahma is accessible from all nodes. The server
brahma is also running pbs_server, pbs_mom and pbs_scheduler.
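
Being able to reach the host does not by itself show that the pbs_server port is reachable from a node. The sketch below is one rough way to check that from a compute node. It is only an illustration: the helper name can_connect is hypothetical, and it assumes pbs_server listens on the default port 15001, which may differ on a given installation.

```python
import socket

# Assumption: 15001 is the default pbs_server port; change it if the site
# uses a different port.
PBS_SERVER_PORT = 15001


def can_connect(host, port=PBS_SERVER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    can_connect is a hypothetical helper name used only for this sketch.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example, run on a node: can_connect("brahma")
```

A True result only means the port accepts TCP connections from that node. It does not show that the server recognises the node, so a name-resolution or node-list mismatch on the server side could still produce the warnings above.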
Please help.
Thanks,
Manoj
Manoj Kumar Singh said:
>
> Dear All,
>
> I encountered a problem when I tried to get information about the MOM on
> node1, node2, ...
>
> $ momctl -h node1,node2 -d 2
> It then prints the following error:
> simpleget: End of File
> ERROR: query[0] 'diag' failed on node1 (errno: 0:5)
> startcom: diswsi error Protocol failure in commit.
>
> How can I solve this problem?
> Any help would be appreciated.
>
> Regards
>
> Manoj
>
> M. K. Singh
> Centre for Advanced Tech
> India
>
>