[torquedev] RMFailure: cannot send the job to pbs_mom
jayavant patil
jayavant.patil82 at gmail.com
Thu Dec 9 22:40:20 MST 2010
Hi,
I have a cluster of 8 nodes: 2 are standalone machines and the other 6
are virtual machines hosted on those 2 standalone machines. The problem is
that pbs_server is unable to send jobs to the 6 virtual machines.
When I try to diagnose the pbs_moms from the pbs_server node, the command
momctl -h <virtual-machine_name> -d3 reports the error "query[0]
input/output error". The pbs_server log file shows "RMFailure: execution
server rejected request; can't send the job". The pbs_mom log file shows
that the connection to pbs_server timed out. However, pbsnodes -a correctly
reports the states of the pbs_mom nodes.
How can I resolve this problem?
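
For reference, here is a minimal sketch of a connectivity check that could
be run from the pbs_server node against each virtual machine. It assumes
the default TORQUE ports (15002 for the pbs_mom service, 15003 for the
pbs_mom resource-manager port that momctl queries). A site may override
these, and the function names and port labels are illustrative only.

```python
import socket
import sys

# Assumed default TORQUE ports; adjust if the site configuration overrides them.
MOM_PORTS = {"pbs_mom service": 15002, "pbs_mom resource manager": 15003}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_moms(hosts, ports=MOM_PORTS, probe=port_open):
    """Map each (host, port label) pair to whether the port was reachable."""
    return {(h, label): probe(h, p) for h in hosts for label, p in ports.items()}


if __name__ == "__main__":
    # Usage: python check_moms.py <virtual-machine_name> [more hosts...]
    for (host, label), ok in check_moms(sys.argv[1:]).items():
        print(f"{host:20s} {label:28s} {'reachable' if ok else 'UNREACHABLE'}")
```

If these ports are unreachable from the pbs_server node while pbsnodes -a
still reports node states, that would be consistent with traffic getting
through from the pbs_moms to pbs_server but not in the other direction, for
example because of a firewall or the virtual machines' network setup. This
is only one possible explanation.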
Regards,
Jayavant N. Patil