[torqueusers] stream_eof error message
Yang Wang
yang.wang at agencourt.com
Mon Aug 4 14:43:42 MDT 2008
Hello all,
I just added a new node (with 8 CPUs) to our existing cluster. After restarting pbs_server/pbs_mom/maui, jobs submitted to this new node are not executed.
When I run "pbsnodes -a", the node seems OK.
#pbsnodes -a
.
bio206.agencourt.com
     state = free
     np = 7
     ntype = cluster
     status = opsys=linux,uname=Linux bio206.agencourt.com 2.6.9-67.ELsmp #1 SMP Wed Nov 7 13:56:44 EST 2007 x86_64,sessions=419,nsessions=1,nusers=1,idletime=4830,totmem=34921672kb,availmem=34531120kb,physmem=32890064kb,ncpus=8,loadave=0.00,netload=159574755634,state=free,jobs=,varattr=,rectime=1217882344
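The status line above is hard to read by eye. As a rough sketch only, it can be split into key/value pairs so that fields such as ncpus and state are easy to compare with the np setting. The function name parse_status is hypothetical, and the code assumes that no value contains a comma, which holds for the line shown here but may not hold in general.

```python
def parse_status(status: str) -> dict:
    """Split a pbsnodes-style status string into a key/value mapping.

    Assumption: fields are separated by commas and no value contains a comma.
    """
    fields = {}
    for item in status.split(","):
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = item.partition("=")
        if sep:  # ignore fragments that have no '=' at all
            fields[key.strip()] = value.strip()
    return fields


# Shortened copy of the status line from the post.
status = "opsys=linux,ncpus=8,loadave=0.00,state=free,jobs=,rectime=1217882344"
info = parse_status(status)
print(info["ncpus"], info["state"])
```

With the full line from the post, info["ncpus"] would be the string "8", while the node is configured with np = 7.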
Here is the related part of the server log file:
08/04/2008 15:31:50;0040;PBS_Server;Req;ping_nodes;successful ping to node bio206.agencourt.com (stream 6)
08/04/2008 15:31:54;0001;PBS_Server;Svr;PBS_Server;stream_eof, connection to bio206.agencourt.com is bad, remote service may be down, message may be corrupt, or connection may have been dropped remotely (Premature end of message). setting node state to down
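To see how often this happens and for which nodes, the server log can be filtered for stream_eof entries. The following is a minimal sketch, assuming each log record is a single line with six semicolon-separated fields as in the excerpt above. The field names in LogEntry and the helper names are labels chosen for illustration, not names taken from the TORQUE documentation.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class LogEntry(NamedTuple):
    # Field names are illustrative labels for the six columns seen in the excerpt.
    timestamp: str
    event_code: str
    server: str
    obj_type: str
    obj_name: str
    message: str


def parse_log_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not have six fields."""
    # Limit the split so semicolons inside the message text are preserved.
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        return None
    return LogEntry(*parts)


def stream_eof_events(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield only the entries whose message starts with 'stream_eof'."""
    for line in lines:
        entry = parse_log_line(line)
        if entry is not None and entry.message.startswith("stream_eof"):
            yield entry


sample = [
    "08/04/2008 15:31:50;0040;PBS_Server;Req;ping_nodes;successful ping to node bio206.agencourt.com (stream 6)",
    "08/04/2008 15:31:54;0001;PBS_Server;Svr;PBS_Server;stream_eof, connection to bio206.agencourt.com is bad",
]
for event in stream_eof_events(sample):
    print(event.timestamp, event.message)
```

In practice the lines would come from the open server log file rather than a hard-coded list.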
What else do I need to do to bring this node to a usable state?
Thanks for the help.
Yang