[torqueusers] stream_eof error message

Yang Wang yang.wang at agencourt.com
Mon Aug 4 14:43:42 MDT 2008

Hello all,


I just added a new node (with 8 CPUs) to our existing cluster. After restarting pbs_server, pbs_mom, and maui, jobs submitted to this new node are not executed.

When I run "pbsnodes -a", the node seems OK.


#pbsnodes -a

     state = free
     np = 7
     ntype = cluster
     status = opsys=linux,uname=Linux bio206.agencourt.com 2.6.9-67.ELsmp #1 SMP Wed Nov 7 13:56:44 EST 2007 x86_64,sessions=419,nsessions=1,nusers=1,idletime=4830,totmem=34921672kb,availmem=34531120kb,physmem=32890064kb,ncpus=8,loadave=0.00,netload=159574755634,state=free,jobs=,varattr=,rectime=1217882344
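
The status line above is a comma-separated list of key=value pairs, so it can be split into a dictionary to compare individual fields (for example, the reported ncpus against the np value). This is only a minimal sketch: the function name parse_status is made up for illustration, and it assumes no value contains a comma, which holds for the output shown here.

```python
def parse_status(status):
    """Split a pbsnodes 'status' value into a dict of key -> value strings."""
    fields = {}
    for item in status.split(","):
        # partition keeps everything after the first '=' as the value
        key, sep, value = item.partition("=")
        if sep:
            fields[key.strip()] = value
    return fields


# Shortened copy of the status line from the output above
status = "opsys=linux,nusers=1,ncpus=8,loadave=0.00,state=free,jobs=,varattr=,rectime=1217882344"
info = parse_status(status)
print(info["ncpus"], info["state"])
```

Empty fields such as jobs= come back as empty strings rather than being dropped.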


Here is the related part of the server log file:


08/04/2008 15:31:50;0040;PBS_Server;Req;ping_nodes;successful ping to node bio206.agencourt.com (stream 6)
08/04/2008 15:31:54;0001;PBS_Server;Svr;PBS_Server;stream_eof, connection to bio206.agencourt.com is bad, remote service may be down, message may be corrupt, or connection may have been dropped remotely (Premature end of message).  setting node state to down
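
To see how often this happens and how soon after a successful ping, the log entries can be split on semicolons and filtered for stream_eof. The following is a rough sketch that assumes every entry has the six semicolon-separated fields seen in the two entries above, each on a single line; the function names and dictionary keys are invented for illustration and are not part of TORQUE.

```python
from datetime import datetime


def parse_log_line(line):
    """Parse one server log entry; return None if it does not have six fields."""
    parts = line.rstrip("\n").split(";", 5)  # the message itself may contain ';'
    if len(parts) != 6:
        return None
    stamp, code, daemon, kind, obj, message = parts
    return {
        "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
        "code": code,
        "daemon": daemon,
        "type": kind,
        "object": obj,
        "message": message,
    }


def stream_eof_events(lines):
    """Return the parsed entries whose message starts with 'stream_eof'."""
    records = (parse_log_line(line) for line in lines)
    return [r for r in records if r and r["message"].startswith("stream_eof")]


# The two entries from the log excerpt above (second message shortened)
sample = [
    "08/04/2008 15:31:50;0040;PBS_Server;Req;ping_nodes;successful ping to node bio206.agencourt.com (stream 6)",
    "08/04/2008 15:31:54;0001;PBS_Server;Svr;PBS_Server;stream_eof, connection to bio206.agencourt.com is bad",
]
for event in stream_eof_events(sample):
    print(event["time"], event["message"])
```

Entries that were wrapped across several lines, as in the original paste, would need to be rejoined before parsing.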




What else do I need to do to bring this node to a usable state?



Thanks for the help.



