[torquedev] Mom cannot read data from qsub socket

"Mgr. Šimon Tóth" SimonT at mail.muni.cz
Wed Sep 15 09:49:36 MDT 2010


>> Any ideas? I'm really lost and confused.
> 
> I'm still confused, but no longer lost :-D
> 
> The error is caused by a timeout on pbs_mom.
> 
> * pbs_mom is waiting for data from qsub
>   * qsub does not send data because it is waiting for the server
>     * the server does not respond to qsub because it is still talking
>       to the scheduler, which ran the job in the first place :)

Phew, I finally got the exact picture of what is happening and filed a bug
report. qsub hangs because the send_job process still holds the open socket.
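
For anyone who wants to see the general effect in isolation, here is a
minimal sketch. It is not TORQUE code. It only assumes the usual fork
semantics: a forked child inherits copies of the parent's descriptors, so a
connection is not really closed until every copy is closed. The names
(demo, eof_seen, reader, writer) are made up for the illustration, and a
local socket pair stands in for the real connection.

```python
import os
import select
import socket


def eof_seen(sock, timeout=1.0):
    """Return True if recv() reports end-of-stream within the timeout."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable) and sock.recv(1) == b""


def demo(child_closes_socket):
    # reader: the side waiting for the connection to finish.
    # writer: the descriptor that a forked helper process inherits.
    reader, writer = socket.socketpair()
    release_r, release_w = os.pipe()  # used to tell the child when to exit

    pid = os.fork()
    if pid == 0:
        # Child: holds its own copy of every inherited descriptor.
        reader.close()
        os.close(release_w)
        if child_closes_socket:
            writer.close()
        os.read(release_r, 1)  # block until the parent closes release_w
        os._exit(0)

    os.close(release_r)
    writer.close()  # the parent closes its copy of the socket
    result = eof_seen(reader)

    os.close(release_w)  # the child's read returns and it exits
    os.waitpid(pid, 0)
    reader.close()
    return result


if __name__ == "__main__":
    print("child keeps socket open, EOF seen:", demo(False))
    print("child closes socket, EOF seen:", demo(True))
```

When the child keeps its inherited copy open, the reader never sees
end-of-stream even though the parent has closed its side, and it waits until
the timeout. When the child closes its copy right after the fork, the reader
sees end-of-stream immediately.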

-- 
Mgr. Šimon Tóth


