[torquedev] Mom cannot read data from qsub socket
"Mgr. Šimon Tóth"
SimonT at mail.muni.cz
Wed Sep 15 09:49:36 MDT 2010
>> Any ideas? I'm really lost and confused.
>
> I'm still confused, but no longer lost :-D
>
> The error is caused by a timeout on pbs_mom.
>
> * pbs_mom is waiting for data from qsub
> * qsub does not send data because it is waiting for the server
> * the server does not respond to qsub because it is still talking
> to the scheduler, which ran the job in the first place :)
Phew, I finally got the exact picture of what is happening and filed a bug
report. qsub hangs because the send_job process still holds the open socket.
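
For anyone who wants to see the general effect in isolation, here is a
minimal sketch. It is not TORQUE code. It assumes the mechanism is the usual
one: a forked child inherits a copy of the connection's descriptor, so the
peer never sees end-of-file until the child closes its copy too. The function
name peer_sees_eof and its parameter are made up for the illustration; the
child only stands in for a helper like send_job.

```python
import os
import socket


def peer_sees_eof(child_closes_inherited_fd: bool) -> bool:
    """Return True if the reader sees EOF after the parent closes its end.

    Illustration only: 'reader' plays the waiting client, 'writer' plays the
    server side of the connection, and the forked child plays a helper
    process that inherited the server-side descriptor.
    """
    reader, writer = socket.socketpair()
    ready_r, ready_w = os.pipe()      # child -> parent: "setup finished"
    release_r, release_w = os.pipe()  # parent -> child: "you may exit"

    pid = os.fork()
    if pid == 0:
        # Child process: owns inherited copies of every descriptor.
        try:
            os.close(ready_r)
            os.close(release_w)
            reader.close()
            if child_closes_inherited_fd:
                writer.close()        # the fix: drop the inherited copy
            os.write(ready_w, b"x")   # tell the parent setup is done
            os.close(ready_w)
            os.read(release_r, 1)     # wait until the parent releases us
        finally:
            os._exit(0)

    # Parent process.
    os.close(ready_w)
    os.close(release_r)
    os.read(ready_r, 1)               # wait for the child to finish setup
    os.close(ready_r)

    writer.close()                    # the parent is done with the connection
    reader.settimeout(0.5)
    try:
        saw_eof = reader.recv(1) == b""
    except socket.timeout:
        saw_eof = False               # some process still holds the socket
    finally:
        os.close(release_w)           # let the child exit
        os.waitpid(pid, 0)
        reader.close()
    return saw_eof


if __name__ == "__main__":
    print("child closes its copy:", peer_sees_eof(True))
    print("child keeps its copy: ", peer_sees_eof(False))
```

When the child keeps its copy, the reader times out even though the parent
has closed the socket. When the child closes its copy right after the fork,
the reader gets EOF immediately.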
--
Mgr. Šimon Tóth