[torquedev] Race conditions in IM_ protocol.

"Mgr. Šimon Tóth" SimonT at mail.muni.cz
Thu Jun 10 06:58:09 MDT 2010


I have diverged from upstream a lot, so I'm not sure whether this has
already been fixed there, but I have found race conditions in the IM_ protocol.

Specifically, when IM_JOIN fails because one of the prologs returns a
non-zero value, this is what happens:

- sister: reports system error and purges the job
- master: exec_bail is run, sending IM_ABORT to all sisters
- master: exec_bail sets job into EXITING substate
- master: scan_for_exiting sends obit to server
- master: callback for the obit sets the job substate into OBIT
- sister: receives IM_ABORT, doesn't find the job (already purged)
- sister: reports error
- master: receives error for IM_ABORT and switches the job back into
  EXITING substate
- everything: fails
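
To make the ordering easier to follow, here is a rough replay of the sequence above as a toy model. This is only a sketch and not TORQUE code: the Sister and Master classes, the INITIAL substate, the job id and the "ok"/"error" replies are all made up for illustration. Only the names exec_bail, IM_ABORT, EXITING and OBIT come from the description above.

```python
from enum import Enum


class Substate(Enum):
    # EXITING and OBIT are the substates named in the message.
    # INITIAL is a hypothetical placeholder for "whatever came before".
    INITIAL = 0
    EXITING = 1
    OBIT = 2


class Sister:
    """Toy stand-in for a sister node (hypothetical, not TORQUE code)."""

    def __init__(self, job_id):
        self.jobs = {job_id}

    def join_fails(self, job_id):
        # Prolog returned non-zero: report a system error and purge the job.
        self.jobs.discard(job_id)

    def on_abort(self, job_id):
        # IM_ABORT arrives after the purge, so the job lookup fails.
        return "ok" if job_id in self.jobs else "error"


class Master:
    """Toy stand-in for the master node (hypothetical, not TORQUE code)."""

    def __init__(self):
        self.substate = Substate.INITIAL
        self.trace = []

    def _set(self, substate, why):
        self.substate = substate
        self.trace.append((why, substate.name))

    def exec_bail(self):
        # In the real sequence this also sends IM_ABORT to all sisters.
        self._set(Substate.EXITING, "exec_bail")

    def obit_callback(self):
        # Runs after scan_for_exiting has sent the obit to the server.
        self._set(Substate.OBIT, "obit callback")

    def on_abort_reply(self, reply):
        # The late error reply pushes the job back to EXITING.
        if reply == "error":
            self._set(Substate.EXITING, "IM_ABORT error")


def replay():
    job = "job.1"  # hypothetical job id
    sister, master = Sister(job), Master()
    sister.join_fails(job)        # sister purges the job
    master.exec_bail()            # IM_ABORT is now in flight
    master.obit_callback()        # obit handled before the abort reply
    reply = sister.on_abort(job)  # job already purged
    master.on_abort_reply(reply)  # substate regresses
    return master.trace


if __name__ == "__main__":
    for why, state in replay():
        print(f"{why:>15} -> {state}")
```

The point of the model is the last transition: the obit callback has already moved the job to OBIT when the error reply for IM_ABORT moves it back to EXITING.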

-- 
Mgr. Šimon Tóth
