[torquedev] Race conditions in IM_ protocol.
"Mgr. Šimon Tóth"
SimonT at mail.muni.cz
Thu Jun 10 06:58:09 MDT 2010
My tree has diverged a lot from upstream, so I'm not sure whether this
has already been fixed, but I have found race conditions in the IM_ protocol.
Specifically, when IM_JOIN fails because one of the prologs returns a
non-zero value, this is what happens:
- sister: reports system error and purges the job
- master: exec_bail is run, sending IM_ABORT to all sisters
- master: exec_bail sets job into EXITING substate
- master: scan_for_exiting sends obit to server
- master: callback for the obit sets the job substate into OBIT
- sister: receives IM_ABORT, doesn't find the job (already purged)
- sister: reports error
- master: receives error for IM_ABORT and switches the job into EXITING
- everything: fails
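
The sequence above can be replayed as a toy model. This is only a sketch of the ordering problem, not TORQUE code: the Substate enum, MasterJob class, replay function, and the job id "job1" are all hypothetical names, and the "guarded" variant is just one possible way to ignore a late IM_ABORT error, not a fix proposed in this message.

```python
from enum import Enum


class Substate(Enum):
    # Hypothetical, simplified substates; ordering reflects job progress.
    PRERUN = 1
    EXITING = 2
    OBIT = 3


class MasterJob:
    """Toy stand-in for the job record held by the master (mother superior)."""

    def __init__(self):
        self.substate = Substate.PRERUN
        self.history = [self.substate]

    def set_substate(self, new):
        self.substate = new
        self.history.append(new)


def replay(guarded):
    """Replay the events from the list above, in the same order.

    If guarded is True, the master ignores an IM_ABORT error reply once the
    job is already in EXITING or later (an assumed mitigation).
    """
    job = MasterJob()
    sister_jobs = {"job1"}

    # sister: prolog fails, error is reported, job is purged
    sister_jobs.discard("job1")

    # master: exec_bail sends IM_ABORT (still in flight) and sets EXITING
    job.set_substate(Substate.EXITING)

    # master: scan_for_exiting sends the obit; the callback sets OBIT
    job.set_substate(Substate.OBIT)

    # sister: IM_ABORT arrives, the job is already gone, so an error is sent
    abort_reply_is_error = "job1" not in sister_jobs

    # master: handles the error reply for IM_ABORT
    if abort_reply_is_error:
        already_exiting = job.substate.value >= Substate.EXITING.value
        if not (guarded and already_exiting):
            # This is the problematic step: the job moves backwards.
            job.set_substate(Substate.EXITING)

    return job.history


if __name__ == "__main__":
    print([s.name for s in replay(guarded=False)])
    print([s.name for s in replay(guarded=True)])
```

In the unguarded run the job ends in EXITING after it has already reached OBIT, which matches the last master step in the list; in the guarded run the late error reply is dropped and the job stays in OBIT.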
Mgr. Šimon Tóth