[torqueusers] Problem with excessive and incorrect "Begun execution" mails

Åke Sandgren ake.sandgren at hpc2n.umu.se
Wed Dec 21 06:48:49 MST 2005


We have been having lots of problems with excessive MAIL_BEGIN mails
being sent to users.

We have a prolog script that verifies that there is enough free space on
a certain filesystem before allowing jobs to actually start.
If there isn't, the prolog script does an exit 3 to requeue the job.
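
For readers who want to reproduce the setup, here is a minimal sketch of
such a free-space check. It is not our actual prolog script: the threshold
and the function names are made up for illustration, and the only thing
taken from the description above is that an exit status of 3 requeues
the job.

```python
import shutil

# From the description above: "exit 3" makes the job get requeued.
REQUEUE_EXIT_CODE = 3
# Hypothetical threshold (10 GiB); pick whatever your site needs.
MIN_FREE_BYTES = 10 * 1024**3


def prolog_exit_code(free_bytes, min_free_bytes=MIN_FREE_BYTES):
    """Return 0 to let the job start, or 3 to ask for a requeue."""
    if free_bytes < min_free_bytes:
        return REQUEUE_EXIT_CODE
    return 0


def check_filesystem(path, min_free_bytes=MIN_FREE_BYTES):
    """Measure free space on the filesystem holding `path` and map it
    to a prolog exit status."""
    free_bytes = shutil.disk_usage(path).free
    return prolog_exit_code(free_bytes, min_free_bytes)
```

A prolog built on this would end with something like
sys.exit(check_filesystem("/some/filesystem")), where the path is a
placeholder for the filesystem being guarded.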

This has been causing multiple MAIL_BEGIN mails to be sent for the
same jobid, annoying users a lot, since the server sends the MAIL_BEGIN
mail before verifying that the mom has actually started the job.

The attached patch is a first attempt at remedying this: it delays
sending the MAIL_BEGIN until after the server has gotten the session id
back from the mom. A quick test on 2.0.0p4 showed that it worked for my
test case: with "if user is me then exit 3" in the mom prolog, the
excessive MAIL_BEGIN mails stopped, and I got a single correct
MAIL_BEGIN once the if-statement was removed.
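
The test case can be sketched roughly as follows. This is an illustration,
not the script I ran. It assumes that the mom passes the job owner's user
name to the prolog as its second argument (argv[2]); check the prologue
documentation for your TORQUE version before relying on that. The function
name and the example user are hypothetical.

```python
# From the description above: "exit 3" makes the job get requeued.
REQUEUE_EXIT_CODE = 3


def exit_code_for_user(argv, test_user):
    """Requeue every job owned by `test_user`; let all others start.

    Assumption: argv[2] holds the job owner's user name.
    """
    job_user = argv[2] if len(argv) > 2 else ""
    if job_user == test_user:
        return REQUEUE_EXIT_CODE
    return 0
```

Called as sys.exit(exit_code_for_user(sys.argv, "myuser")) at the end of
the prolog, this requeues each of the test user's jobs on every start
attempt, which is the situation that triggered the repeated MAIL_BEGIN
mails.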

Please take a look and comment.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: move_mail_begin.patch
Type: text/x-patch
Size: 1581 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torqueusers/attachments/20051221/e456c697/move_mail_begin.bin
