[torqueusers] epilogue script runs twice

Axel Kohlmeyer akohlmey at cmm.chem.upenn.edu
Tue Feb 2 21:49:27 MST 2010

On Tue, Feb 2, 2010 at 10:45 PM, Jeremy Enos <jenos at ncsa.uiuc.edu> wrote:
> I too have been extraordinarily aggravated by this inconsistent behavior.
> How can this not be a bug?  There are any number of reasons multiple
> epilogue calls can cause problems.  If there are any legitimate reasons that
> multiple epilogues should be called, then those instances should create the
> workaround- not the other way around.
> In my case, I not only make database entries within epilogue but also do
> operations on hardware devices (GPUs) that fail if run over the top of one
> another- this ends up causing a cascading failure when it occurs (as it
> should).  I need a way to prevent multiple epilogue scripts from running, or
> a bug fix.  Can there be consensus that this is a bug, or am I missing
> something?  (perfectly possible)

how about using a .pid lockfile? before you do a "sensitive" operation,
create a lock file containing the value of $$, do the critical stuff, and
then delete the file. wrap this code in a function that, if the lock file
already exists, tests whether the pid recorded in it still exists. if not,
the lock is stale: ignore/delete it. if yes, wait with sleep 1 until the
file is gone or some large number of seconds has passed.
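
a rough sketch of such a wrapper in plain sh. the lock file path, the
MAX_WAIT limit and the function names are made up for illustration. one
deviation from the description above: the file is created with noclobber
set, so that "test whether it exists" and "create it" happen in one step.
it also assumes the epilogue runs as root, so that kill -0 can probe any pid.

```shell
#!/bin/sh
# Hypothetical lock file path; use node-local storage, not a shared filesystem.
LOCKFILE="${LOCKFILE:-/tmp/epilogue.pid}"
# Hypothetical upper bound on how long to wait for the lock, in seconds.
MAX_WAIT="${MAX_WAIT:-300}"

acquire_lock() {
    waited=0
    while :; do
        # With noclobber set, the redirect fails if the file already exists.
        # $$ is still the pid of the main script inside the subshell.
        if ( set -o noclobber; echo "$$" > "$LOCKFILE" ) 2>/dev/null; then
            return 0
        fi
        owner=$(cat "$LOCKFILE" 2>/dev/null)
        # Stale lock: the recorded pid no longer exists, so remove and retry.
        if [ -n "$owner" ] && ! kill -0 "$owner" 2>/dev/null; then
            rm -f "$LOCKFILE"
            continue
        fi
        # Lock is held by a live process (or is still being written): wait.
        [ "$waited" -ge "$MAX_WAIT" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

release_lock() {
    rm -f "$LOCKFILE"
}
```

in the epilogue this would be used roughly as "acquire_lock || exit 1",
then the critical section, then "release_lock". removing a stale lock is
not fully race-free if two epilogues notice it at the same moment, and a
pid can be reused by an unrelated process, so treat this as a starting
point rather than a guarantee.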

as was said before, the epilogue script should assume nothing
about what has been done before and whether it has completed,
but rather make sure that the system gets back to a defined state.
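
to illustrate what "assume nothing" can look like in practice, here is a
small sketch in which every step succeeds regardless of what an earlier or
interrupted run has already done. the function name and both directory
arguments are hypothetical, not something torque provides.

```shell
#!/bin/sh
# Bring the node back to a known state. Safe to call any number of times.
restore_defined_state() {
    job_scratch=$1   # hypothetical per-job scratch directory to remove
    shared_tmp=$2    # hypothetical directory that must exist afterwards

    # rm -rf succeeds whether or not the directory is still there.
    rm -rf -- "$job_scratch" || return 1
    # mkdir -p succeeds whether or not the directory already exists.
    mkdir -p -- "$shared_tmp" || return 1
}
```

calling it a second time changes nothing and still returns success, which
is the property you want from each step of the epilogue.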

