Bug 217 - Multiple job array dependencies are not honored
Status: NEW
Product: TORQUE
Version: 3.0.x
Platform: All Linux
Importance: P5 major
Assigned To: David Beer
Reported: 2012-09-20 14:32 MDT by Tom
Modified: 2012-09-20 14:32 MDT

Description Tom 2012-09-20 14:32:28 MDT
I've boiled down a significant job array dependency bug to a simple test case.
The problem is that job array dependencies are not honored when a job has
more than one array dependency.

The script below creates two small sleep jobs of different lengths and submits
each as an array. A final wrap-up job is submitted with a dependency on both
job arrays finishing successfully. However, the dependent job does not wait
for both arrays to exit successfully; it starts early, as soon as the first
array finishes:

   sleep30_array           sleep15_array
        |                       |
        |                       |
        |                       -
        |                     done
        |                dep job starts
        |                    *bug*
dep job should start here

cat << EOF > sleeper15
sleep 15
EOF
cat << EOF > sleeper30
sleep 30
EOF

JOB1=`cat sleeper30 | qsub -t 1-2`
JOB2=`cat sleeper15 | qsub -t 1-2`
echo "cat sleeper15 | qsub -W depend=afterokarray:${JOB1}:${JOB2}"
cat sleeper15 | qsub -W depend=afterokarray:${JOB1}:${JOB2}

Running the script above gives something like the following:

cat sleeper15 | qsub -W depend=afterokarray:55349[].madrid:55350[].madrid

-- after 20 seconds --

> qstat
55349[].madrid             STDIN            tasbu                  0 R batch
55350[].madrid             STDIN            tasbu                  0 C batch
55351.madrid               STDIN            tasbu                  0 R batch

55351.madrid should not be running!

Torque version: 3.0.5

A solution to this problem is crucial to our pipeline and I would
appreciate any fixes / workarounds.
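
One variant that may be worth testing is to give each array its own
afterokarray clause, separated by commas, instead of chaining both IDs
after a single afterokarray. The qsub dependency list accepts several
comma-separated type:argument clauses, but whether this form avoids the
problem on 3.0.5 is an assumption and has not been verified. The helper
name build_depend below is hypothetical and only builds the string:

```shell
#!/bin/sh
# Hypothetical helper (not part of TORQUE): emit one afterokarray clause
# per array job ID, joined with commas.
build_depend() {
    dep=""
    for id in "$@"; do
        # Prepend "<previous clauses>," only when dep is already non-empty.
        dep="${dep:+$dep,}afterokarray:$id"
    done
    printf '%s\n' "$dep"
}

# Example using the job IDs from the qstat output above.
build_depend '55349[].madrid' '55350[].madrid'

# Intended use in the test script (untested against 3.0.5):
#   cat sleeper15 | qsub -W depend="$(build_depend "$JOB1" "$JOB2")"
```

For the two arrays above this prints
afterokarray:55349[].madrid,afterokarray:55350[].madrid.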

Thanks -