[torquedev] [Bug 217] New: Multiple job array dependencies are not honored
bugzilla-daemon at supercluster.org
Thu Sep 20 14:32:29 MDT 2012
http://www.clusterresources.com/bugzilla/show_bug.cgi?id=217
Summary: Multiple job array dependencies are not honored
Product: TORQUE
Version: 3.0.x
Platform: All
OS/Version: Linux
Status: NEW
Severity: major
Priority: P5
Component: pbs_server
AssignedTo: dbeer at adaptivecomputing.com
ReportedBy: tasbu at aol.com
CC: torquedev at supercluster.org
Estimated Hours: 0.0
I've boiled down a significant job array dependency bug to a simple test
script.
The problem is that job array dependencies are not honored when a job has
more than one array dependency.
The script below creates two small sleep jobs of different lengths and submits
each as an array. A final wrap-up job is then submitted that depends on both
job arrays finishing successfully. However, the dependent job does not
wait for both arrays to exit successfully before starting. Instead,
it starts early, as soon as the first array finishes:
  sleep30_array      sleep15_array
        |                  |
        |                  |
        |                  -
        |                 done
        |            dep job starts
        |                *bug*
        |
       ---
       done
  dep job should start here
--------SCRIPT_START------
cat << EOF > sleeper15
sleep 15
EOF
cat << EOF > sleeper30
sleep 30
EOF
JOB1=`cat sleeper30 | qsub -t 1-2`
JOB2=`cat sleeper15 | qsub -t 1-2`
echo "cat sleeper15 | qsub -W depend=afterokarray:${JOB1}:${JOB2}"
cat sleeper15 | qsub -W depend=afterokarray:${JOB1}:${JOB2}
--------SCRIPT_END------
Running the script above will give something like the following:
> sh SCRIPT
cat sleeper15 | qsub -W depend=afterokarray:55349[].madrid:55350[].madrid
55351.madrid
-- after 20 seconds --
> qstat
55349[].madrid STDIN tasbu 0 R batch
55350[].madrid STDIN tasbu 0 C batch
55351.madrid STDIN tasbu 0 R batch
55351.madrid should not be running!
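The early start can also be checked mechanically. The following is a minimal sketch, assuming qstat prints the columns shown above (job id, name, user, time used, state, queue). The function name dep_started_early is hypothetical and not part of TORQUE; it reads qstat output on stdin and reports any listed array that is not yet in state C while the dependent job is already in state R.

```shell
# Hypothetical helper, not a TORQUE command.
# Usage: qstat | dep_started_early DEP_JOB_ID ARRAY_ID [ARRAY_ID...]
# Assumes column 1 is the job id and column 5 is the job state,
# matching the qstat output shown in this report.
dep_started_early() {
    dep_job="$1"
    shift
    awk -v dep="$dep_job" -v arrays="$*" '
        BEGIN {
            # Build a lookup table of the array ids the job depends on.
            n = split(arrays, a, " ")
            for (i = 1; i <= n; i++) want[a[i]] = 1
        }
        # The dependent job is already running.
        $1 == dep && $5 == "R" { dep_running = 1 }
        # A prerequisite array has not completed yet.
        ($1 in want) && $5 != "C" { pending = pending " " $1 }
        END {
            if (dep_running && pending != "") {
                print "started early; unfinished:" pending
                exit 1
            }
        }
    '
}
```

With the job ids from the run above, it could be called as:

    qstat | dep_started_early '55351.madrid' '55349[].madrid' '55350[].madrid'

A non-zero exit status indicates that the dependent job is running before all of its prerequisite arrays have completed.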
Torque version: 3.0.5
A solution to this problem is crucial to our pipeline and I would
appreciate any fixes / workarounds.
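One variant that may be worth trying is to give each array its own dependency entry, separated by commas, instead of listing both arrays after a single afterokarray. The qsub dependency list syntax allows several comma-separated type:argument entries, but whether this form avoids the bug on 3.0.5 is an untested assumption. The helper name build_depend below is hypothetical; it only assembles the string.

```shell
# Hypothetical helper, not a TORQUE command.
# Usage: build_depend TYPE JOB_ID [JOB_ID...]
# Prints one "TYPE:JOB_ID" entry per job id, joined with commas.
# Assumption: a comma-separated list is handled differently from the
# colon-separated form in the script above; this has not been verified.
build_depend() {
    dep_type="$1"
    shift
    list=""
    for id in "$@"; do
        # Prefix a comma only when the list already has an entry.
        list="${list:+$list,}${dep_type}:${id}"
    done
    printf '%s\n' "$list"
}
```

In the test script, the final submission would then become:

    cat sleeper15 | qsub -W depend="$(build_depend afterokarray "$JOB1" "$JOB2")"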
Thanks -
Tom