[torqueusers] Job dependency problem
Troy Baer
tbaer at utk.edu
Thu Jul 9 15:52:54 MDT 2009
Hello all,
A summer intern and I have been working on a tool to automate generating
graph-based job dependency chains (a la condor_submit_dag), and we've
run into an interesting problem where setting a dependency sometimes
doesn't seem to have any effect:
-----
$ qstat -r
verne.nics.utk.edu:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
2879.verne.nics.     troy     analysis postproc_hr6      13058     1  --     -- 01:00 R    --
2880.verne.nics.     troy     hpss     archive           14980    --  --     -- 01:00 R    --
$ qstat -f 2879 2880
Job Id: 2879.verne.nics.utk.edu
    ...
    depend = afterok:2874.verne.nics.utk.edu@verne.nics.utk.edu,
        beforeok:2880.verne.nics.utk.edu@verne.nics.utk.edu
    ...
    submit_args = -N postproc_hr6 -W depend=afterok:2874.verne.nics.utk.edu -v
        day=20090531,hr=0600 postproc.pbs
    ...
Job Id: 2880.verne.nics.utk.edu
    ...
    depend = afterok:2875.verne.nics.utk.edu@verne.nics.utk.edu:2876.verne.nic
        s.utk.edu@verne.nics.utk.edu:2877.verne.nics.utk.edu@verne.nics.utk.ed
        u:2878.verne.nics.utk.edu@verne.nics.utk.edu
    ...
    submit_args = -N archive -W depend=afterok:2875.verne.nics.utk.edu:2876.ve
        rne.nics.utk.edu:2877.verne.nics.utk.edu:2878.verne.nics.utk.edu:2879.
        verne.nics.utk.edu -v day=20090531 archive.pbs
    ...
-----
Job 2880 was submitted with an afterok dependency on 2879 (among other
jobs), but that has somehow been translated into 2879 having a beforeok
dependency on 2880, which wasn't in the original submission and doesn't
seem to have any effect: 2879 is missing from 2880's depend attribute,
and both jobs are running at the same time.
Any ideas on what might be causing this?
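For anyone who wants to look for the same symptom on their own system,
here is a minimal Python sketch of the comparison described above: it
takes a job's submit_args and depend values and reports any dependency
that was requested at submission but is not recorded on the job. The
function names (parse_depend, missing_dependencies) are hypothetical and
not part of TORQUE. The sketch assumes the wrapped qstat -f lines have
already been rejoined and that each recorded job ID carries an "@server"
suffix; it only handles the "type:jobid[:jobid...]" form seen here.

```python
import re


def parse_depend(spec):
    """Parse a depend string into a set of (type, job_id) pairs.

    Assumption: clauses are comma-separated and look like
    "type:jobid[:jobid...]". Any "@server" suffix on a job ID is dropped.
    """
    pairs = set()
    for clause in spec.split(","):
        dep_type, _, jobs = clause.strip().partition(":")
        for job in jobs.split(":"):
            job_id = job.split("@", 1)[0].strip()
            if job_id:
                pairs.add((dep_type, job_id))
    return pairs


def missing_dependencies(submit_args, depend_attr):
    """Return requested (type, job_id) pairs absent from depend_attr."""
    match = re.search(r"depend=(\S+)", submit_args)
    requested = parse_depend(match.group(1)) if match else set()
    return sorted(requested - parse_depend(depend_attr))


# Values for job 2880 from the qstat -f output, with line wraps rejoined.
submit_args = ("-N archive -W depend=afterok:2875.verne.nics.utk.edu:"
               "2876.verne.nics.utk.edu:2877.verne.nics.utk.edu:"
               "2878.verne.nics.utk.edu:2879.verne.nics.utk.edu "
               "-v day=20090531 archive.pbs")
depend_attr = ("afterok:2875.verne.nics.utk.edu@verne.nics.utk.edu:"
               "2876.verne.nics.utk.edu@verne.nics.utk.edu:"
               "2877.verne.nics.utk.edu@verne.nics.utk.edu:"
               "2878.verne.nics.utk.edu@verne.nics.utk.edu")

print(missing_dependencies(submit_args, depend_attr))
# prints [('afterok', '2879.verne.nics.utk.edu')]
```

Run against job 2880, this flags only the afterok dependency on 2879,
which matches what the qstat -f output shows.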
Thanks,
--Troy
--
Troy Baer, HPC System Administrator
National Institute for Computational Sciences, University of Tennessee
http://www.nics.tennessee.edu/
Phone: 865-241-4233