[torqueusers] Dependencies being ignored from some submit hosts.

Garrick Staples garrick at usc.edu
Wed Feb 20 15:29:42 MST 2008


On Wed, Feb 20, 2008 at 03:11:46PM -0700, John Hanks alleged:
> Hello,
> 
> I have a test setup, torque 2.2.1 and moab 5.2.1 running on a host,
> call it hostA, and a submit host called submitA which only has the
> torque clients (qsub, qstat, etc.).  I can successfully submit jobs
> from submitA to hostA with qsub, but get odd behavior when using -W
> depend=afterany:JOBID. For example
> 
> as a user on hostA I can do
> 
> $ qsub job.sh
> hostA.165
> $ qsub -W depend=afterany:165 job.sh
> hostA.166
> 
> Then look at job 166 with checkjob and see it correctly handles the dependency:
> 
> NOTE:  job cannot run  (job has hold in place)
> NOTE:  job cannot run  (dependency 165 jobsuccessfulcomplete not met)
> BLOCK MSG: non-idle state 'Hold' (recorded at last scheduling iteration)
> 
> however, if I do the same thing from submitA
> 
> $ qsub job.sh
> hostA.167
> $ qsub -W depend=afterany:167 job.sh
> hostA.168
> 
> Then look at the job with checkjob it says:
> 
> NOTE:  job cannot run  (job has hold in place)
> BLOCK MSG: non-idle state 'Hold' (recorded at last scheduling iteration)
> 
> and treats this as a hold, so that the job never runs until I do a
> manual releasehold to release the hold.
> 
> I have server_name on both hostA and submitA set to point to hostA and
> torque has
> 
> set server submit_hosts = submitA
> 
> in its configuration. What do I need to do to have dependencies
> handled correctly from any submit host?

'checkjob' is a maui program and doesn't really say what is going on within torque.

Does 'qstat -f' show that the deps are correctly set up within torque?
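As a rough sketch of that check: the helper below filters 'qstat -f' output down to its dependency line. The function name show_depend is made up for illustration, and it assumes the attribute is printed as "depend = ..." on a single line (long values that qstat wraps onto continuation lines are not rejoined).

```shell
# Hypothetical helper: print the "depend" attribute from `qstat -f`
# output read on stdin. Assumes the attribute appears as "depend = ..."
# on one line. Returns non-zero if no such line is present.
show_depend() {
    awk '
        /^[[:space:]]*depend = / {
            found = 1
            sub(/^[[:space:]]*/, "")   # strip leading indentation
            print
            next
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a live system (job IDs taken from the example above):
#   qstat -f 166 | show_depend    # job submitted from hostA
#   qstat -f 168 | show_depend    # job submitted from submitA
```

If the second command prints nothing while the first shows an afterany entry, that would suggest the dependency never reached torque for the job submitted from submitA.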

-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html