[torqueusers] procct and held jobs

Gareth.Williams at csiro.au
Mon Oct 10 04:07:50 MDT 2011


Hi All,

We recently updated TORQUE from 3.0.2 to 3.0.3-snap.201108261653 and have found that, at least in some cases, a job submitted to a routing queue with a hold (using qsub -a so that it runs after a given time) still does not run once it is released and moves to an execution queue, because Moab 6.0.2 sees a procct GRES. qstat -f shows a procct resource only while the job is held and in the routing queue.

Does anyone else with a recent TORQUE version see this problem? You can test with:
echo sleep 300 | qsub -a `date -d 'now + 5 minutes' +'%Y%m%d%H%M'`

The job should be held for 5 minutes, then run and sleep for 5 minutes.
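To check whether the procct resource is present at a given moment, a small helper along these lines may be useful. It is only a sketch: the function name has_procct is made up for illustration, and it assumes that qstat -f prints the resource on a line containing "Resource_List.procct", which may differ between versions.

```shell
# has_procct: hypothetical helper, not part of TORQUE.
# Reads `qstat -f` output on stdin and succeeds (exit status 0) if a
# procct resource line is present. The "Resource_List.procct" pattern
# is an assumption about how qstat -f prints the resource.
has_procct() {
    grep -q 'Resource_List\.procct'
}

# Possible usage, with the job id that qsub printed at submission:
#   qstat -f "$jobid" | has_procct && echo "procct present"
```

Running the check once while the job is held and again after it has moved to the execution queue should show whether the resource disappears as described above.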

Gareth

For reference, I've worked around the issue by defining in Moab a GLOBAL GRES called procct with a large count. The same technique would probably work with Maui.
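As a rough illustration of that workaround, the moab.cfg entry might look something like the following. This is a sketch rather than the exact line in use: the count of 10000 is an arbitrary placeholder, and the syntax assumes the usual NODECFG[GLOBAL] form for global generic resources.

```
# moab.cfg fragment (illustrative; the count is an assumed placeholder)
# Define a floating, cluster-wide generic resource named procct with
# enough units that a job requesting it is never blocked.
NODECFG[GLOBAL] GRES=procct:10000
```

Moab would need to re-read its configuration for the change to take effect.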

