[torqueusers] Job array throttling
glen.beane at gmail.com
Thu Jul 10 08:19:27 MDT 2008
On Thu, Jul 10, 2008 at 10:06 AM, Gabe Turner <gabe at msi.umn.edu> wrote:
> We're starting to play with job arrays, as we have some actual requests for
> them from former SGE users whom we've recently adopted. Submitting and
> deleting them is working fine (in 2.3.0), but there seems to be no way that
> I can tell to throttle them. Basically, I would like to limit the size of
> any given job array to a static number of jobs. We have a policy, as
> unwise and politically-motivated as it may be, dictating that no user can
> submit more than 10 jobs. Currently we throttle this using a submit filter.
> What our submit filter does is just parse qstat output to determine the
> number of jobs a user has submitted (crude, but it works) and it returns an
> error when they try to submit jobs beyond 10. The problem is that a user
> would be able to circumvent this by submitting a job array containing more
> than 10 jobs, as it seems the submit filter is only executed once, even
> when submitting a job array. Fortunately, once this 10+ job array has been
> submitted, our filter then prevents further submissions for that user.
> We can work support for detecting '#PBS -t ...' into our submit filter, but
> I don't believe that will work if -t is passed on the qsub command line.
> I've considered the implementation of a qsub wrapper, but frankly it's just
> too easy to circumvent.
> Anyone want to brainstorm with me about this? Any plans by those Torque
> developers working on job arrays to provide throttling policies for job
> array size?
> Any help would be greatly appreciated!
I can easily add a max_array_size qmgr parameter that would cause
pbs_server to reject a submission for a job array larger than that. You
would have to wait until 2.4.0 is stable and ready, since we don't want
anything but bug fixes to go into the 2.3 branch from now on.
I think a submit filter *should* also have access to anything passed on the
command line. That doesn't mean the current implementation actually does
this, but I would consider it a bug if it does not.
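In the meantime, here is a rough sketch of the array-size check a submit
filter could do. It is not tested against a real qsub. It assumes the filter
gets the qsub command-line arguments as its own arguments and the job script
as text, and that a -t request is a comma-separated list of integer IDs and
ID ranges such as "0-99" or "1,3,5-7". If your version treats a bare integer
as a job count rather than a single ID, adjust array_size() accordingly. The
names array_size, requested_array_spec, and check_submission are made up for
this example; MAX_JOBS is the 10-job policy from your message.

```python
import re

MAX_JOBS = 10  # site policy from the thread; adjust as needed


def array_size(spec):
    """Count distinct task IDs in a -t request like '0-99' or '1,3,5-7'.

    Assumption: a bare integer names one task ID, not a job count.
    """
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            ids.update(range(int(low), int(high) + 1))
        else:
            ids.add(int(part))
    return len(ids)


def requested_array_spec(argv, script_text):
    """Return the -t value, or None if no array was requested.

    The command line is checked first, on the assumption that it takes
    precedence over '#PBS' directives in the script.
    """
    for i, arg in enumerate(argv):
        if arg == "-t" and i + 1 < len(argv):
            return argv[i + 1]          # form: -t 0-99
        if arg.startswith("-t") and len(arg) > 2:
            return arg[2:]              # form: -t0-99
    for line in script_text.splitlines():
        match = re.match(r"#PBS\s+(?:.*\s)?-t\s*(\S+)", line)
        if match:
            return match.group(1)
    return None


def check_submission(argv, script_text, limit=MAX_JOBS):
    """Return an error message if the array is too large, else None."""
    spec = requested_array_spec(argv, script_text)
    if spec is None:
        return None
    size = array_size(spec)
    if size > limit:
        return "job array of %d jobs exceeds the limit of %d" % (size, limit)
    return None
```

A real filter would call check_submission(sys.argv[1:], sys.stdin.read()),
print the message to stderr and exit non-zero on an error, and otherwise
write the script back to stdout unchanged. This only checks the array size;
the count of jobs the user already has queued would still come from your
existing qstat parsing.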