[torqueusers] limit the number of jobs a user can submit

Gareth.Williams at csiro.au Gareth.Williams at csiro.au
Mon Oct 3 05:08:28 MDT 2011

> -----Original Message-----
> From: Martin Siegert [mailto:siegert at sfu.ca]
> Sent: Saturday, 1 October 2011 4:44 AM
> To: Torque Users Mailing List
> Subject: [torqueusers] limit the number of jobs a user can submit
> Hi,
> I know this has been discussed before, but I believe an important
> aspect has been overlooked:
> Moab has a limit on the number of jobs it can handle: the MAXJOB
> parameter:
> "Specifies the maximum number of simultaneous jobs which can be
> evaluated by the scheduler. If additional jobs are submitted to the
> resource manager, Moab will ignore these jobs until previously
> submitted jobs complete."
> This allows for a trivial denial-of-service attack:
> Simply submit a job array with at least MAXJOB+1 elements.
> After that moab will disregard all further jobs for scheduling
> even if they have a much higher priority than the array job elements.
> I have not yet found a way of preventing this DoS attack.
> The most logical solution to me would be to expand the
> "max_user_queuable" specification to allow for a server wide
> setting, not just a per queue setting, i.e.,
> set server max_user_queuable = 1000
> Is that a feasible solution?
> (and, yes, I'd like this limit to be in torque and not in moab because
> the user will get an immediate response from qsub).
> Cheers,
> Martin
> --
> Martin Siegert
> Simon Fraser University

Hi Martin,

We were bitten by this last week (for the first time ever, as far as we know) and increased MAXJOB. I think using a combination of routing queues and execution queues with max_user_queuable should work. That way a user can only deny service to themselves. This solution is advocated here:
http://www.clusterresources.com/pipermail/torqueusers/2007-August/005922.html
A recent query has more detail but unfortunately went unanswered:
http://www.clusterresources.com/pipermail/torqueusers/2011-September/013339.html
I'd like to try this setup but don't want the dependency problems.

Perhaps it can work for you if you increase MAXJOB (say to 40k) and set max_user_queuable modestly (say 1000). With those numbers you wouldn't hit problems until 40 users had each submitted 1000 jobs, assuming they don't use multiple queues.
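
As a rough check of that arithmetic, here is a minimal Python sketch. The function name and its parameters are made up for illustration and are not part of Torque or Moab. It assumes every user fills each queue they use up to the per-queue limit, and that trouble starts once the combined job count reaches MAXJOB.

```python
def users_needed_to_fill(maxjob, per_user_limit, queues_per_user=1):
    """Smallest number of users whose combined jobs reach maxjob.

    Hypothetical helper: assumes each user submits per_user_limit jobs
    to each of queues_per_user queues (the per-queue limit applies
    separately in every queue).
    """
    if min(maxjob, per_user_limit, queues_per_user) < 1:
        raise ValueError("all arguments must be positive integers")
    jobs_per_user = per_user_limit * queues_per_user
    # Integer ceiling division: round up to a whole number of users.
    return (maxjob + jobs_per_user - 1) // jobs_per_user


# The numbers suggested above: MAXJOB = 40k, per-user limit = 1000.
print(users_needed_to_fill(40_000, 1_000))

# The same limits if each user spreads jobs over 4 queues.
print(users_needed_to_fill(40_000, 1_000, queues_per_user=4))
```

The second call shows why the "multiple queues" caveat matters: a per-queue limit multiplies with the number of queues a user can submit to, so fewer users are needed to fill the scheduler.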



Note: if there were a server-wide max_user_queuable, that would imply that jobs could not go into routing queues as well as execution queues.
