[torqueusers] Re: [Mauiusers] potential of misuse with interactive jobs

Garrick Staples garrick at clusterresources.com
Thu Oct 19 12:59:54 MDT 2006


On Thu, Oct 19, 2006 at 02:14:40PM -0400, Neelesh Arora alleged:
> Garrick Staples wrote:
> >On Wed, Oct 18, 2006 at 04:05:24PM -0400, Neelesh Arora alleged:
> >>Hi All,
> >>
> >>In a setup like ours:
> >>multiple queues with different max_cputime limits, users specify 
> >>required cputime, jobs get routed accordingly
> >>there is a potential of resource hoarding, in that a user may submit an 
> >>interactive job with very high cputime and block a node for future use. 
> >>Since the job does not accrue cputime, it is not affected by the queue's 
> >>max_cputime limit.
> >>
> >>We would like to avoid this scenario and I can think of 3 ways to do it:
> >>- route interactive jobs to a separate queue (with limited max_cputime)
> >>- identify an interactive job at submission and set a very low walltime 
> >>(similar to above)
> >>- disable interactive jobs at the server level
> >>
> >>I am afraid I can't find the way to achieve any of these. Can someone 
> >>please apprise me how this can be done? How do others avoid such misuse?
> >
> >There isn't a way to enforce policy based on interactive jobs.
> >
> >Btw, understand that such a feature would always be trivial to bypass.  
> >Users can run the exact same commands in interactive and batch jobs.  My
> >own users like to do naughty things like this:
> >
> >  echo sleep 999999 | qsub
> >
> >The smarter ones do this:
> >  echo screen -m -D | qsub
> >
> >The best solution to these things is this:
> >  qselect -u <username> | xargs qdel
> >  chsh -s /bin/false <username>
> 
> While I agree that there will always be users who try to act smart, 
> that does not mean that shared systems like Torque should not provide 
> countermeasures, particularly when a feature of the system itself is 
> being misused. If interactive jobs can be misused to bypass all 
> fairshare/priority/usage based policies, then a workaround should be 
> offered.

Can maui/moab kill idle jobs?  I think this might be what you really want.

What about a minimum cput *rate*?  It would be more complicated, but we
could have pbs_mom kill jobs that sit around without gaining cput.  That
would even be trustworthy now that cput updates are getting fixed, though
I still think maui/moab will do this better.
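
To make the minimum-rate idea concrete, here is a rough sketch of the
arithmetic such a check might perform.  It is not an existing pbs_mom
feature: the function names and the threshold are hypothetical, and it
assumes cput and walltime are sampled twice in the HH:MM:SS form that
qstat -f reports for resources_used.cput and resources_used.walltime.

```shell
#!/bin/sh
# Hypothetical helpers, not part of TORQUE.

# Convert an HH:MM:SS string to a number of seconds.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Succeed (exit 0) if the job gained fewer than MIN_RATE cpu-seconds per
# wall-clock second between two samples; fail (exit 1) otherwise.
# Usage: below_min_rate CPUT_OLD CPUT_NEW WALL_OLD WALL_NEW MIN_RATE
below_min_rate() {
    awk -v c0="$(hms_to_seconds "$1")" -v c1="$(hms_to_seconds "$2")" \
        -v w0="$(hms_to_seconds "$3")" -v w1="$(hms_to_seconds "$4")" \
        -v minrate="$5" 'BEGIN {
            dw = w1 - w0
            if (dw <= 0) exit 1     # no elapsed time: cannot judge the job
            exit (((c1 - c0) / dw < minrate) ? 0 : 1)
        }'
}
```

A wrapper run from cron could sample a job twice, pass the two pairs of
values to below_min_rate, and qdel the job when the function succeeds.
A "sleep 999999" job would fall below almost any threshold, though a
user could still defeat the check by burning cpu on purpose.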

 
> Besides, having the ability to route jobs based on whether they are 
> batch or interactive is a useful job management feature in itself. So, I 
> guess I am asking for a couple of feature additions to Torque here.

I'm trying to push the idea that batch and interactive jobs should be
_equivalent_ as far as policy is concerned.  The admin shouldn't care
what form the user chooses to use, but merely the resources requested
and used.

 
> Would the powers that be please comment on how much work it would be to 
> add the ability to a) route jobs based on their type and/or b) disable 
> interactive jobs at the server level.

Trivial.  The only question is how the queue attribute would look.  

I'm pushing against this only because I feel like it is "yet another
attribute" cluttering up the manpages, and the actual purpose is better
handled in maui/moab.


