[torqueusers] User's job can mess up the system so that no jobs run

Atwood, Robert C r.atwood at imperial.ac.uk
Tue Sep 11 12:06:10 MDT 2007


 
David Singleton wrote:
> Jeroen van den Muyzenberg wrote:
> > This really isn't a torque issue, but site-defined policy.
> 
> Yes and no.  We are talking about PBS specific files here.
> 

I see a bit of both sides, but from my point of view the PBS-specific
nature dominates. Ideally Torque, or whatever queueing system, would be
installed as directed, and although 'bad' jobs might crash themselves,
they should not easily be able to crash other jobs. In addition,
implementing quotas on the nodes' local scratch drives involves patching
into the cluster-builders' startup scripts etc. A whole can of worms!
And David's examples are not too far off: it may be sort of acceptable
for one user's mistake to crash all of that particular user's jobs so
long as other users are okay, but it's better if that does not happen
either. Thanks for the patch, David! I'll see if I can figure out what
needs changing for Torque.

