[torqueusers] output/error file size limit?

Garrick Staples garrick at usc.edu
Fri Oct 28 10:40:26 MDT 2005


On Fri, Oct 28, 2005 at 05:08:17PM +0100, gianfranco sciacca alleged:
> > You could try setting quotas on $PBS_HOME/spool on each of the compute
> > nodes, if the OS and filesystem you're using support them.
> > 
> > 	--Troy
> 
> Setting user quotas on the spool directory is feasible. How is PBS going
> to treat a running job that happens to exceed the quota?

Off the top of my head, I suspect MOM won't notice.  Since it happily
fills up partitions and doesn't abort the job when writes fail, it will
probably not notice when writes fail from quota violations.

We probably need to check for a write success somewhere.
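
For illustration only, here is what "checking for a write success" could
look like. This is a Python sketch of the idea, not MOM's actual code (MOM
is written in C), and the helper name write_all is made up for this example.
It treats ENOSPC (partition full) and EDQUOT (quota exceeded) as the two
failures a caller would want to act on, for instance by aborting the job.
EDQUOT is assumed to be available, which holds on Linux and most Unix
systems.

```python
import errno
import os


def write_all(fd, data):
    """Write all of data to fd.

    Returns None on success, or a short error string when the write
    failed because the filesystem is full or the user is over quota.
    Any other OSError is re-raised unchanged.
    """
    view = memoryview(data)
    while view:
        try:
            written = os.write(fd, view)  # may be a short write
        except OSError as exc:
            if exc.errno in (errno.ENOSPC, errno.EDQUOT):
                return "write failed: " + os.strerror(exc.errno)
            raise
        view = view[written:]  # keep going with whatever is left
    return None
```

The point is only that the return value of every write is inspected, so a
full partition and a quota violation look the same to the caller and can
both be handled.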

Another option is to use the MOM health check script.  Have it check the
sizes of the files in the spool directory and raise an error if one gets
too big.  Maui can coarsely kill jobs on nodes that raise errors.  Moab has
fine-grained control based on the specific error message.
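
A rough sketch of such a health check, in Python. Several things here are
assumptions rather than anything stated above: the spool is taken to be
$PBS_HOME/spool with a fallback of /var/spool/torque (a guess at a common
install path), the 1 GiB limit is arbitrary, and the script is assumed to
be hooked in through the $node_check_script setting in MOM's config, with
MOM acting only on output that begins with ERROR. Check all of that against
the documentation for your TORQUE version.

```python
#!/usr/bin/env python3
"""Sketch of a MOM health check: flag oversized files in the spool dir."""
import os
import sys

# Assumptions, not TORQUE defaults: adjust both for your site.
SPOOL_DIR = os.path.join(os.environ.get("PBS_HOME", "/var/spool/torque"),
                         "spool")
MAX_BYTES = 1024 ** 3  # 1 GiB per output/error file


def oversized_files(spool_dir, max_bytes):
    """Return sorted (path, size) pairs for regular files above max_bytes."""
    found = []
    try:
        entries = list(os.scandir(spool_dir))
    except OSError:
        return found  # spool missing or unreadable: nothing to report here
    for entry in entries:
        try:
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                if size > max_bytes:
                    found.append((entry.path, size))
        except OSError:
            continue  # file vanished between listing and stat
    return sorted(found)


def main():
    big = oversized_files(SPOOL_DIR, MAX_BYTES)
    if big:
        names = ", ".join(
            "%s (%d bytes)" % (os.path.basename(path), size)
            for path, size in big
        )
        # Assumed convention: the message must start with "ERROR".
        print("ERROR spool file too large: " + names)
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

The script only reports; what happens to the job is left to the scheduler,
as described above. Spool file names normally carry the job id, so the
message should be enough to tell which job is responsible.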

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California