rjh+maui at cita.utoronto.ca
Thu Mar 10 17:40:54 MST 2005
Chris Samuel suggested I ask the experts on this list how I might
set up suspend/resume with maui, and how well I might expect it to work.
The goal would be to let short (say < 30min) serial and medium parallel
(up to say 32 node/64 cpu) jobs run quickly so that users can see if
their batch scripts and job startup work before launching a large run.
Essentially a development/testing queue.
I've only recently started learning about tweaking maui, but from what
I've read it seems possible to set up a QOS within maui that picks up
short-walltime jobs and flags them as PREEMPTORs, while long-walltime
jobs are flagged as PREEMPTEEs.
Am I right in thinking that maui can successfully suspend and resume
long jobs to let short devel jobs run?
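For concreteness, a rough and untested sketch of the sort of maui.cfg
fragment this would involve. The QOS names (devel, long) and the Torque
queue names used as classes are hypothetical placeholders, and whether
suspend/resume behaves well with these particular versions is exactly
the open question:

```
# maui.cfg fragment (untested sketch; QOS and class names are hypothetical)

# suspend preempted jobs rather than requeueing or cancelling them
PREEMPTPOLICY     SUSPEND

# short development/test jobs are allowed to preempt
QOSCFG[devel]     QFLAGS=PREEMPTOR

# long production jobs may be suspended to make room
QOSCFG[long]      QFLAGS=PREEMPTEE

# map each Torque queue (class) to its default QOS
CLASSCFG[devel]   QDEF=devel
CLASSCFG[long]    QDEF=long
```

The walltime limit for the devel queue (say 30 minutes) would presumably
be enforced on the Torque side, for example with a queue-level
resources_max.walltime setting.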
Our cluster runs Red Hat 7.3 (argh! old!), torque-1.1.0p5snap025-2 and
maui-3.2.6p11-1, and parallel jobs use lam >= 7.1 (tm boots the lamds).
Robin Humble http://www.cita.utoronto.ca/~rjh/