[Mauiusers] maui (or torque?) policy on node runtime and allocation
scrusan at ur.rochester.edu
Thu Oct 13 11:45:49 MDT 2011
On Oct 13, 2011, at 1:22 PM, Jim Kusznir wrote:
> Hi all:
> I would like to configure maui on my cluster to ensure that at least
> 60% of the cluster is available within 24 hours. The goal here is to
> prevent users from consuming the majority of the cluster on jobs that
> take more than a day (thus causing turn-around problems for other
> users, many of whom have short jobs to queue).
> I originally tried to do this with a default queue
> (max_walltime=24hrs) and a long queue (no max walltime), with the
> intent of restricting the long queue to no more than 33% of the nodes.
> However, I haven't figured out how to do that. At this point, it's
> not that important to me that I get it exactly like that, but if
> there's a way to ensure that at least 66% of the online cluster will
> be available within 24hrs, then that will work.
I was in a similar situation, where we had some InfiniBand nodes that we wanted to be dual purpose. Basically, if no one is using the InfiniBand functionality, normal Ethernet jobs should run on those nodes. What I didn't want to happen was that users who queued their jobs to the InfiniBand queue would have to wait for normal jobs to finish.
Especially if you consider that our normal jobs span 5 days, someone could be waiting in the InfiniBand queue a long time. Even worse, what if they wanted to run jobs that span ALL the nodes? That could take quite a while to be scheduled.
Now, mind you, I did this with Moab, but I don't see why it wouldn't work with Maui.
Basically, I created a standing reservation covering the InfiniBand nodes, and set a maximum job time of 2 days on it.
# SR - ibres
# only allow jobs running less than 48 hours to use the infiniband nodes
SRCFG[ibres] QOSLIST=ibqos,shared MAXTIME=*48:00:00
The QOS pieces are particular to our setup, but the same approach should work for you. In a nutshell, any job that belongs to the shared OR ibqos QOS has access to those nodes. The QOSes are mapped via CLASSCFG...
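A rough sketch of what the surrounding pieces could look like in the scheduler config. The class names (batch, ib) and host names are hypothetical placeholders, not our actual setup, and the HOSTLIST and PERIOD values are assumptions. Check parameter support against your Maui version.

```
# HYPOTHETICAL sketch: class names and host names are placeholders

# map each queue (class) to a default QOS
CLASSCFG[batch] QDEF=shared
CLASSCFG[ib]    QDEF=ibqos

# pin the standing reservation to the dual-purpose nodes and keep it
# in place indefinitely (assumed values, adjust to your site)
SRCFG[ibres] HOSTLIST=node01,node02,node03,node04 PERIOD=INFINITY
```

For the case in the original question, the reservation would cover roughly two thirds of the nodes with a 24 hour limit, leaving the remaining third open to long jobs.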
A benefit of this is that MANY short-running jobs are placed on these nodes, so we're seeing better scheduling efficiency as well.
This might work for you.
> I've been trying to figure out how to do that with maui's docs, but I
> don't really know what I'm looking for, so I haven't been able to
> locate it.
Center for Research Computing
University of Rochester