[Mauiusers] maui (or torque?) policy on node runtime and allocation

Steve Crusan scrusan at ur.rochester.edu
Thu Oct 13 11:45:49 MDT 2011


On Oct 13, 2011, at 1:22 PM, Jim Kusznir wrote:

> Hi all:
> 
> I would like to configure maui on my cluster to ensure that at least
> 60% of the cluster is available within 24 hours.  The goal here is to
> prevent users from consuming the majority of the cluster on jobs that
> take more than a day (thus causing turn-around problems for other
> users, many of whom have short jobs to queue).
> 
> I originally tried to do this with a default queue
> (max_walltime=24hrs) and a long queue (no max walltime), with the
> intent of restricting the long queue to no more than 33% of the
> nodes.  However, I haven't figured out how to do that.  At this
> point, it's not that important to me that I get it exactly like
> that, but if there's a way to ensure that at least 66% of the
> online cluster will be available within 24hrs, then that will work.


I was in a similar situation, where we had some infiniband nodes that we wanted to be dual-purpose. Basically, if no one is using the infiniband functionality, normal ethernet jobs should run on those nodes. What I didn't want was for users who queued their jobs to the infiniband queue to have to wait for normal jobs to finish.

Especially when you consider that our normal jobs can span 5 days, someone could be waiting in the infiniband queue a long time. Even worse, what if they wanted to run a job that spans ALL of those nodes? That could take quite a while to be scheduled.

Now, mind you, I did this with Moab, but I don't see why this wouldn't work with Maui.

Basically, I created a standing reservation on the infiniband nodes in question and set a maximum job time of 2 days (48 hours):

# SR - ibres
# only allow jobs running less than 48 hours to use the infiniband nodes
SRCFG[ibres]		QOSLIST=ibqos,shared MAXTIME=*48:00:00
SRCFG[ibres]		HOSTLIST=bh07[1-9],bh08[0-4]
SRCFG[ibres]		FLAGS=IGNSTATE,NOCHARGE
SRCFG[ibres]		PERIOD=INFINITY


The QOS pieces are particular to our setup, but this should work. In a nutshell, any jobs that belong to either the shared or the ibqos QOS have access to those nodes. The QOSes themselves are mapped to the queues via CLASSCFG...
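For reference, a rough sketch of the kind of CLASSCFG/QOSCFG mapping I mean is below. The class and QOS names here are just placeholders for whatever your queues are actually called, so treat it as an illustration rather than a copy/paste config:

# give the infiniband QOS some priority and map each Torque queue (class)
# to a default QOS
QOSCFG[ibqos]		PRIORITY=100
CLASSCFG[ib]		QDEF=ibqos
CLASSCFG[batch]		QDEF=shared

With QDEF set like that, jobs submitted to the "ib" or "batch" queues pick up the matching QOS automatically, which is what lets them through the QOSLIST on the reservation above.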

A benefit of this is that MANY short-running jobs get placed on these nodes, so we've been seeing better scheduling efficiency as well.

This might work for you.
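For your specific case, the same trick should translate pretty directly: put roughly two thirds of the nodes into a standing reservation that only admits jobs requesting 24 hours or less, and confine everything longer to the remaining third. Something along these lines, where the host list is obviously made up and the attributes follow the Moab-style syntax above (double check the exact SRCFG attribute names against the Maui admin guide):

# SR - shortres
# only jobs requesting <= 24 hours of walltime may use these nodes (~66%
# of the cluster); longer jobs are confined to the nodes outside the SR
SRCFG[shortres]		MAXTIME=24:00:00
SRCFG[shortres]		HOSTLIST=node0[01-40]
SRCFG[shortres]		FLAGS=IGNSTATE
SRCFG[shortres]		PERIOD=INFINITY

Once it's in place, showres / diagnose -r will show the reservation and let you verify which nodes it covers.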


> 
> I've been trying to figure out how to do that with maui's docs, but I
> don't really know what I'm looking for, so I haven't been able to
> locate it.
> 
> Thanks!
> --Jim

 ----------------------
 Steve Crusan
 System Administrator
 Center for Research Computing
 University of Rochester
 https://www.crc.rochester.edu/



