[Mauiusers] maui limits? looking for experience
Gareth.Williams at csiro.au
Sat Oct 1 05:57:58 MDT 2011
> -----Original Message-----
> From: Arnau Bria [mailto:arnaubria at pic.es]
> Sent: Friday, 30 September 2011 8:23 AM
> To: mauiusers at supercluster.org
> Subject: Re: [Mauiusers] maui limits? looking for experience
> On Fri, 30 Sep 2011 07:23:52 +1000
> Gareth.Williams at csiro.au wrote:
> Hi Gareth,
> > > This really improves maui behaviour. But limiting idle queue was
> > > last thing I wanted to do....
> > Idle limits are mostly good. This mostly limits the number of each
> > user's jobs that maui will consider in any scheduling cycle so it makes
> > the scheduling cycle shorter/faster. It also limits the priority
> > accumulated by queued jobs and alleviates 'queue stuffing'. I'd
> > recommend idle limits given that maui does not contain a better
> > facility to handle such issues.
> Yep, a short queue reduces maui stress. I completely agree with that.
> Setting a limit of 100 jobs per user leaves a 1k idle queue in normal
> behaviour, when many users are running jobs. That's the limit I'd use.
> But, as I've never tried this before, let me ask how maui will behave
> in this situation:
> if the farm is 70% full, and I have only two users who have submitted jobs
> (user A and B). User A has much more priority than user B, so let's say
> that the 30% must be filled with 25% of jobs from user A and 5% jobs
> from user B, if I have 1000 jobs in queue (500 from A and 500 from B)
> IDLE queue will contain 100 jobs of each user, so each scheduling
> cycle is going to schedule 200 jobs, is maui going to fill up the farm
> respecting our policies (25/5)? or is it going to start 100 jobs from
> each user on each scheduling cycle filling up the farm 15% and 15%?
I have to guess more than I'd like to in order to answer. However, it is worth noting first that the question presumes that maui starts many jobs in one cycle. I think this is rare (though I have no idea what the limit is, if there is one). More commonly, one hopes, the cluster is busy, so maui can only start a small number of jobs as other jobs finish. The highest-priority jobs reserve resources in the future to ensure they get started; other jobs may get backfilled. Only jobs that fit within the idle limits will be considered (and they must be in an execution queue too).
If the cluster is relatively idle, then it will probably fill up with work from whoever queues it first.
If you have target shares, you can only meet them with a relatively consistent balance if all the shareholders submit plenty of jobs that don't have very different requirements. Otherwise you need to be content with reasonable balance over the long term and fair-share scheduling helping the priority to slosh back and forth usefully.
So to your numbers. If the cluster is at 70%, either a lot of work finished at once or nobody has been queuing work for a while. If the latter is true, probably one person will queue first and fill most of the 30% (if they have enough jobs). When someone else submits jobs, if enough time has passed, fair-share will kick in and their jobs will get the higher start priority. If it's the former, jobs at the top of the priority list should get started. If the main factor is fair-share then this probably means mostly one person's jobs, so they can catch up to their target.
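
To make the reasoning above concrete, here is a rough toy model of a single scheduling cycle. It is only a sketch and not maui's actual algorithm: it assumes every job needs one slot, that priority is a fixed number per user, and it ignores reservations, backfill and the way fair-share shifts priority over time. The function name and its parameters are made up for illustration.

```python
from collections import defaultdict


def pick_jobs_to_start(queued, priority, idle_limit, free_slots):
    """Toy model of one scheduling cycle (NOT maui's real algorithm).

    queued:     list of (user, job_id) tuples in submission order
    priority:   dict mapping user -> priority (higher starts first)
    idle_limit: max idle jobs per user that the scheduler considers
    free_slots: number of single-slot jobs that can start right now
    """
    considered = []
    seen = defaultdict(int)
    # Only the first `idle_limit` jobs of each user are visible at all.
    for user, job_id in queued:
        if seen[user] < idle_limit:
            seen[user] += 1
            considered.append((user, job_id))
    # Highest priority first; the stable sort keeps submission order
    # among jobs of the same user.
    considered.sort(key=lambda job: -priority[job[0]])
    return considered[:free_slots]


# Hypothetical numbers from the question: 500 queued jobs each for users
# A and B, an idle limit of 100 per user, and 30 free slots on a farm of
# 100 slots. The priority values are arbitrary.
queued = [(user, i) for i in range(500) for user in ("A", "B")]
started = pick_jobs_to_start(queued, {"A": 10, "B": 1},
                             idle_limit=100, free_slots=30)

counts = defaultdict(int)
for user, _ in started:
    counts[user] += 1
print(dict(counts))
```

In this toy model all 30 free slots go to user A, so neither 25/5 nor 15/15 comes out of a single cycle. Any balance between the two users would only appear over time, as fair-share lowers A's priority relative to B's.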
If you have lots of very small jobs, well that is a challenge. Perhaps you can demonstrate to your users how to aggregate them into modest groups to get maybe 1-2hr jobs. Their overall throughput might increase... It might also be work that no-one wants to do.
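
As a sketch of what such aggregation could look like, the snippet below greedily packs short tasks into batches that each run for roughly a target wall time. It assumes users can estimate each task's duration in minutes; the function name, the 90-minute target and the example durations are all made up for illustration. Each resulting batch would then be submitted as one job that runs its tasks in sequence.

```python
def group_tasks(durations_min, target_min=90):
    """Greedily pack tasks, in order, into batches of about target_min minutes.

    durations_min: estimated run time of each task, in minutes
    Returns a list of batches, each a list of task indices.
    """
    batches = []
    current, total = [], 0
    for idx, minutes in enumerate(durations_min):
        # Close the current batch if adding this task would exceed the target.
        if current and total + minutes > target_min:
            batches.append(current)
            current, total = [], 0
        current.append(idx)
        total += minutes
    if current:
        batches.append(current)
    return batches


# Hypothetical workload: 100 tasks of about 5 minutes each.
batches = group_tasks([5] * 100, target_min=90)
print(len(batches), [len(b) for b in batches])
```

With these made-up numbers, 100 five-minute tasks become 6 jobs instead of 100, which is far less for the scheduler to consider in each cycle.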
> > > If I understand routing queues properly, they send jobs based on
> > > required resources. our jobs do not require any special resource,
> > > our users send jobs based on queue name that show time limits. So,
> > > I think that routing queues can't help here.
> > What is being proposed is that you have a routing queue setup with no
> > special resources, just one routing queue per execution queue (but
> > make it as fancy as you like - though simple is good). Put a limit
> > on the number of (users) jobs in the execution queue(s) (enough to
> > fill the cluster) but allow many jobs in the routing queue(s). Maui
> > only need consider the execution queue so its job becomes simpler
> > and it can be faster.
> ok. now I understand. So, "hide" jobs from maui using routing queues.
> > Gareth (who used maui for some time but doesn't now)
> I've not said that. I'm just asking for the experience of other admins
> (with much experience).
> Many thanks for your reply,