[Mauiusers] maui limits? looking for experience

Arnau Bria arnaubria at pic.es
Wed Oct 5 08:04:36 MDT 2011


On Sat, Oct 1, 2011 at 1:57 PM, <Gareth.Williams at csiro.au> wrote:

> > -----Original Message-----
> > From: Arnau Bria [mailto:arnaubria at pic.es]
> > Sent: Friday, 30 September 2011 8:23 AM
> > To: mauiusers at supercluster.org
> > Subject: Re: [Mauiusers] maui limits? looking for experience
> >
> > On Fri, 30 Sep 2011 07:23:52 +1000
> > Gareth.Williams at csiro.au Gareth.Williams at csiro.au wrote:
> >
>
> Hi Arnau,
>

Hi Gareth,


> I have to guess more than I'd like to in order to answer. However, it might
> be good to start by noting that the question presumes that maui starts many
> jobs in one cycle. I think this is rare (but have no idea what the limit is
> if there is one).  Most commonly, one might hope the cluster is busy so maui
> can only start a small number of jobs when other jobs finish. Jobs with
> highest priority reserve resources in the future to ensure they get started,
> other jobs may get backfilled. Only jobs that fit within the idle limits
> will be considered (for which they must be in an execution queue too).
>

Ideally it should start only a few jobs, but some cycles can start more than
200 jobs. There are two reasons: our "big" scheduling cycle (we need it because
maui does not respond to user commands while it is scheduling), and the fact
that maui stops scheduling jobs when it finds a "new" node busy (a "bug" we
have been seeing since we increased our farm size; I already reported it here,
and maybe I should open a bug). As the ideal IDLE limit should be around 100
jobs per user, we'll be facing the issue previously described.
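
To illustrate the concern with these numbers, here is a minimal sketch of a single cycle. It is a toy model, not Maui's actual algorithm: it assumes priority is per user rather than per job, and the function name, user names and queue sizes are hypothetical. It only shows that a per-user IDLE limit smaller than the number of slots freed in one cycle pushes the remaining slots to lower-priority users.

```python
def start_jobs(free_slots, idle_limit, queued_by_user):
    """Toy model of one scheduling cycle (not Maui's real logic).

    queued_by_user: list of (user, queued_job_count) pairs, ordered from
    highest to lowest fair-share priority (an assumption of this sketch).
    Returns a dict {user: jobs_started}.
    """
    started = {}
    for user, queued in queued_by_user:
        if free_slots == 0:
            break
        # Jobs beyond the per-user idle limit are invisible to the scheduler.
        visible = min(queued, idle_limit)
        n = min(visible, free_slots)
        started[user] = n
        free_slots -= n
    return started


# Hypothetical queue: user_a has the highest fair-share priority.
queue = [("user_a", 1000), ("user_b", 500)]

print(start_jobs(200, 100, queue))   # idle limit of 100 jobs per user
print(start_jobs(200, 1000, queue))  # idle limit large enough to see all jobs
```

With a limit of 100, user_a gets only 100 of the 200 slots and user_b takes the rest; with a limit that exposes the whole queue, all 200 slots go to user_a.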

> If the cluster is relatively idle, then it will probably fill up with work
> from whoever queues it first.
>
> If you have target shares, you can only meet them with a relatively
> consistent balance if all the shareholders submit plenty of jobs that don't
> have very different requirements.  Otherwise you need to be content with
> reasonable balance over the long term and fair-share scheduling helping the
> priority to slosh back and forth usefully.
>

Our case is the first one you describe and, after some tests, we've found a
good FS configuration and we're quite happy with the FS targets. But as we have
many users/groups, we need a "long" IDLE queue in order to have enough jobs
from every group/user.

group       usage    target
base         8.5%    -
lhcatlas    38.46%   37.42%
lhccms      23.53%   22.16%
lhclhcb     14.81%   14.39%
lhtier2     14.48%   14.24%
localat3     0.16%    1.9%
magic        0.06%    3.8%
pau          0.0%     3.8%
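
As a quick check of how close these figures are, the sketch below computes the difference between the two columns per group. It assumes the first column is measured usage and the second the FS target (the first column sums to 100%), and it skips "base", which has no second value. The function and variable names are hypothetical.

```python
# (usage %, target %) per group, copied from the table above.
# Column meaning is an assumption: first = usage, second = FS target.
shares = {
    "lhcatlas": (38.46, 37.42),
    "lhccms": (23.53, 22.16),
    "lhclhcb": (14.81, 14.39),
    "lhtier2": (14.48, 14.24),
    "localat3": (0.16, 1.9),
    "magic": (0.06, 3.8),
    "pau": (0.0, 3.8),
}


def deviations(shares):
    """Return usage minus target, in percentage points, for each group."""
    return {group: round(usage - target, 2)
            for group, (usage, target) in shares.items()}


for group, delta in deviations(shares).items():
    print(f"{group:<10} {delta:+.2f}")
```

Under that reading, the four large groups are within about 1.4 points above their targets, while the three small groups are below theirs.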


> So to your numbers.  If the cluster is at 70% either a lot of work finished
> at once or nobody has been queuing work for a while. If the latter is true,
> probably one person will queue first and fill most of the 30% (if they have
> enough jobs).  When someone else submits jobs, if enough time has passed,
> fair-share will kick in and their jobs will get the higher start priority.
> If it's the former, jobs at the top of the priority list should get
> started.  If the main factor is fair-share then this probably means mostly
> one person's jobs so they can catch up to their target.
>

The example you're describing is an understandable scenario, but it assumes
that we have no queued jobs. That IDLE 30% could instead be due to a drain of
some blades of our farm. Then, when we set all those nodes online again (with
more than 1k jobs in the queue) and maui is not seeing the "real" queue, it's
going to fill up the farm without respecting FS.

Maybe this situation isn't the most critical one (I could control it by hand),
but I'm really worried about the normal scheduling cycle we see in our farm.
Last week we gave the IDLE limit another chance and we saw really bad results.



> If you have lots of very small jobs, well that is a challenge.  Perhaps you
> can demonstrate to your users how to aggregate them into modest groups to
> get maybe 1-2hr jobs.  Their overall throughput might increase...  It might
> also be work that no-one wants to do.
>

Unfortunately 85% of our jobs are grid jobs, and changing "users" behaviour
is really complicated.

I'll play with NODEPOLLFREQUENCY and JOBAGGREGATIONTIME and see if I get an
increase in maui performance.


> Gareth
>

Many thanks for your help Gareth,
Cheers,
Arnau