[Mauiusers] Backfill and node reservation

Arnau Bria arnaubria at pic.es
Mon Nov 15 15:23:21 MST 2010


On Mon, 15 Nov 2010 22:53:23 +0100
Bogdan Costescu wrote:

Hi Bogdan,

first of all, thanks for your reply.
I feel that I'm getting a lot of help with this issue now.

> >> Why ? What happens on the node to make it unusable for the
> >> backfilled short jobs that could run in the meantime ?
> >
> > A queue? I mean, if my job is the first one, why does the second
> > one start before mine? So only some kind of order is taken into
> > consideration?
> 
> You still haven't answered my question: why do you want to disable
> backfill ? I'm not trying to argue for or against it - I'd just like
> to understand.
> Backfill will allow a lower priority job to start before a higher
> priority one if the time to finish the lower priority one fits within
> the window. This is the design of backfill - to allow short jobs to
> fill in the empty time which would otherwise be wasted. If you want to
> maintain a strict priority based policy, then you should disable
> backfilling.

Ok, now I get your point, and you're completely right: why disable
backfill if it ensures that the node keeps running other SHORT jobs
while it drains? The problem, which I haven't explained clearly, is
that the "reserved" node is not starting only SHORT jobs but any job
at all. So it's not respecting the window. (The important detail is
that 98% of the jobs on our cluster run in long queues, more than 24
hours, so in my case backfill should "block" 98% of the jobs; starting
my question with this info would have helped a lot!)
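
For reference, this is roughly how I understand the knobs involved - a
sketch of the relevant maui.cfg lines, where the values are just
examples and not our actual configuration:

    # maui.cfg
    BACKFILLPOLICY     FIRSTFIT       # start any queued job that fits the window
    RESERVATIONPOLICY  CURRENTHIGHEST # reserve nodes for the highest-priority idle job
    RESERVATIONDEPTH   1              # number of priority reservations to maintain

    # to enforce strict priority order instead, backfill can be disabled:
    # BACKFILLPOLICY   NONE

With FIRSTFIT I'd expect exactly what you describe: only jobs whose
wallclock limit fits inside the window should be started.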

So, for example, imagine my node (with 4 CPUs) is running 3 jobs and
the longest one will run for 3 more hours. I submit my high-priority
job, so from then on Maui should only start jobs that fit the window,
in other words jobs no longer than 3 hours. But it's starting any job
from the queue, the first one, no matter whether it's 2 hours or 24
hours long.
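
One thing I'm going to double-check, since backfill decisions are based
entirely on the wallclock limits the jobs declare (the script name below
is made up for illustration):

    # show the current backfill window
    showbf

    # a job declaring a 2-hour limit, which should fit a 3-hour window
    echo ./short_task.sh | qsub -l walltime=02:00:00 -l nodes=1:ppn=1

    # and ask Maui why a given job was or wasn't started
    checkjob <jobid>

If jobs are submitted without -l walltime, Maui only sees the queue
default, so the window check cannot work as expected.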

Did I explain myself clearly now? Do you see now why I'm saying that
backfill is not working? Sorry, Bogdan, I left out some important
info. I apologize for that.


[...]


> > I don't know why my idea is so bad :-)
> 
> It's not bad, but badly explained so far ;-)

Completely agree. Sorry again.
 
> > Is it simpler than my idea?
> > We could use puppet or whatever to do so, but a simple sudo line
> > that allows a special user to reboot and an rc.local script are
> > the only configuration needed.
> 
> Hmm, let's see how many components you rely on: Maui on the master,
> <new script> on the master, puppet daemon on the master, puppet client
> on the nodes, cron on the nodes, sudo on the nodes, LDAP or again
> puppet (master + clients) or initial automated setup to have a
> uniformly distributed user to run the commands as. There are quite a
> lot of them to be set up and especially to work together
> _synchronized_. Then there is the issue of keeping that uniformly
> distributed user secure - I consider a cluster to be useless if it can
> be randomly rebooted by someone else than the admin.
> 
> As opposed to this, the solution which I have proposed uses: Torque on
> the master, <new script> on the master, remote reboot capability
> (ssh/ipmitool/func/etc.).
> 
> Another difference is that your script requires knowledge about the
> nr. of CPUs that the nodes have, to submit jobs to exactly fill each
> node; mine doesn't and Roy's suggestion with reservations also
> doesn't. Guess which one is simpler when you have nodes with varying
> nr. of CPUs ;-)
> 
> I like things to be as simple as possible - makes the debugging
> exponentially simpler.

Ok, so maybe mine needs more components, but when managing a cluster
you probably already have something like CFEngine, Puppet, Rocks, a
complete kickstart file or whatever that makes these things
easy/automatic...
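
Just to show what I mean by "simple" on the node side, this is roughly
all the per-node configuration (the user name "drainer" is made up for
illustration):

    # /etc/sudoers fragment: let one dedicated user reboot the node,
    # and nothing else
    drainer ALL=(root) NOPASSWD: /sbin/reboot

    # /etc/rc.local: nothing Torque-specific is needed here if pbs_mom
    # is started at boot; the node then just rejoins the cluster

That, plus the job that calls "sudo /sbin/reboot" once it is alone on
the node, is the whole setup.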


I'm still not sure how to proceed with this issue. I still like my
idea, because with my current infrastructure it seems simple, but
seeing that many of you do these things from the master... that makes
sense.
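
For the master-side variant I imagine something along these lines - a
rough, untested sketch, where the IPMI details are placeholders:

    #!/bin/sh
    # on the master: reboot drained nodes and bring them back online
    for node in $(pbsnodes -l | awk '/offline/ {print $1}'); do
        # skip nodes that still have jobs running on them
        pbsnodes "$node" | grep -q 'jobs = ' && continue
        ssh root@"$node" /sbin/reboot
        # or out-of-band:
        # ipmitool -H "$node"-ipmi -U admin -P secret chassis power cycle
        sleep 300               # crude: give the node time to come back
        pbsnodes -c "$node"     # clear the offline flag again
    done

It needs fewer components, as you said, though the reboot/online timing
would need something more robust than a fixed sleep.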

> Cheers,
> Bogdan
Many thanks for your reply, Bogdan,
Cheers,
Arnau

