[Mauiusers] Backfill and node reservation

Bogdan Costescu bcostescu at gmail.com
Mon Nov 15 14:53:23 MST 2010


On Mon, Nov 15, 2010 at 18:49, Arnau Bria <arnaubria at pic.es> wrote:
> My idea is: sending a high priority job that requests all of a node's
> CPUs will prevent other jobs from running on that node unless backfill
> is enabled.

That sounds OK.

>> Why? What happens on the node to make it unusable for the backfilled
>> short jobs that could run in the meantime?
>
> A queue? I mean, if my job is the first one, why does the second one
> start before mine? So is only some kind of order taken into
> consideration?

You still haven't answered my question: why do you want to disable
backfill? I'm not trying to argue for or against it - I'd just like
to understand.
Backfill will allow a lower priority job to start before a higher
priority one if the lower priority job can finish (according to the
walltime it requested) within the window before the higher priority
job is able to start. This is the design of backfill - to allow short
jobs to fill in the empty time which would otherwise be wasted. If you
want to maintain a strict priority based policy, then you should
disable backfilling.
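
To make the rule concrete, here is a minimal sketch in Python of the
decision as I have described it above. It is not Maui's actual code:
the function name, the single-node view and the numbers are
assumptions made up purely for illustration.

```python
def can_backfill(now, job_walltime, reserved_start):
    """Return True if a job started at `now` would finish no later than
    the time the higher priority job is expected to start.

    All values are in seconds. Simplified model (an assumption, not
    Maui's implementation): one node, one higher priority job.
    """
    window = reserved_start - now  # idle time before the deadline
    return job_walltime <= window


# The high priority job is expected to start in 2 hours on this node.
now = 0
reserved_start = 2 * 3600

print(can_backfill(now, 30 * 60, reserved_start))   # 30 minute job
print(can_backfill(now, 3 * 3600, reserved_start))  # 3 hour job
```

The 30 minute job fits in the window and may start first; the 3 hour
job does not fit and has to wait behind the high priority one.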

>> > But this scenario is not working. It seems that backfill is not
>> > disabled, because top-of-queue jobs "are not blocking" low
>> > priority jobs.

I think that you don't understand how backfilling works and we are
talking about different things. Please read my paragraph above and
re-read the docs.

> Hope so, cause it's clear in my mind :-) Now I'd like to understand
> if it's a software problem or conceptual one.

Now it seems very likely a conceptual one ;-)

> If my job starts, it means that the node is empty, so the reboot is
> safe.

You are right. However, this approach would actually benefit from
backfilling, as opposed to the solution I have proposed - the node is
not marked offline, so short low priority jobs could still run on this
node if they fit in the window.
This is similar in spirit to the approach Roy Dragseth has suggested, which
uses a reservation to provide a deadline; in your case it's the
starting of the high-priority job which provides the deadline, but in
both cases backfilling could still use the window until the deadline
to run short lower priority jobs. (And Roy gets extra points for
providing a Maui based suggestion on a Maui related mailing list,
unlike me ;-))

> I don't know why my idea is so bad :-)

It's not bad, but badly explained so far ;-)

> Is it simpler than my idea?
> We could use puppet or whatever to do so, but adding a simple line to
> sudo that allows a special user to reboot and an rc.local script are
> the only configuration needed.

Hmm, let's see how many components you rely on: Maui on the master,
<new script> on the master, puppet daemon on the master, puppet client
on the nodes, cron on the nodes, sudo on the nodes, LDAP or again
puppet (master + clients) or initial automated setup to have a
uniformly distributed user to run the commands as. That is quite a
lot of components to set up and, especially, to keep working together
_synchronized_. Then there is the issue of keeping that uniformly
distributed user secure - I consider a cluster to be useless if it can
be randomly rebooted by someone other than the admin.

As opposed to this, the solution which I have proposed uses: Torque on
the master, <new script> on the master, remote reboot capability
(ssh/ipmitool/func/etc.).

Another difference is that your script requires knowledge about the
number of CPUs that the nodes have, to submit jobs that exactly fill
each node; mine doesn't and Roy's suggestion with reservations also
doesn't. Guess which one is simpler when you have nodes with varying
numbers of CPUs ;-)
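
As a rough illustration of that extra bookkeeping, the sketch below
(Python) builds one node-filling resource request per node. The
inventory dictionary and the function name are hypothetical, and I am
assuming the usual Torque "nodes=<host>:ppn=<n>" request form; your
script may well do this differently.

```python
def full_node_request(node, ncpus):
    """Build a Torque-style resource request asking for every CPU of one node."""
    if ncpus < 1:
        raise ValueError("a node needs at least one CPU")
    return f"nodes={node}:ppn={ncpus}"


# Hypothetical inventory: node name -> number of CPUs.
# This table has to be kept correct for every node in the cluster.
cpus_per_node = {"node01": 4, "node02": 8, "node03": 8}

requests = [full_node_request(n, c) for n, c in sorted(cpus_per_node.items())]
for r in requests:
    print(r)
```

If the table says 4 for a node that really has 8 CPUs, the job no
longer fills the node and other jobs can run next to it, so "my job
started" no longer means "the node is empty".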

I like things to be as simple as possible - it makes debugging
exponentially simpler.

Cheers,
Bogdan

