[Mauiusers] REQUEUE and adjustment of resources

Yaroslav Halchenko lists at onerussian.com
Thu Sep 20 11:41:27 MDT 2007


Since my inquiry got no reply, and since I found no way to have the
requested memory resource adjusted dynamically, I hacked up a script
which monitors the jobs and adjusts their requested memory if needed.
If a suspended job can be resumed on the node it was running on (i.e. if
there is a sufficient amount of memory left), the script resumes it
(although maui seems to be smart enough to do that on its own). If no
memory is available at the moment, the script waits for a specified
amount of time, and if memory becomes available it resumes the job
(sometimes before maui figures things out ;-)). If memory does not
become available, it requeues the job to be run with an adjusted
memory request.
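
To make the flow above concrete, here is a minimal sketch of the
resume / wait / requeue decision. It is not taken from the attached
script: the names SuspendedJob, adjusted_request and decide, the
256 MB default margin, and the idea that the new request is "observed
usage plus a margin" are all assumptions made for illustration. It
also assumes node_free_mb is the memory on the node that the job could
claim if it were resumed. No scheduler commands are called; the sketch
only decides what should happen next.

```python
from dataclasses import dataclass


@dataclass
class SuspendedJob:
    """Hypothetical record for a job found in the Suspended state."""
    job_id: str
    requested_mb: int  # memory the job originally asked for
    used_mb: int       # memory it was using when it got suspended


def adjusted_request(job, margin_mb=256):
    """New memory request: observed usage plus a margin (assumed policy).

    The result is never lower than the original request.
    """
    return max(job.requested_mb, job.used_mb + margin_mb)


def decide(job, node_free_mb, waited_s, max_wait_s, margin_mb=256):
    """Return 'resume', 'wait' or 'requeue' for a suspended job."""
    needed_mb = adjusted_request(job, margin_mb)
    if needed_mb <= node_free_mb:
        # Enough memory is left on the node: resume in place.
        return "resume"
    if waited_s < max_wait_s:
        # Not enough memory yet, but the waiting period is not over.
        return "wait"
    # Memory never became available: requeue with the adjusted request.
    return "requeue"
```

A monitoring loop would call decide() for every suspended job on each
pass and then issue whatever resume or requeue command the site uses,
passing adjusted_request(job) as the new memory request on requeue.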

Email notifications are sent whenever a job is found in the Suspended
state and whenever it is resumed or requeued. A notification is also
sent if the script exits.
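
The notifications could be built along these lines. This is only a
sketch, not the attached script's code: the function
build_notification, the EVENTS table and the "[memory_control]" subject
prefix are hypothetical. The message is only constructed here; actually
sending it (for example through a local SMTP server) is left to the
caller.

```python
from email.message import EmailMessage

# Events described above, mapped to human-readable subject text.
# The keys and wording are illustrative assumptions.
EVENTS = {
    "suspended": "found in Suspended state",
    "resumed": "resumed",
    "requeued": "requeued with an adjusted memory request",
    "exit": "monitoring script exited",
}


def build_notification(event, sender, recipient, job_id=None, detail=""):
    """Build (but do not send) a notification email for one event."""
    if event not in EVENTS:
        raise ValueError("unknown event: %r" % (event,))
    if job_id is not None:
        subject = "[memory_control] job %s: %s" % (job_id, EVENTS[event])
    else:
        subject = "[memory_control] %s" % EVENTS[event]
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # Fall back to the subject line when no extra detail is given.
    msg.set_content(detail or subject)
    return msg
```

The returned message can then be handed to smtplib, e.g. with
smtplib.SMTP(host).send_message(msg), where host is whatever mail relay
the cluster uses.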

The script is a hack, so please don't judge me too harshly -- I am
releasing it in the hope that it might be useful to someone else as well.

On Thu, 13 Sep 2007, Yaroslav Halchenko wrote:

> Hi

> We imposed a memory policy on the jobs, so any job exceeding its
> requested amount gets suspended:

> RESOURCELIMITPOLICY     MEM:EXTENDEDVIOLATION:SUSPEND:00:01:00

> I am looking at the other choice -- to requeue the job. But I can't
> comprehend what that is for, since for deterministic jobs memory
> consumption would be the same. The logical thing would be to have some
> automated way to raise the requested memory by some amount (or maybe
> even by the amount by which the job exceeded its request earlier, plus
> some margin).

> I wondered whether I had missed some maui configuration which would
> allow such automation. Otherwise I guess I will have to run a small
> cron job which would requeue suspended jobs with an adjusted memory
> resource request.

> FWIW we are running maui 3.2.6p20-snap.1176920941-1
-- 
                                  .-.
=------------------------------   /v\  ----------------------------=
Keep in touch                    // \\     (yoh@|www.)onerussian.com
Yaroslav Halchenko              /(   )\               ICQ#: 60653192
                   Linux User    ^^-^^    [175555]


-------------- next part --------------
A non-text attachment was scrubbed...
Name: memory_control.py
Type: text/x-python
Size: 11072 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/mauiusers/attachments/20070920/22da8d99/memory_control.py

