[Mauiusers] nodeavailabilitypolicy

Renato Borges renato.callado.borges at gmail.com
Thu Dec 16 03:15:30 MST 2010

Hi Abhi!

On Wed, Dec 15, 2010 at 7:21 PM, Abhishek Gupta <abhig at princeton.edu> wrote:

> Hi,
> I am trying to figure out a way to keep memory usage from exceeding
> the available memory on a node. I thought that this parameter (
> NODEAVAILABILITYPOLICY COMBINED:MEM ) would check node availability
> on the basis of available memory, but it does not.
> Is there anything else I need to add to make it work?
> Thanks,
> Abhi.

I've never used NODEAVAILABILITYPOLICY, but I have a similar problem, which
is: the jobs we run at my site start out with a small memory footprint, and
end with large amounts of data in memory (in virtualization lingo, they
"balloon"). Maybe this is also your case, and this is why setting this
variable doesn't work?

To avoid swapping, I have set a MAXJOBPERUSER variable for each compute
node, because all of our jobs that have an increasing memory footprint come
from a single user (actually, a grid account).

Tweaking the MAXJOBPERUSER variable, I have found a value for each node (we
have a heterogeneous cluster) that runs the jobs without swapping.

However, this is not ideal because the setting applies to every user who
runs jobs on a given node. Some local users have jobs that are small in
memory but large in number of cores, and the limits I set for the grid
jobs are too restrictive for them. The grid account can only run 4 jobs
on an 8-core, 8GB RAM node, whereas local users' jobs could merrily run on
all 8 cores simultaneously.
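
The arithmetic behind a per-node limit like this can be sketched as below. This is only a rough starting point, not how I arrived at my values (those came from tweaking). The helper name max_jobs is made up for illustration, and the 2GB peak per job is an assumed figure, not a measurement.

```shell
#!/bin/sh
# Hypothetical helper: estimate how many ballooning jobs fit on a node.
# Usage: max_jobs CORES RAM_MB PEAK_MB
max_jobs() {
    cores=$1
    ram_mb=$2
    peak_mb=$3
    # Jobs that fit by memory alone; integer division rounds down.
    by_mem=$(( ram_mb / peak_mb ))
    # Never run more jobs than there are cores.
    if [ "$by_mem" -lt "$cores" ]; then
        echo "$by_mem"
    else
        echo "$cores"
    fi
}

# 8 cores, 8GB RAM, assumed 2GB peak per job: prints 4
max_jobs 8 8192 2048
```

The result is what one would then try as the MAXJOBPERUSER value for that node, adjusting down if the node still swaps.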

Trying to find a better solution, I found that one can set the following
in Torque (supposing you use Torque):

qmgr -c "set queue XXX resources_min.mem=2gb"

And this would (theoretically) assign only nodes that have at least 2GB
of free memory to waiting jobs on the XXX queue. I say "theoretically"
because I have not had luck with this setting. As I said, our grid jobs
balloon, so our nodes get one job per slot: initially (for the first few
hours) the jobs are only downloading data, so there is always 2GB free. But
when the memory footprints balloon, we start swapping heavily.

I guess that you might have more luck with that if your jobs' memory
footprint is more constant. And if some guru could teach us how to "reserve"
a given amount of memory per job, that would suit me perfectly.
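
One untested idea in that direction, as a config sketch rather than a known fix: have each job declare its peak memory up front, so the scheduler can count that memory as dedicated from the start instead of looking at what happens to be free at the moment. The 2gb value and the script name grid_job.sh are placeholders, and I am assuming Maui honors the request through the DEDICATED availability policy; please check this against your Torque and Maui versions.

```shell
# Give every job in queue XXX a default memory request (placeholder value).
qmgr -c "set queue XXX resources_default.mem=2gb"

# Or request it per job at submission time (grid_job.sh is a placeholder name).
qsub -q XXX -l mem=2gb grid_job.sh

# In maui.cfg, count memory by what jobs requested rather than what they
# currently use (assumption: this is the policy that matches the request):
#   NODEAVAILABILITYPOLICY DEDICATED:MEM
```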


Renato Callado Borges
Lab Specialist - DFN/IF/USP
Email: rborges at dfn.ifusp.br
Phone: +55 11 3091 7105