[torqueusers] node memory limiting

rishi pathak mailmaverick666 at gmail.com
Fri Oct 30 02:07:33 MDT 2009


Do it in /etc/security/limits.conf for higher values. Somehow they do not
work.
One more thing we have noticed is that pbs_mom does not inherit limits when
it starts at system boot. A restart of pbs_mom does the job.
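
One way to confirm which limits a job actually inherited from pbs_mom is to
print them from inside the job, once before and once after restarting
pbs_mom. The following is a minimal sketch, assuming Python 3 is available on
the compute node. The names LIMITS, fmt and current_limits are illustrative
only and are not part of Torque. Note that getrlimit reports the data and
resident-set limits in bytes, whereas the shell's ulimit -d and ulimit -m
report them in kilobytes.

```python
import resource

# Limits mentioned in this thread, keyed by the matching ulimit flag.
LIMITS = {
    "data (ulimit -d)": resource.RLIMIT_DATA,
    "rss (ulimit -m)": resource.RLIMIT_RSS,
    "nofile (ulimit -n)": resource.RLIMIT_NOFILE,
}


def fmt(value):
    """Render a raw limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {name: (soft, hard)} for the current process, as strings."""
    return {
        name: tuple(fmt(v) for v in resource.getrlimit(res))
        for name, res in LIMITS.items()
    }


if __name__ == "__main__":
    # Run this inside the job (for example an interactive job) so that it
    # reports the limits inherited from pbs_mom, not those of an ssh login.
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

If the values printed inside the job differ from those seen in an ssh login
on the same node, the job is most likely inheriting its limits from the
pbs_mom process rather than from the login configuration.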

On Thu, Oct 29, 2009 at 10:51 PM, Tony Schreiner <schreian at bc.edu> wrote:

> Somebody else also offered this off-list.  I have tried that but
> without success so far.
>
> The existing script has a  ulimit -n 32768 which does seem to be
> overriding the default value, but when I put either ulimit -d 63000000
> or ulimit -m 63000000, neither one of those values seems to be in
> effect in the pbs session when I connect, both remain unlimited.
>
> On Oct 29, 2009, at 12:32 PM, Jerry Smith wrote:
>
> > Tony,
> >
> > Put the ulimit command in your init script for the mom, and the mom
> > will inherit those limits.
> >
> > --Jerry
> >
> > Tony Schreiner wrote:
> >> Torque 2.1.10, cluster consists of nodes with 64 GB of RAM,
> >> running  Fedora 10.
> >>
> >> There is a job that a user is running recently, that dynamically
> >> allocates increasing memory over time until all the memory on the
> >> node  is taken. I haven't talked to the developer, but I don't
> >> think it's a  bug (at least inadvertently).  But anyway, at that
> >> point the node  becomes totally unresponsive to Torque or to ssh.
> >>
> >> I thought I would set the max data size in /etc/security/limits.conf
> >> to 64000000 kb or just below the physical size.
> >>
> >> This is effective for ssh logins, but torque connections don't seem
> >> to  honor it. If I do an interactive job on the node and run ulimit
> >> -d it  shows "unlimited". I've rebooted for good measure.
> >>
> >> Do I have other options here?
> >>
> >> Thanks
> >> Tony Schreiner
> >> Boston College
> >> _______________________________________________
> >> torqueusers mailing list
> >> torqueusers at supercluster.org
> >> http://www.supercluster.org/mailman/listinfo/torqueusers
> >>
> >>
> >>
> >
>



-- 
Regards--
Rishi Pathak
National PARAM Supercomputing Facility
Center for Development of Advanced Computing(C-DAC)
Pune University Campus,Ganesh Khind Road
Pune-Maharastra