[torqueusers] Is resources_used.mem reliable? + make -lpvmem working to set virtual memory limit in ulimit

David Jackson jacksond at clusterresources.com
Thu Apr 21 11:42:14 MDT 2005


Ake,

  This patch has been added to TORQUE 1.2.0p3.

Dave

On Tue, 2005-04-12 at 18:51 +0200, Ake wrote:
> On Tue, Apr 12, 2005 at 06:41:02PM +0200, etienne gondet wrote:
> > 
> >    Dear torque folks
> > 
> >    I would also prefer to use -lpvmem to limit a job's maximum memory
> > usage, including data and stack.
> > I noticed that limiting virtual memory with ulimit works: with
> > ulimit -Sv 1000000
> > an a.out asking for more than 1 GB of stack+size+heap is killed.
> > 
> >    But with torque 1.2.0.p1, specifying -lpvmem=1024mb limits stack and
> > data to 1 GB each,
> > so a Fortran code may use 1 GB of local or automatic variables + 1 GB
> > of BSS+data+heap,
> > which is not what we hoped.
> > 
> >    I looked at the Red Hat and Linux FAQs, and they advise replacing
> > RLIMIT_DATA by RLIMIT_AS
> > to limit virtual memory.
> > 
> >        So I modified src/resmom/linux/mom_mach.c as follows, starting
> > at line 1214:
> > if (set_mode == SET_LIMIT_SET)
> >    {
> >    /* if either of vmem or pvmem was given, set sys limit to lesser */
> > 
> >    if (mem_limit != 0)
> >      {
> >      reslim.rlim_cur = reslim.rlim_max = mem_limit;
> > 
> > /* Replace _DATA by _AS to modify virtual memory in ulimit -Sa/Ha
> >      if (setrlimit(RLIMIT_DATA,&reslim) < 0)
> > ETG 11/04/2005 */
> >      if (setrlimit(RLIMIT_AS,&reslim) < 0)
> >        {
> >        return(error("RLIMIT_AS",PBSE_SYSTEM));
> >        }
> > 
> > /* To avoid at qsub messages : -bash: ulimit: stack size: cannot modify 
> > limit: Invalid argument
> >      if (setrlimit(RLIMIT_STACK,&reslim) < 0)
> >        {
> >        return(error("RLIMIT_STACK",PBSE_SYSTEM));
> >        }
> > */
> >      }
> 
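
The effect of the RLIMIT_AS change quoted above can be tried outside
TORQUE. The following is only a minimal sketch for Linux, not code from
mom_mach.c: the helper name try_map_under_as_limit and its return codes
are made up for illustration. It forks a child, applies RLIMIT_AS to the
child the same way the modified code does, and reports whether one
anonymous mapping of the requested size is still allowed.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of TORQUE).
 * Returns 0 if the mapping succeeded under the limit,
 *         1 if the kernel refused the mapping,
 *         2 if setrlimit() itself failed,
 *        -1 if fork()/waitpid() failed.
 */
int try_map_under_as_limit(size_t limit_bytes, size_t request_bytes)
{
  struct rlimit reslim;
  pid_t pid;
  int status;

  assert(request_bytes > 0);

  pid = fork();
  if (pid < 0)
    return -1;

  if (pid == 0)
    {
    /* child: cap the whole address space, as the modified mom code does */
    void *p;

    reslim.rlim_cur = reslim.rlim_max = (rlim_t)limit_bytes;
    if (setrlimit(RLIMIT_AS, &reslim) < 0)
      _exit(2);

    p = mmap(NULL, request_bytes, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    _exit(p == MAP_FAILED ? 1 : 0);
    }

  /* parent: the limit was applied only in the child */
  if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
    return -1;

  return WEXITSTATUS(status);
}
```

With a 256 MB limit, a 512 MB request should be refused while a 16 MB
request should still succeed, because RLIMIT_AS counts every mapping in
the process rather than the data segment alone.
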
> This is just part of what needs to be done.
> 
> Since I'm currently buried in other work, I'm attaching the two patches I
> was talking about in my last mail.  If anyone feels up to it, please
> read them carefully and test them.
> 
> The resource limiting patch makes sure that the vmem limit is never smaller
> than the mem limit, but it uses (like before) the smaller of pvmem/vmem.
> I sent an earlier (and broken) version of this patch to the list some
> time ago and I think Greg tested it. Please test this one instead, since
> as far as I can see it is working correctly.
> 
> The vmem reporting patch needs to be applied to ALL mom nodes in a
> cluster at the same time, since it affects communication between sisters
> and Mother Superior; it doesn't affect the server.
> 


