[torquedev] Patch for Torque MOM to set the locked memory limit

Brock Palen brockp at umich.edu
Fri Aug 17 10:29:53 MDT 2007


You don't need this.  (Credit for the following goes to Garrick.)
All child processes of pbs_mom inherit the limits that pbs_mom was
started under.
We use openmpi+ofed+torque+tm all the time. All we did was add this
line to /etc/init.d/pbs_mom:

ulimit -l 1048576

Put it before the line that starts pbs_mom.  pbs_mom will then start
with this limit, and any process created by the mom (an Open MPI job)
will also have a locked memory limit of 1 GB (ulimit -l counts in
kilobytes).
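
The inheritance described above can be checked with a small C program.
This is only a sketch, not Torque code: the function name
child_inherits_memlock and the 16 KB test value are made up for
illustration.  It lowers the soft RLIMIT_MEMLOCK limit in the parent
(which needs no privilege), forks, and has the child report whether it
sees the same value.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Torque.
 * Returns 1 if a forked child sees the parent's soft RLIMIT_MEMLOCK,
 * 0 if it does not, and -1 if a system call fails.
 */
int child_inherits_memlock(void)
{
  struct rlimit lim;
  rlim_t want = 16 * 1024;   /* bytes; arbitrary illustrative value */
  pid_t pid;
  int status;

  if (getrlimit(RLIMIT_MEMLOCK, &lim) < 0)
    return -1;

  /* The soft limit may never exceed the hard limit. */
  if (want > lim.rlim_max)
    want = lim.rlim_max;

  lim.rlim_cur = want;
  if (setrlimit(RLIMIT_MEMLOCK, &lim) < 0)
    return -1;

  pid = fork();
  if (pid < 0)
    return -1;

  if (pid == 0)
    {
    /* Child: read back the limit it was created with. */
    struct rlimit seen;

    if (getrlimit(RLIMIT_MEMLOCK, &seen) < 0)
      _exit(2);
    _exit(seen.rlim_cur == want ? 0 : 1);
    }

  if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
    return -1;

  if (WEXITSTATUS(status) == 0)
    return 1;
  return (WEXITSTATUS(status) == 1) ? 0 : -1;
}
```

Resource limits are also preserved across execve(), which is why a
limit set in the init script before pbs_mom starts reaches the job
processes.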

It's a pain, but it's simple.

Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985


On Aug 17, 2007, at 12:03 PM, Eygene Ryabinkin wrote:

> Good day.
>
> After some days spent with OpenMPI and Torque I came up with the
> patch that enables Torque MOM to set the locked memory limits for
> the tasks it is spawning.  The rationale behind this is that OpenMPI
> needs more locked memory than the default 32K (RHEL 4.4).  Setting
> it via /etc/security/limits.conf doesn't help when one is using an
> mpiexec-like tool that uses the Torque/PBS API to spawn jobs, because
> the PAM subsystem isn't used in this case.
>
> Sure, there is the option to set the value via the init.d script
> and jobs spawned by MOM will inherit the values.  But we could later
> do it on a per-queue basis, so the programmatic way is a bit
> better.
>
> The patch follows; it was made and tested for Torque 2.1.8.
> Documentation wasn't touched yet.
>
> -----
> From 437073e7a4972c616f6c55ec9855bbd03e4da59e Mon Sep 17 00:00:00 2001
> From: Eygene Ryabinkin <rea+maui at grid.kiae.ru>
> Date: Fri, 17 Aug 2007 17:30:08 +0400
> Subject: [PATCH] Customize locked memory size limit for Torque MOM.
>
> Some software, most notably OpenMPI, demands a very high setting for
> the locked memory limit in Linux.  The usual value of 32K is not
> sufficient even for OpenMPI's MPI initialization on EM64T.
>
> A new argument named 'rlimit_memlock' was introduced for the pbs_mom
> configuration file.  It specifies the amount of memory in
> kilobytes that will be available to non-root entities in
> mlock()/mlockall() operations.  A negative value requests an infinite
> limit to be set.
>
> It would be nice to have this limit set on a per-queue
> basis, but it requires more coding and I am not going to do it now.
> But this is a good TODO entry: add per-queue limits that override
> the MOM's defaults (if any).
>
> Signed-off-by: Eygene Ryabinkin <rea+maui at grid.kiae.ru>
> ---
>  src/resmom/linux/mom_mach.c |   39 +++++++++++++++++++++++++++++++++++++++
>  src/resmom/mom_main.c       |   24 ++++++++++++++++++++++++
>  2 files changed, 63 insertions(+), 0 deletions(-)
>
> diff --git a/src/resmom/linux/mom_mach.c b/src/resmom/linux/mom_mach.c
> index 1e04bac..fc015db 100644
> --- a/src/resmom/linux/mom_mach.c
> +++ b/src/resmom/linux/mom_mach.c
> @@ -176,6 +176,9 @@ extern  int     LOGLEVEL;
>  extern  char    CHECKPOINT_SCRIPT[1024];
>  extern  char    PBSNodeMsgBuf[1024];
>
> +extern  long	memlock_limit;
> +extern  int	memlock_flag;
> +
>  /*
>  ** external functions and data
>  */
> @@ -1092,6 +1095,42 @@ int mom_set_limits(
>    pres = (resource *)GET_NEXT(pjob->ji_wattr[(int)JOB_ATR_resource].at_val.at_list);
>
>    /*
> +   * Set the size of the locked memory pages if it was requested.
> +   */
> +  if (set_mode == SET_LIMIT_SET && memlock_flag != 0)
> +    {
> +    if (memlock_limit < 0)
> +      {
> +      reslim.rlim_cur = RLIM_INFINITY;
> +      reslim.rlim_max = RLIM_INFINITY;
> +      }
> +      else
> +      {
> +      reslim.rlim_cur = memlock_limit;
> +      reslim.rlim_max = memlock_limit;
> +      }
> +
> +    if (setrlimit(RLIMIT_MEMLOCK,&reslim) < 0)
> +      {
> +      sprintf(log_buffer,"cannot set locked memory size limit to %ld for job %s (setrlimit failed - check default user limits)",
> +        (long)(reslim.rlim_max),
> +        pjob->ji_qs.ji_jobid);
> +
> +      log_err(errno,id,log_buffer);
> +
> +      log_buffer[0] = '\0';
> +
> +      return(error(pname,PBSE_SYSTEM));
> +      }
> +      else
> +      {
> +      sprintf(log_buffer,"set locked memory size to %ld", reslim.rlim_max);
> +      log_record(PBSEVENT_SYSTEM,0,id,log_buffer);
> +      log_buffer[0] = '\0';
> +      }
> +    }
> +
> +  /*
>     * cycle through all the resource specifications,
>     * setting limits appropriately.
>     */
> diff --git a/src/resmom/mom_main.c b/src/resmom/mom_main.c
> index 23da98f..a909bcb 100644
> --- a/src/resmom/mom_main.c
> +++ b/src/resmom/mom_main.c
> @@ -224,6 +224,9 @@ extern time_t   pbs_tcp_timeout;
>
>  char            tmpdir_basename[MAXPATHLEN];  /* for $TMPDIR */
>
> +long		memlock_limit = 0;
> +int		memlock_flag = 0;
> +
>  char            rcp_path[MAXPATHLEN];
>  char            rcp_args[MAXPATHLEN];
>  char            xauth_path[MAXPATHLEN];
> @@ -315,6 +318,7 @@ static unsigned long setcheckpolltime(char *);
>  static unsigned long settmpdir(char *);
>  static unsigned long setlogfilemaxsize(char *);
>  static unsigned long setlogfilerolldepth(char *);
> +static unsigned long setmemlocklimit(char *);
>
>  static struct specials {
>    char            *name;
> @@ -349,6 +353,7 @@ static struct specials {
>      { "tmpdir",       settmpdir },
>      { "log_file_max_size", setlogfilemaxsize},
>      { "log_file_roll_depth", setlogfilerolldepth},
> +    { "rlimit_memlock", setmemlocklimit},
>      { NULL,           NULL } };
>
>
> @@ -2354,6 +2359,25 @@ static unsigned long setlogfilerolldepth(
>     return 1;
>     }
>
> +static unsigned long setmemlocklimit(
> +
> +  char *value)  /* I */
> +
> +  {
> +  log_record(PBSEVENT_SYSTEM,PBS_EVENTCLASS_SERVER,"rlimit_memlock",value);
> +
> +  memlock_limit = strtol(value, NULL, 10);
> +  if (errno != 0)
> +    {
> +    memlock_limit = 0;
> +    return(0);	/* error */
> +    }
> +
> +  memlock_flag = 1;
> +  return(1);
> +  }  /* END setmemlocklimit() */
> +
> +
>
>  void check_log()
>     {
> -- 
> 1.5.2.1
> -----
> -- 
> Eygene Ryabinkin, RRC KI
> _______________________________________________
> torquedev mailing list
> torquedev at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torquedev
>
>
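
Two details of the quoted patch may be worth a second look.  As
posted, setmemlocklimit() tests errno after strtol() without clearing
it first, and strtol() only sets errno on failure, so a stale value
from an earlier call could be read as a parse error.  Also, the commit
message describes the value as kilobytes, while
setrlimit(RLIMIT_MEMLOCK) counts in bytes and the patch appears to
pass the number through unscaled.  The following is a minimal sketch
of one way to handle both, assuming the kilobyte reading is the
intended one.  The name parse_memlock_kb is hypothetical and not part
of Torque.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Hypothetical helper, not part of Torque.
 * Parses a decimal kilobyte count and fills *out with the matching
 * RLIMIT_MEMLOCK value in bytes.  A negative number requests
 * RLIM_INFINITY, as in the quoted commit message.
 * Returns 0 on success, -1 on a malformed or out-of-range value.
 */
int parse_memlock_kb(const char *value, struct rlimit *out)
{
  char *end = NULL;
  long kb;

  if (value == NULL || out == NULL)
    return -1;

  errno = 0;                    /* strtol() leaves errno alone on success */
  kb = strtol(value, &end, 10);

  /* Reject range errors, empty input, and trailing characters. */
  if (errno != 0 || end == value || *end != '\0')
    return -1;

  if (kb < 0)
    {
    out->rlim_cur = RLIM_INFINITY;
    out->rlim_max = RLIM_INFINITY;
    return 0;
    }

  /* Refuse values whose byte count would not fit in a long. */
  if (kb > LONG_MAX / 1024)
    return -1;

  out->rlim_cur = (rlim_t)kb * 1024;   /* kilobytes to bytes */
  out->rlim_max = out->rlim_cur;
  return 0;
}
```

With this reading, a configuration value of 1048576 yields the same
1 GB limit as "ulimit -l 1048576" in the init script.  The sketch
rejects trailing whitespace, so a caller would need to trim the value
first if the configuration parser does not.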


