[torquedev] Patch for Torque MOM to set the locked memory limit

Eygene Ryabinkin rea+maui at grid.kiae.ru
Fri Aug 17 10:03:42 MDT 2007


Good day.

After some days spent with OpenMPI and Torque I came up with a
patch that enables Torque MOM to set the locked memory limit for
the tasks it spawns.  The rationale is that OpenMPI needs more
locked memory than the default 32K (RHEL 4.4).  Setting the limit
via /etc/security/limits.conf doesn't help when one uses an
mpiexec-like tool that spawns jobs through the Torque/PBS API,
because the PAM subsystem isn't involved in that case.

Sure, there is the option to set the value in the init.d script,
so that jobs spawned by MOM inherit it.  But a limit set by MOM
itself can later be extended to a per-queue basis, so the
programmatic way is a bit better.

The patch follows; it was made and tested against Torque 2.1.8.
The documentation hasn't been touched yet.

-----
From 437073e7a4972c616f6c55ec9855bbd03e4da59e Mon Sep 17 00:00:00 2001
From: Eygene Ryabinkin <rea+maui at grid.kiae.ru>
Date: Fri, 17 Aug 2007 17:30:08 +0400
Subject: [PATCH] Customize locked memory size limit for Torque MOM.

Some software, most notably OpenMPI, demands a very high setting for
the locked memory limit on Linux.  The usual value of 32K is not
sufficient even for OpenMPI's MPI initialization on EM64T.

A new argument named 'rlimit_memlock' is introduced for the pbs_mom
configuration file.  It specifies the amount of memory in bytes (the
value is passed to setrlimit() unscaled) that will be available to
the non-root entities in the
mlock()/mlockall() operations.  A negative value requests an infinite
limit.

It would be nice to be able to set this limit on a per-queue
basis, but that requires more coding and I am not going to do it now.
Still, it is a good TODO entry: add per-queue limits that override
the MOM's defaults (if any).

Signed-off-by: Eygene Ryabinkin <rea+maui at grid.kiae.ru>
---
 src/resmom/linux/mom_mach.c |   39 +++++++++++++++++++++++++++++++++++++++
 src/resmom/mom_main.c       |   24 ++++++++++++++++++++++++
 2 files changed, 63 insertions(+), 0 deletions(-)

diff --git a/src/resmom/linux/mom_mach.c b/src/resmom/linux/mom_mach.c
index 1e04bac..fc015db 100644
--- a/src/resmom/linux/mom_mach.c
+++ b/src/resmom/linux/mom_mach.c
@@ -176,6 +176,9 @@ extern  int     LOGLEVEL;
 extern  char    CHECKPOINT_SCRIPT[1024];
 extern  char    PBSNodeMsgBuf[1024];
 
+extern  long	memlock_limit;
+extern  int	memlock_flag;
+
 /*
 ** external functions and data
 */
@@ -1092,6 +1095,42 @@ int mom_set_limits(
   pres = (resource *)GET_NEXT(pjob->ji_wattr[(int)JOB_ATR_resource].at_val.at_list);
 
   /*
+   * Set the size of the locked memory pages if it was requested.
+   */
+  if (set_mode == SET_LIMIT_SET && memlock_flag != 0)
+    {
+    if (memlock_limit < 0)
+      {
+      reslim.rlim_cur = RLIM_INFINITY;
+      reslim.rlim_max = RLIM_INFINITY;
+      }
+    else
+      {
+      reslim.rlim_cur = memlock_limit;
+      reslim.rlim_max = memlock_limit;
+      }
+
+    if (setrlimit(RLIMIT_MEMLOCK,&reslim) < 0)
+      {
+      sprintf(log_buffer,"cannot set locked memory size limit to %ld for job %s (setrlimit failed - check default user limits)",
+        (long)(reslim.rlim_max),
+        pjob->ji_qs.ji_jobid);
+
+      log_err(errno,id,log_buffer);
+
+      log_buffer[0] = '\0';
+
+      return(error(pname,PBSE_SYSTEM));
+      }
+    else
+      {
+      sprintf(log_buffer,"set locked memory size to %ld", (long)(reslim.rlim_max));
+      log_record(PBSEVENT_SYSTEM,0,id,log_buffer);
+      log_buffer[0] = '\0';
+      }
+    }
+
+  /*
    * cycle through all the resource specifications,
    * setting limits appropriately.
    */
diff --git a/src/resmom/mom_main.c b/src/resmom/mom_main.c
index 23da98f..a909bcb 100644
--- a/src/resmom/mom_main.c
+++ b/src/resmom/mom_main.c
@@ -224,6 +224,9 @@ extern time_t   pbs_tcp_timeout;
 
 char            tmpdir_basename[MAXPATHLEN];  /* for $TMPDIR */
 
+long		memlock_limit = 0;
+int		memlock_flag = 0;
+
 char            rcp_path[MAXPATHLEN];
 char            rcp_args[MAXPATHLEN];
 char            xauth_path[MAXPATHLEN];
@@ -315,6 +318,7 @@ static unsigned long setcheckpolltime(char *);
 static unsigned long settmpdir(char *);
 static unsigned long setlogfilemaxsize(char *);
 static unsigned long setlogfilerolldepth(char *);
+static unsigned long setmemlocklimit(char *);
 
 static struct specials {
   char            *name;
@@ -349,6 +353,7 @@ static struct specials {
     { "tmpdir",       settmpdir },
     { "log_file_max_size", setlogfilemaxsize},
     { "log_file_roll_depth", setlogfilerolldepth},
+    { "rlimit_memlock", setmemlocklimit},
     { NULL,           NULL } };
 
 
@@ -2354,6 +2359,25 @@ static unsigned long setlogfilerolldepth(
    return 1;
    }
 
+static unsigned long setmemlocklimit(
+
+  char *value)  /* I */
+
+  {
+  log_record(PBSEVENT_SYSTEM,PBS_EVENTCLASS_SERVER,"rlimit_memlock",value);
+  errno = 0;
+  memlock_limit = strtol(value, NULL, 10);
+  if (errno != 0)
+    {
+    memlock_limit = 0;
+    return(0);	/* error */
+    }
+  
+  memlock_flag = 1;
+  return(1);
+  }  /* END setmemlocklimit() */
+
+
 
 void check_log()
    {
-- 
1.5.2.1
-----
-- 
Eygene Ryabinkin, RRC KI

