[torquedev] Patch for Torque MOM to set the locked memory limit
Brock Palen
brockp at umich.edu
Fri Aug 17 10:29:53 MDT 2007
You don't need this (credit to Garrick for the following).
All child processes of pbs_mom inherit the limits that pbs_mom was
started with.
We use openmpi+ofed+torque+tm all the time. All we do is add this line
to /etc/init.d/pbs_mom, before the point where pbs_mom is started:
ulimit -l 1048576
pbs_mom then starts with these limits,
and thus any process created by the mom (an openmpi job) will also
have a locked memory limit of 1GB.
It's a pain, but it's simple.
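To make the inheritance point concrete, here is a minimal sketch in C. It
assumes a POSIX system; memlock_inherited() is a hypothetical helper written
for illustration and is not part of Torque. It forks a child and checks that
the child sees the same RLIMIT_MEMLOCK as its parent, which is the mechanism
the init-script approach relies on.

```c
#include <assert.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Torque): fork a child and check whether
 * it sees the same RLIMIT_MEMLOCK as the parent.
 * Stores the parent's limit in *parent_out.
 * Returns 1 if the child's limits match, 0 if they differ, -1 on error.
 */
int memlock_inherited(struct rlimit *parent_out)
{
    struct rlimit child;
    pid_t pid;
    int status;

    assert(parent_out != NULL);

    if (getrlimit(RLIMIT_MEMLOCK, parent_out) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: read our own limit and report the comparison via exit code. */
        if (getrlimit(RLIMIT_MEMLOCK, &child) != 0)
            _exit(2);
        _exit((child.rlim_cur == parent_out->rlim_cur &&
               child.rlim_max == parent_out->rlim_max) ? 0 : 1);
    }

    /* Parent: translate the child's exit code into the return value. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    if (WEXITSTATUS(status) == 0)
        return 1;
    if (WEXITSTATUS(status) == 1)
        return 0;
    return -1;
}
```

Called from a program launched by a shell that has already run
"ulimit -l 1048576", one would expect it to return 1 and to report a
parent soft limit of 1048576 * 1024 bytes, since the shell's ulimit -l
counts in kilobytes while getrlimit() reports bytes.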
Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985
On Aug 17, 2007, at 12:03 PM, Eygene Ryabinkin wrote:
> Good day.
>
> After some days spent with OpenMPI and Torque I came up with a
> patch that enables Torque MOM to set the locked memory limit for
> the tasks it spawns. The rationale is that OpenMPI
> needs more locked memory than the default 32K (RHEL 4.4). Setting
> it via /etc/security/limits.conf doesn't help when one is using
> an mpiexec-like tool that uses the Torque/PBS API to spawn jobs,
> because the PAM subsystem isn't used in this case.
>
> Sure, there is the option to set the value via the init.d script,
> and jobs spawned by MOM will inherit it. But with a patch we can
> later set it on a per-queue basis, so the programmatic way is a bit
> better.
>
> The patch follows, it was made and tested for Torque 2.1.8.
> Documentation wasn't touched yet.
>
> -----
> From 437073e7a4972c616f6c55ec9855bbd03e4da59e Mon Sep 17 00:00:00 2001
> From: Eygene Ryabinkin <rea+maui at grid.kiae.ru>
> Date: Fri, 17 Aug 2007 17:30:08 +0400
> Subject: [PATCH] Customize locked memory size limit for Torque MOM.
>
> Some software, most notably OpenMPI, demands a very high setting for
> the locked memory limit in Linux. The usual value of 32K is not
> sufficient even for OpenMPI's MPI initialization on EM64T.
>
> A new argument named 'rlimit_memlock' was introduced for the pbs_mom
> configuration file. It specifies the amount of memory in
> kilobytes that will be available to non-root entities in
> mlock()/mlockall() operations. A negative value requests an infinite
> limit.
>
> It would be nice to have this limit set on a per-queue
> basis, but that requires more coding and I am not going to do it now.
> It is a good TODO entry, though: add per-queue limits that override
> the MOM's defaults (if any).
>
> Signed-off-by: Eygene Ryabinkin <rea+maui at grid.kiae.ru>
> ---
> src/resmom/linux/mom_mach.c | 39 +++++++++++++++++++++++++++++++++++++++
> src/resmom/mom_main.c | 24 ++++++++++++++++++++++++
> 2 files changed, 63 insertions(+), 0 deletions(-)
>
> diff --git a/src/resmom/linux/mom_mach.c b/src/resmom/linux/mom_mach.c
> index 1e04bac..fc015db 100644
> --- a/src/resmom/linux/mom_mach.c
> +++ b/src/resmom/linux/mom_mach.c
> @@ -176,6 +176,9 @@ extern int LOGLEVEL;
> extern char CHECKPOINT_SCRIPT[1024];
> extern char PBSNodeMsgBuf[1024];
>
> +extern long memlock_limit;
> +extern int memlock_flag;
> +
> /*
> ** external functions and data
> */
> @@ -1092,6 +1095,42 @@ int mom_set_limits(
> pres = (resource *)GET_NEXT(pjob->ji_wattr[(int)JOB_ATR_resource].at_val.at_list);
>
> /*
> + * Set the size of the locked memory pages if it was requested.
> + */
> + if (set_mode == SET_LIMIT_SET && memlock_flag != 0)
> + {
> + if (memlock_limit < 0)
> + {
> + reslim.rlim_cur = RLIM_INFINITY;
> + reslim.rlim_max = RLIM_INFINITY;
> + }
> + else
> + {
> + reslim.rlim_cur = memlock_limit;
> + reslim.rlim_max = memlock_limit;
> + }
> +
> + if (setrlimit(RLIMIT_MEMLOCK,&reslim) < 0)
> + {
> + sprintf(log_buffer,"cannot set locked memory size limit to %ld for job %s (setrlimit failed - check default user limits)",
> + (long)(reslim.rlim_max),
> + pjob->ji_qs.ji_jobid);
> +
> + log_err(errno,id,log_buffer);
> +
> + log_buffer[0] = '\0';
> +
> + return(error(pname,PBSE_SYSTEM));
> + }
> + else
> + {
> + sprintf(log_buffer,"set locked memory size to %ld", reslim.rlim_max);
> + log_record(PBSEVENT_SYSTEM,0,id,log_buffer);
> + log_buffer[0] = '\0';
> + }
> + }
> +
> + /*
> * cycle through all the resource specifications,
> * setting limits appropriately.
> */
> diff --git a/src/resmom/mom_main.c b/src/resmom/mom_main.c
> index 23da98f..a909bcb 100644
> --- a/src/resmom/mom_main.c
> +++ b/src/resmom/mom_main.c
> @@ -224,6 +224,9 @@ extern time_t pbs_tcp_timeout;
>
> char tmpdir_basename[MAXPATHLEN]; /* for $TMPDIR */
>
> +long memlock_limit = 0;
> +int memlock_flag = 0;
> +
> char rcp_path[MAXPATHLEN];
> char rcp_args[MAXPATHLEN];
> char xauth_path[MAXPATHLEN];
> @@ -315,6 +318,7 @@ static unsigned long setcheckpolltime(char *);
> static unsigned long settmpdir(char *);
> static unsigned long setlogfilemaxsize(char *);
> static unsigned long setlogfilerolldepth(char *);
> +static unsigned long setmemlocklimit(char *);
>
> static struct specials {
> char *name;
> @@ -349,6 +353,7 @@ static struct specials {
> { "tmpdir", settmpdir },
> { "log_file_max_size", setlogfilemaxsize},
> { "log_file_roll_depth", setlogfilerolldepth},
> + { "rlimit_memlock", setmemlocklimit},
> { NULL, NULL } };
>
>
> @@ -2354,6 +2359,25 @@ static unsigned long setlogfilerolldepth(
> return 1;
> }
>
> +static unsigned long setmemlocklimit(
> +
> + char *value) /* I */
> +
> + {
> + log_record(PBSEVENT_SYSTEM,PBS_EVENTCLASS_SERVER,"rlimit_memlock",value);
> +
> + memlock_limit = strtol(value, NULL, 10);
> + if (errno != 0)
> + {
> + memlock_limit = 0;
> + return(0); /* error */
> + }
> +
> + memlock_flag = 1;
> + return(1);
> + } /* END setmemlocklimit() */
> +
> +
>
> void check_log()
> {
> --
> 1.5.2.1
> -----
> --
> Eygene Ryabinkin, RRC KI
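Two details of the quoted patch are worth a sketch. The description gives
rlimit_memlock in kilobytes, while setrlimit(2) documents RLIMIT_MEMLOCK in
bytes, and strtol() leaves errno untouched on success, so errno has to be
cleared before the call for an "errno != 0" check to be meaningful. The
following is a minimal sketch of a parser that handles both points.
parse_memlock_kb() is a hypothetical helper for illustration, not part of
Torque or of the patch, and it assumes the config value has no trailing
whitespace.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (not part of Torque): parse a decimal kilobyte count
 * and convert it to the byte value that setrlimit(RLIMIT_MEMLOCK) expects.
 * A negative value maps to RLIM_INFINITY, following the patch description.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
int parse_memlock_kb(const char *value, rlim_t *out_bytes)
{
    char *end = NULL;
    long kb;

    assert(out_bytes != NULL);

    if (value == NULL)
        return -1;

    errno = 0;                       /* strtol() only sets errno on failure */
    kb = strtol(value, &end, 10);

    /* Reject range errors, empty input, and trailing garbage. */
    if (errno != 0 || end == value || *end != '\0')
        return -1;

    if (kb < 0) {
        *out_bytes = RLIM_INFINITY;
        return 0;
    }

    /* Guard the kilobytes-to-bytes multiplication against overflow. */
    if (kb > LONG_MAX / 1024)
        return -1;

    *out_bytes = (rlim_t)kb * 1024;
    return 0;
}
```

With this sketch, the string "1048576" yields 1073741824 bytes (1GB), the
same limit that "ulimit -l 1048576" sets from the init script.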