[torqueusers] NCPUS environment variable?

Martin Siegert siegert at sfu.ca
Sat Jul 16 16:03:54 MDT 2005


On Tue, Jul 12, 2005 at 02:22:11PM -0400, Glen Beane wrote:
> I think torque should be extended to provide $PBS_NCPUS to the pbs  
> script,  but there are a few ways around your problem.
> 
> New versions of torque pass additional information to the prologue.   
> Included in this new information is the resource requested string.   
> This includes walltime, and the node request line.  I have had  
> success parsing the node request string in my prologue to calculate  
> the exact number of CPUs requested  (for example, my prologue will  
> deal with more complicated strings like nodes=16:ppn=2 
> +8:ppn=1:some_resource+node128:ppn=2).  It would be fairly easy to  
> look for ncpus= as well in the resource requested string.
> 
> another option is to have your prologue run a qstat -f on your job.  
> You can probably parse through the output and figure out the number  
> of CPUs as well.

Using "qstat -f" to find the number of CPUs is definitely easy.
However, when I suggested this, one of my colleagues responded that
the PBS server already receives about 200000 queries per day and that
running qstat in the prolog or submission scripts would double that.
That basically killed the idea.

Parsing the node request string in the prolog would definitely work,
but with strings as complicated as the ones you indicated I was simply
too lazy to implement it, particularly since pbs_mom already does that
work in its start_exec routine.

Thus, I made an attempt to extend torque to provide a PBS_NCPUS
variable. The situation is actually (at least for my taste) quite
confusing [disclaimer: I started using torque only recently and have
no experience with other batch queueing systems, thus I do not know
anything about the history, etc. of these variables]:

#PBS -l ncpus=<x>
a) sets PBS_NODENUM to 0
b) sets PBS_TASKNUM to 1
c) prints "numnodes=1 numvnod=1" to the logfile (for LOGLEVEL >= 2)
d) qstat shows -- under the NDS and <x> under the TSK columns
e) creates a one line PBS_NODEFILE containing a single host

#PBS -l nodes=<y>:ppn=<z>
a) sets PBS_NODENUM to 0
b) sets PBS_TASKNUM to 1
c) prints "numnodes=<y> numvnod=N", N = <y>*<z>, to the logfile (LOGLEVEL >= 2)
d) qstat shows <y> under the NDS and -- under the TSK columns
e) creates PBS_NODEFILE containing N lines.

What is the reasoning behind the values of PBS_NODENUM and PBS_TASKNUM?
They do not seem to contain any useful information. Furthermore, their
values do not match the numbers reported by qstat.
[another disclaimer: all my tests were done using moab as the scheduler,
but I doubt that the results actually depend on the scheduler].

Since I do not understand the meaning of PBS_NODENUM and PBS_TASKNUM,
I chose not to touch them and instead introduced a new variable,
PBS_NCPUS; the patch is attached. The patch applies to torque-1.2.0p4.
In my tests it provides the correct total number of CPUs in both cases
(-l ncpus=... and -l nodes=...).

Cheers,
Martin

-- 
Martin Siegert
Head, HPC at SFU
WestGrid Site Manager
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert at sfu.ca
Canada  V5A 1S6
-------------- next part --------------
--- torque-1.2.0p4/src/resmom/start_exec.c.orig	Tue Jul 12 18:14:05 2005
+++ torque-1.2.0p4/src/resmom/start_exec.c	Sat Jul 16 13:49:04 2005
@@ -173,7 +173,8 @@
   "PBS_NODENUM",
   "PBS_TASKNUM",
   "PBS_MOMPORT",
-  "PBS_NODEFILE" };
+  "PBS_NODEFILE",
+  "PBS_NCPUS" };
 
 static	char *variables_env[NUM_LCL_ENV_VAR];
 
@@ -1318,6 +1319,8 @@
 
   job                  *pjob;
   task                 *ptask;
+  resource             *resc;
+  long                 ncpus = 1;
 
   struct passwd        *pwdp;
 
@@ -1494,6 +1497,22 @@
     fclose(nhow);
     }  /* END if (pjob->ji_flags & MOM_HAS_NODEFILE) */
 
+  /* PBS_NCPUS
+     first check "ncpus" resource, then vnodes */
+
+  resc = find_resc_entry(
+              &pjob->ji_wattr[(int)JOB_ATR_resource],
+              find_resc_def(svr_resc_def,"ncpus",svr_resc_size));
+  if (resc != NULL) {
+     ncpus = resc->rs_value.at_val.at_long;
+  }
+  if (pjob->ji_numvnod > ncpus) {
+     ncpus = pjob->ji_numvnod;
+  }
+  sprintf(buf,"%ld",
+          ncpus);
+  bld_env_variables(&vtable,variables_else[12],buf);
+
 #if defined(PENABLE_CPUSETS) || defined(PENABLE_DYNAMIC_CPUSETS)
 
 #ifdef PENABLE_DYNAMIC_CPUSETS

