[torqueusers] torque does not kill jobs when wall_time or cpu_time reached

David Singleton David.Singleton at anu.edu.au
Fri Jun 4 22:49:17 MDT 2010


On 06/05/2010 08:04 AM, Garrick Staples wrote:
> On Sat, Jun 05, 2010 at 07:37:46AM +1000, David Singleton alleged:
>> and code related to procs in Torque (same as the OpenPBS code) treats it
>> as processes (i.e. it sets RLIMIT_NPROC in a few MOMs).  You cant blame
>
> A recursive grep through trunk in subversion can't find a single instance of
> RLIMIT_NPROC. Not only is it not in the current code of any branch, but at no
> point was it ever present.
>

Sorry, the moral equivalent thereof.  Have a look through the Unicos8 mom_mach.c
(still there in Torque 2.3.3) and you'll see that it's for limiting the number of
processes, not the number of processors.  From the Torque 2.3.3 resc_def_all.c:
...
     {   "procs",                 /* number of processes per job */
...

My point was simply that procs was intended to specify the number of processes,
and that nodes and procs were intended to be orthogonal resources.  With a set
of orthogonal resources (as it was in OpenPBS), using the max as the default
when no default is specified makes sense.  Non-orthogonal resources cause lots
of confusion unless there is extra code to "align" them.
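
To make the "max for default" rule concrete, here is a minimal sketch in C.
It is not Torque or OpenPBS code: the struct, the LIMIT_UNSET sentinel and
both function names are made up for illustration.  It only shows that each
resource is resolved on its own, without looking at any other resource,
which is only safe when the resources really are orthogonal.

```c
/* Hypothetical sentinel: the value was not set by the admin or the job. */
#define LIMIT_UNSET (-1L)

/* Hypothetical per-queue limits for a single resource, e.g. "procs". */
struct resource_limit {
    const char *name;
    long        default_value;  /* LIMIT_UNSET if no default is configured */
    long        max_value;      /* LIMIT_UNSET if no maximum is configured */
};

/*
 * Value applied when the job does not request this resource.
 * Falls back to the maximum when no default is configured, and
 * returns LIMIT_UNSET when neither value is configured.
 */
long effective_default(const struct resource_limit *r)
{
    if (r->default_value != LIMIT_UNSET)
        return r->default_value;
    return r->max_value;
}

/*
 * Resolve one resource for one job: an explicit request wins,
 * otherwise the effective default is used.  No other resource is
 * consulted, so nothing here "aligns" procs with nodes.
 */
long resolve_request(const struct resource_limit *r, long requested)
{
    if (requested != LIMIT_UNSET)
        return requested;
    return effective_default(r);
}
```

With a queue that sets only a max of 64 for procs, a job that asks for nothing
resolves to procs = 64 under this sketch.  If procs is then read as a
processor count, that value can easily disagree with whatever was resolved
for nodes, which is the kind of confusion I mean.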

Cheers,
David



