[torqueusers] How to get number of allocated CPUs within jobscript?

Garrick Staples garrick at usc.edu
Fri Apr 7 15:44:48 MDT 2006


On Fri, Apr 07, 2006 at 04:31:02PM -0400, Andrew J Caird alleged:
> On Thu, 6 Apr 2006, Norbert Paschedag wrote:
> 
> >There was a patch for torque floating around on this list (I think) that 
> >would add additional environment variables $NCPUS (similar to what 
> >PBSpro is doing on SMPs) and $PBS_NCPUS. Worked fine on our cluster.
> >
> >If you're interested I can send you my version of the patch.
> 
> Norbert,
> 
> I'd like a copy of that patch.
> 
> Garrick, will this be rolled into the main Torque stream at some point?

IIRC, I originally objected to that patch because it put the number of
CPUs directly in the env var, which would rule out the future feature of
dynamically sized jobs.  The actual value needs to be put in a file, with
the path to that file in the env var (like $PBS_NODEFILE), so that the
value can be modified while the job is running.
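
As a rough illustration of the file-based approach, a jobscript could
re-read the count from a file each time it needs it.  This is only a
sketch: PBS_NCPUSFILE is a hypothetical variable name used for
illustration and is not something TORQUE sets, and the one-number-per-file
layout is likewise an assumption.

```shell
# HYPOTHETICAL: PBS_NCPUSFILE is not a real TORQUE variable. It stands in
# for the proposed design where the env var holds a path, not the value.
read_cpu_count() {
    # Re-read the file on every call so a later resize would be picked up.
    # Assumes the file holds a single number on its first line.
    head -n 1 "$1" | tr -d '[:space:]'
}

# Only runs if the hypothetical variable is set and points at a readable file.
if [ -n "${PBS_NCPUSFILE:-}" ] && [ -r "$PBS_NCPUSFILE" ]; then
    echo "current CPU count: $(read_cpu_count "$PBS_NCPUSFILE")"
fi
```

Because the script reads the file rather than a value fixed in its
environment at startup, a changed count would be visible on the next read.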

An alternative solution is to ask TM (the task management API).  We could
have a simple program that reports the correct value from MS (the mother
superior MOM).

Whichever method is used, the value must be computed from the number of
vnodes in exec_host.  Having MOM parse out various things from the job
resources is wrong.
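
A minimal sketch of counting vnodes from an exec_host string, assuming
the usual form of one "host/index" entry per vnode joined by '+'.  The
function name and the sample host names are made up for illustration.

```shell
# Count the vnodes in an exec_host string such as
# "node01/0+node01/1+node02/0".
# Assumes one "host/index" entry per vnode, joined by '+'.
count_vnodes() {
    # Put each entry on its own line, then count the non-empty lines.
    printf '%s\n' "$1" | tr '+' '\n' | grep -c .
}

# Example with made-up host names: two vnodes on node01, one on node02.
count_vnodes "node01/0+node01/1+node02/0"
```

The example call prints 3, one for each entry in the string.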

Also, in any case where 'wc -l $PBS_NODEFILE' does not match the value
you want, I would consider that a bug in TORQUE or the scheduler.
Specifically, a job that has ncpus=X (X>1) while exec_host is 1 vnode
(which happens with at least Maui) is, I'd say, a bug.
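
For reference, a jobscript can pick up the count from the node file
roughly like this.  It is a sketch that assumes the node file has one
line per allocated vnode; the function and the NPROCS variable are
illustrative names, not anything TORQUE defines.

```shell
# Count allocated vnodes from the node file, assuming one line per vnode.
count_nodefile_entries() {
    # Read via stdin so wc prints only the number, not the file name;
    # strip the padding some wc implementations add.
    wc -l < "$1" | tr -d ' '
}

# Inside a job, PBS_NODEFILE is set by TORQUE; outside a job this is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    NPROCS=$(count_nodefile_entries "$PBS_NODEFILE")
    echo "allocated vnodes: $NPROCS"
fi
```

The resulting value could then be handed to whatever launches the
parallel program, for example as its process count.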

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California