[torqueusers] How to get number of allocated CPUs within jobscript?

Prakash Velayutham velayups at email.uc.edu
Fri Apr 7 15:59:14 MDT 2006


Garrick Staples wrote:
> On Fri, Apr 07, 2006 at 04:31:02PM -0400, Andrew J Caird alleged:
>   
>> On Thu, 6 Apr 2006, Norbert Paschedag wrote:
>>
>>     
>>> There was a patch for torque floating around on this list (I think) that 
>>> would add additional environment variables $NCPUS (similar to what 
>>> PBSpro is doing on SMPs) and $PBS_NCPUS. Worked fine on our cluster.
>>>
>>> If you're interested I can send you my version of the patch.
>>>       
>> Norbert,
>>
>> I'd like a copy of that patch.
>>
>> Garrick, will this be rolled into the main Torque stream at some point?
>>     
>
> IIRC, I originally objected to that patch because it put the number of
> CPUs directly in the env var, prohibiting the future feature of
> dynamically sized jobs.  The actual value needs to be put in a file, and
> the path to that file would be in the env var (like $PBS_NODEFILE) so
> that the value can be modified.
>
> An alternative solution is to ask TM.  We could have a simple prog that
> reports the correct value from MS.
>
> Whichever method is used, the value must be computed by the number of
> vnodes in exec_host.  Having MOM parse out various things from the job
> resources is wrong.
>
> Also, in the case where 'wc -l $PBS_NODEFILE' does not match the value
> you want, then I would consider that a bug in TORQUE or the scheduler.
> Specifically, when the job has ncpus=X (X>1) and exec_host is 1 vnode
> (which happens with at least maui), I'd say that is a bug.
Garrick,

What do you mean by dynamically sized jobs? Could you explain that a 
little more, please?
I think I am working on something similar, so this interests me.
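
For anyone following the thread, here is a minimal sketch of the
'wc -l $PBS_NODEFILE' approach mentioned above, for use inside a job
script. It assumes the node file contains one line per allocated vnode;
as Garrick notes, that may not hold when the job has ncpus=X and
exec_host is a single vnode. The function name count_cpus and the
fallback value of 1 are my own choices for illustration, not anything
TORQUE provides.

```shell
#!/bin/sh
# Sketch: derive the number of allocated CPUs from the node file.
# Assumption: the file has one line per allocated vnode.
count_cpus() {
    nodefile="$1"
    if [ -n "$nodefile" ] && [ -r "$nodefile" ]; then
        # Reading from stdin makes wc print only the count;
        # tr removes the padding some wc implementations add.
        wc -l < "$nodefile" | tr -d ' '
    else
        # Hypothetical fallback when no node file is available
        # (e.g. when run outside a job): assume one CPU.
        echo 1
    fi
}

NCPUS=$(count_cpus "${PBS_NODEFILE:-}")
echo "allocated CPUs: $NCPUS"
```

Because the value is read from the file each time count_cpus is called,
rather than fixed in an environment variable at job start, it would pick
up a changed node file if one were ever rewritten.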

Thanks,
Prakash

