[torqueusers] How to get number of allocated CPUs within jobscript?
Prakash Velayutham
velayups at email.uc.edu
Fri Apr 7 15:59:14 MDT 2006
Garrick Staples wrote:
> On Fri, Apr 07, 2006 at 04:31:02PM -0400, Andrew J Caird alleged:
>
>> On Thu, 6 Apr 2006, Norbert Paschedag wrote:
>>
>>
>>> There was a patch for torque floating around on this list (I think) that
>>> would add additional environment variables $NCPUS (similar to what
>>> PBSpro is doing on SMPs) and $PBS_NCPUS. Worked fine on our cluster.
>>>
>>> If you're interested I can send you my version of the patch.
>>>
>> Norbert,
>>
>> I'd like a copy of that patch.
>>
>> Garrick, will this be rolled into the main Torque stream at some point?
>>
>
> IIRC, I originally objected to that patch because it put the number of
> CPUs directly in the env var, prohibiting the future feature of
> dynamically sized jobs. The actual value needs to be put in a file, and
> the path to that file would be in the env var (like $PBS_NODEFILE) so
> that the value can be modified.
>
> An alternative solution is to ask TM. We could have a simple prog that
> reports the correct value from MS.
>
> Whichever method is used, the value must be computed by the number of
> vnodes in exec_host. Having MOM parse out various things from the job
> resources is wrong.
>
> Also, in the case where 'wc -l $PBS_NODEFILE' does not match the value
> you want, then I would consider that a bug in TORQUE or the scheduler.
> Specifically, when the job has ncpus=X (X>1) and exec_host is 1 vnode
> (which happens with at least Maui), I'd say that is a bug.
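
For reference, the two counts being compared above can be sketched in a
jobscript roughly as follows. This is only a sketch: the function names are
made up for illustration, and the exec_host format ("node/slot" entries
joined by '+') is an assumption about what 'qstat -f' reports for the job.

```shell
#!/bin/sh
# Hypothetical helpers, not part of TORQUE.

# Count the slots listed in a node file (one line per allocated vnode).
# Falls back to $PBS_NODEFILE when no argument is given.
count_nodefile_slots() {
    nodefile="${1:-$PBS_NODEFILE}"
    # Reading from stdin keeps the file name out of wc's output.
    wc -l < "$nodefile" | tr -d ' '
}

# Count the vnodes in an exec_host string, assumed to look like
# "node1/1+node1/0+node2/0" (one entry per vnode, joined by '+').
count_exec_host_vnodes() {
    # Split on '+' and count the non-empty entries.
    printf '%s\n' "$1" | tr '+' '\n' | grep -c .
}
```

If the two numbers differ for the same job, that is the mismatch described
above. Note that 'wc -l' counts newlines, so a node file whose last line is
not newline-terminated would be undercounted by one.
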
Garrick,
What do you mean when you say dynamically sized jobs? Could you explain
a little more on this please?
I ask because I think I am working on something similar.
Thanks,
Prakash