[torqueusers] Question about what does PBS_NUM_NODES and PBS_NUM_PPN means

Glen Beane glen.beane at gmail.com
Tue Dec 7 09:31:29 MST 2010


On Tue, Dec 7, 2010 at 11:13 AM, David Beer <dbeer at adaptivecomputing.com> wrote:
> Bas,
>
> $PBS_NUM_NODES is the number of nodes assigned to the job, and $PBS_NUM_PPN is the ppn assigned to the job. As you have discovered, these are currently only compatible with single request jobs (nodes= with no '+').
>
> Cheers,
>
> David

This kind of limits the usefulness of this information...   A user can
get more accurate information by parsing the nodefile, but if we
wanted to make this information easier to get why not put it in a
file?  One line per node allocated, the format could be something
like:

$PBS_NODENUM:ppn

so for a job that requested nodes=4:ppn=16 you would end up with a
file like this:

0:16
1:16
2:16
3:16


Then we just set an environment variable that points to the location of
this file.
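
For comparison, here is a minimal sketch of what a user can already do
today by parsing the nodefile. It assumes $PBS_NODEFILE holds one
hostname per line for each allocated processor slot, and it numbers the
nodes by order of first appearance, which is an assumption and may not
match $PBS_NODENUM on every setup. The helper names slots_per_node and
format_node_ppn are made up for illustration and are not part of TORQUE.

```python
import os
from collections import OrderedDict


def slots_per_node(nodefile_lines):
    """Count how often each hostname appears, keeping first-seen order."""
    counts = OrderedDict()
    for line in nodefile_lines:
        host = line.strip()
        if host:  # skip blank lines
            counts[host] = counts.get(host, 0) + 1
    return counts


def format_node_ppn(counts):
    """Render one 'index:ppn' line per node, as in the format proposed above.

    The index is the node's position in first-seen order (an assumption).
    """
    return ["%d:%d" % (index, ppn) for index, ppn in enumerate(counts.values())]


if __name__ == "__main__":
    # PBS_NODEFILE is only set inside a running job; do nothing otherwise.
    path = os.environ.get("PBS_NODEFILE")
    if path:
        with open(path) as handle:
            for line in format_node_ppn(slots_per_node(handle)):
                print(line)
```

Because the counts come from the nodefile itself, this also handles
requests with a '+' in them, where the nodes do not all have the same
ppn.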

This idea probably has a few problems as well, but I still think it is
better than a static ENV variable.  In the future there might be a
concept of a dynamically sized job that can grow or shrink.  In that
case the pbs_mom can at least rewrite the file, though there might be
a better way to convey that information.

This is the type of change that should be discussed by the TORQUE
community before it is made.  The approach clearly has limitations,
and perhaps we could have come up with a better solution by just
spending a little time talking about it first.


-glen

