[torqueusers] unique identifiers per task and PBS_VNODENUM?

Garrick Staples garrick at clusterresources.com
Wed Aug 16 20:32:07 MDT 2006


On Wed, Aug 16, 2006 at 10:14:04PM -0400, Andrew J Caird alleged:
> 
> Hello,
> 
> I'm looking for a way to determine which task a given process is 
> in, and it sort of looks like PBS_VNODENUM might be that, but it 
> doesn't seem to be set all the time, and it seems sporadic when 
> it is set.  Below is a simple program and the output from multiple 
> runs.

Your program works fine for me.  Are all of your nodes running a recent
version of TORQUE?

Note that PBS_VNODENUM is only meaningful for tasks launched through the
TM (Task Management) interface.  Open MPI, LAM/MPI, and OSC's mpiexec all
support the TM interface.

 
> What is PBS_VNODENUM supposed to do?

A vnode (virtual node) is one processor within a node.  If you request
nodes=2:ppn=2, you get 2 nodes and 4 vnodes.
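
The arithmetic is just nodes times ppn.  A minimal Python sketch, assuming
a request of the plain "N" or "N:ppn=M" form (the helper name count_vnodes
is made up for illustration, and host names, properties, and "+"-joined
requests are not handled):

```python
def count_vnodes(nodes_spec):
    """Return (nodes, vnodes) for a simple 'N' or 'N:ppn=M' request.

    Hypothetical helper for illustration only; it does not understand
    host names, node properties, or '+'-joined requests.
    """
    parts = nodes_spec.split(":")
    nodes = int(parts[0])
    ppn = 1  # assumption: one processor per node when ppn is omitted
    for part in parts[1:]:
        if part.startswith("ppn="):
            ppn = int(part[len("ppn="):])
    return nodes, nodes * ppn


print(count_vnodes("2:ppn=2"))  # (2, 4)
print(count_vnodes("4:ppn=2"))  # (4, 8)
```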

 
> Is there an environment variable that can be read by tasks that 
> can identify them uniquely?

Why not just use MPI_Comm_rank()?


This should demonstrate the useful variables that are set when tasks are
launched through the TM interface:

[garrick@hpcjr0008 garrick]$ qstat -f $PBS_JOBID | grep Resource_List.nodes
    Resource_List.nodes = 4:ppn=2
[garrick@hpcjr0008 garrick]$ pbsdsh -s bash -c 'echo $PBS_NODENUM $PBS_VNODENUM $PBS_TASKNUM'
0 0 2
0 1 3
1 2 4
1 3 5
2 4 6
2 5 7
3 6 8
3 7 9
[garrick@hpcjr0008 garrick]$ pbsdsh -s bash -c 'echo $PBS_NODENUM $PBS_VNODENUM $PBS_TASKNUM'
0 0 10
0 1 11
1 2 12
1 3 13
2 4 14
2 5 15
3 6 16
3 7 17
[garrick@hpcjr0008 garrick]$ pbsdsh -s bash -c 'echo $PBS_NODENUM $PBS_VNODENUM $PBS_TASKNUM'
0 0 18
0 1 19
1 2 20
1 3 21
2 4 22
2 5 23
3 6 24
3 7 25
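
In the three runs above, PBS_NODENUM and PBS_VNODENUM stay the same from
launch to launch while PBS_TASKNUM keeps counting up, so PBS_VNODENUM looks
like the better candidate for a stable per-task identifier.  A rough Python
sketch of reading the three variables from inside a task and coping with
them being unset (the function name pbs_task_ids is hypothetical, not part
of TORQUE):

```python
import os


def pbs_task_ids(env=None):
    """Collect the three TM variables as ints, or None where unset."""
    env = os.environ if env is None else env
    ids = {}
    for name in ("PBS_NODENUM", "PBS_VNODENUM", "PBS_TASKNUM"):
        value = env.get(name)
        # Treat a missing or empty variable as "not set".
        ids[name] = int(value) if value not in (None, "") else None
    return ids


if __name__ == "__main__":
    ids = pbs_task_ids()
    if ids["PBS_VNODENUM"] is None:
        print("PBS_VNODENUM is not set; was this task started through TM?")
    else:
        print(ids["PBS_NODENUM"], ids["PBS_VNODENUM"], ids["PBS_TASKNUM"])
```

Run outside a TM launch, this should take the "not set" branch, which may
be one explanation for a variable that only appears some of the time.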

Note, though, that OSC mpiexec gets PBS_VNODENUM wrong because it always
starts processes on the first vnode of each node:

[garrick@hpcjr0013 garrick]$ mpiexec --comm=none echo '$PBS_NODENUM $PBS_VNODENUM $PBS_TASKNUM' | sort
0 0 42
0 0 43
1 2 44
1 2 45
2 4 46
2 4 47
3 6 48
3 6 49
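
If you want to check your own launcher for this, one way is to capture the
"node vnode task" lines and look for repeated vnode numbers.  A small
Python sketch (duplicated_vnodes is a made-up helper; the sample data is
copied from the first pbsdsh run and the mpiexec run in this message):

```python
from collections import Counter


def duplicated_vnodes(lines):
    """Return the sorted vnode numbers that appear on more than one line.

    Each line is expected to look like 'NODENUM VNODENUM TASKNUM'.
    """
    counts = Counter(int(line.split()[1]) for line in lines if line.strip())
    return sorted(v for v, n in counts.items() if n > 1)


pbsdsh_out = ["0 0 2", "0 1 3", "1 2 4", "1 3 5",
              "2 4 6", "2 5 7", "3 6 8", "3 7 9"]
mpiexec_out = ["0 0 42", "0 0 43", "1 2 44", "1 2 45",
               "2 4 46", "2 4 47", "3 6 48", "3 6 49"]

print(duplicated_vnodes(pbsdsh_out))   # []
print(duplicated_vnodes(mpiexec_out))  # [0, 2, 4, 6]
```

An empty list means every task saw its own PBS_VNODENUM.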


