[torqueusers] unique identifiers per task and PBS_VNODENUM?
Garrick Staples
garrick at clusterresources.com
Wed Aug 16 20:32:07 MDT 2006
On Wed, Aug 16, 2006 at 10:14:04PM -0400, Andrew J Caird alleged:
>
> Hello,
>
> I'm looking for a way to determine which task a given process is
> in, and it sort of looks like PBS_VNODENUM might be that, but it
> doesn't seem to be set all the time, and it seems sporadic when
> it is set. Below is a simple code and the output from multiple
> runs.
Your program works fine for me. Are all of your nodes running a recent
version of TORQUE?
Note that PBS_VNODENUM is only useful when launching tasks through the
TM interface. Open MPI, LAM/MPI, and OSC's mpiexec all support the TM interface.
> What is PBS_VNODENUM supposed to do?
The vnode is the processor within the node. If you have nodes=2:ppn=2,
that is 2 nodes and 4 vnodes.
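To make the arithmetic concrete, here is a minimal sketch that counts
vnodes for a simple request of the form N:ppn=M. The function name
count_vnodes is made up for illustration, it does not handle the full
nodes syntax (host names, properties, "+"-joined parts), and it assumes
ppn is 1 when omitted.

```python
def count_vnodes(nodes_spec):
    """Return (nodes, vnodes) for a simple 'N:ppn=M' nodes request.

    Hypothetical helper: only the plain 'N' and 'N:ppn=M' forms are handled.
    """
    parts = nodes_spec.split(":")
    nodes = int(parts[0])
    ppn = 1  # assumption: one processor per node when ppn is not given
    for part in parts[1:]:
        if part.startswith("ppn="):
            ppn = int(part[len("ppn="):])
    # One vnode per requested processor on each node.
    return nodes, nodes * ppn


if __name__ == "__main__":
    print(count_vnodes("2:ppn=2"))
```

With the request from the example above, count_vnodes("2:ppn=2") gives
2 nodes and 4 vnodes.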
> Is there an environment variable that can be read by tasks that
> can identify them uniquely?
Why not just use MPI_Comm_rank()?
This should demonstrate useful variables when using the TM interface:
[garrick at hpcjr0008 garrick]$ qstat -f $PBS_JOBID | grep Resource_List.nodes
Resource_List.nodes = 4:ppn=2
[garrick at hpcjr0008 garrick]$ pbsdsh -s bash -c 'echo $PBS_NODENUM $PBS_VNODENUM $PBS_TASKNUM'
0 0 2
0 1 3
1 2 4
1 3 5
2 4 6
2 5 7
3 6 8
3 7 9
[garrick at hpcjr0008 garrick]$ pbsdsh -s bash -c 'echo $PBS_NODENUM $PBS_VNODENUM $PBS_TASKNUM'
0 0 10
0 1 11
1 2 12
1 3 13
2 4 14
2 5 15
3 6 16
3 7 17
[garrick at hpcjr0008 garrick]$ pbsdsh -s bash -c 'echo $PBS_NODENUM $PBS_VNODENUM $PBS_TASKNUM'
0 0 18
0 1 19
1 2 20
1 3 21
2 4 22
2 5 23
3 6 24
3 7 25
Though, note that OSC mpiexec gets it wrong because it always runs
processes on the first vnode of each node:
[garrick at hpcjr0013 garrick]$ mpiexec --comm=none echo '$PBS_NODENUM $PBS_VNODENUM $PBS_TASKNUM' | sort
0 0 42
0 0 43
1 2 44
1 2 45
2 4 46
2 4 47
3 6 48
3 6 49
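If a task wants to read these variables itself rather than echo them
from a shell, something like the following sketch might do. The
function name task_identity is made up for illustration, and returning
None for an unset variable is my own choice, since the variables are
only set for tasks launched through the TM interface.

```python
import os


def task_identity(environ=os.environ):
    """Return (nodenum, vnodenum, tasknum) read from the environment.

    Hypothetical helper: an unset or empty variable is reported as None.
    """
    def read_int(name):
        value = environ.get(name)
        return int(value) if value not in (None, "") else None

    return (
        read_int("PBS_NODENUM"),
        read_int("PBS_VNODENUM"),
        read_int("PBS_TASKNUM"),
    )


if __name__ == "__main__":
    print(task_identity())
```

In the transcripts above, PBS_TASKNUM differs for every process, while
PBS_VNODENUM repeats under OSC mpiexec, so which value to treat as the
identifier depends on the launcher.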