[torqueusers] job run on single node when request for multiple nodes

Gus Correa gus at ldeo.columbia.edu
Tue Jun 15 10:20:33 MDT 2010


Hi Govind

What is the content of your Torque nodes file?
(located at /var/torque/server_priv/nodes or equivalent)
It should list all nodes and the number of processors/cores on each,
something like this:

node01 np=2
node02 np=8
...
node47 np=4
...
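
If it helps, here is a minimal sketch of a quick sanity check on that file.
It only counts the nodes and sums the np= values. The function name
summarize_nodes and the sample file are made up for illustration, and I am
assuming that a line without np= counts as one processor.

```shell
#!/bin/sh
# Hypothetical helper: summarize a Torque-style nodes file.
# Assumes one node per line, in the form "name np=N [other properties]".
summarize_nodes() {
    # $1: path to the nodes file
    awk '
        /^[[:space:]]*#/ || NF == 0 { next }   # skip comments and blank lines
        {
            np = 1                             # assumption: missing np= means 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^np=[0-9]+$/) { split($i, kv, "="); np = kv[2] }
            nodes++
            cores += np
        }
        END { printf "nodes=%d cores=%d\n", nodes, cores }
    ' "$1"
}

# Demo on a throwaway sample file, not on the real server_priv/nodes.
sample=$(mktemp)
printf 'node01 np=2\nnode02 np=8\nnode47 np=4\n' > "$sample"
summarize_nodes "$sample"
rm -f "$sample"
```

On the real cluster you would point it at your own nodes file (the path
above, or wherever your installation keeps it) and compare the result with
what "pbsnodes -a" reports on the server.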

I may have asked you this before on the OpenMPI list, I am not sure.

I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------


Govind wrote:
> Hi,
> 
> I have an OpenMPI build with tm support.
> When I run an MPI job requesting two nodes and 4 cores, it runs only
> on a single node.
>  
> $ cat mpipbs-script.sh
> #PBS -N mpipbs-script
> #PBS -q short
> ### Number of nodes: resources per node
> ### (4 cores/node, so ppn=4 is ALL resources on the node)
> #PBS -l nodes=2:ppn=4
> echo `cat $PBS_NODEFILE`
> NPROCS=`wc -l < $PBS_NODEFILE`
> echo This job has allocated $NPROCS processors
> /opt/openmpi-1.4.2/bin/mpirun /scratch0/gsongara/mpitest/hello
> 
> It shows only one node. Here is the output:
> ===============
> node47.beowulf.cluster node47.beowulf.cluster node47.beowulf.cluster 
> node47.beowulf.cluster
> This job has allocated 4 processors
> Hello World! from process 1 out of 4 on node47.beowulf.cluster
> Hello World! from process 2 out of 4 on node47.beowulf.cluster
> Hello World! from process 3 out of 4 on node47.beowulf.cluster
> Hello World! from process 0 out of 4 on node47.beowulf.cluster
> ===============
> 
> Torque config:
> set queue short resources_max.nodes = 4
> set queue short resources_default.nodes = 1
> set server resources_default.neednodes = 1
> set server resources_default.nodect = 1
> set server resources_default.nodes = 1
> 
> I also tried adding these parameters to Maui, as advised in some old
> thread, but it does not help:
> JOBNODEMATCHPOLICY     EXACTNODE
> ENABLEMULTINODEJOBS   TRUE
> NODEACCESSPOLICY         SHARED
> 
> 
> Can someone please advise if I am missing anything here?
> 
> Regards
> Govind
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers


