[torqueusers] job runs on single node when requesting multiple nodes

Gus Correa gus at ldeo.columbia.edu
Tue Jun 15 11:50:13 MDT 2010


Hi Govind

Please read what Garrick said about your server configuration.
To unset those attributes, do:
qmgr -c 'unset server resources_default.neednodes'
and so on ...
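
For instance, assuming the qmgr output you posted is still current,
the other two defaults can be cleared the same way, and
'print server' shows what is left afterwards:

qmgr -c 'unset server resources_default.nodect'
qmgr -c 'unset server resources_default.nodes'
qmgr -c 'print server'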

Besides, your *Torque* (not OpenMPI) nodes file should list
*all* the nodes in your cluster, with the number of
processors/cores on each.
If I understood you correctly,
you only have one node listed there.

Something like this:

node01 np=4
node02 np=4
... (and so on ...)
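
After editing the nodes file you need to restart pbs_server
(as far as I remember it only reads that file at startup).
Then something like this should show both nodes with the
right number of cores:

pbsnodes -a

Once both nodes show up there, a nodes=2:ppn=4 job should get
eight lines in $PBS_NODEFILE instead of four.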

I hope this helps,
Gus Correa


Govind Songara wrote:
> Hi Gus,
> 
> Thanks for your reply; with your advice I built OpenMPI.
> I have a standard nodes list as follows:
> node01 np=4
> 
> Thanks
> Govind
> 
> On 15 June 2010 17:20, Gus Correa <gus at ldeo.columbia.edu> wrote:
> 
>     Hi Govind
> 
>     What is the content of your Torque nodes file?
>     (located at /var/torque/server_priv/nodes or equivalent)
>     It should list all nodes and the number of processors/cores on each.
>     Something like this:
> 
>     node01 np=2
>     node02 np=8
>     ...
>     node47 np=4
>     ...
> 
>     Maybe I asked you this before on the OpenMPI list; I'm not sure.
> 
>     I hope this helps,
>     Gus Correa
>     ---------------------------------------------------------------------
>     Gustavo Correa
>     Lamont-Doherty Earth Observatory - Columbia University
>     Palisades, NY, 10964-8000 - USA
>     ---------------------------------------------------------------------
> 
> 
>     Govind wrote:
>      > Hi,
>      >
>      > I have an OpenMPI build with TM support.
>      > When I run an MPI job requesting two nodes and 4 cores, it runs
>      > only on a single node.
>      >
>      >  >cat mpipbs-script.sh
>      > #PBS -N mpipbs-script
>      > #PBS -q short
>      > ### Number of nodes: resources per node
>      > ### (4 cores/node, so ppn=4 is ALL resources on the node)
>      > #PBS -l nodes=2:ppn=4
>      > echo `cat $PBS_NODEFILE`
>      > NPROCS=`wc -l < $PBS_NODEFILE`
>      > echo This job has allocated $NPROCS processors
>      > /opt/openmpi-1.4.2/bin/mpirun /scratch0/gsongara/mpitest/hello
>      >
>      > It shows only one node; here is the output:
>      > ===============
>      > node47.beowulf.cluster node47.beowulf.cluster node47.beowulf.cluster
>      > node47.beowulf.cluster
>      > This job has allocated 4 processors
>      > Hello World! from process 1 out of 4 on node47.beowulf.cluster
>      > Hello World! from process 2 out of 4 on node47.beowulf.cluster
>      > Hello World! from process 3 out of 4 on node47.beowulf.cluster
>      > Hello World! from process 0 out of 4 on node47.beowulf.cluster
>      > ===============
>      >
>      > Torque config:
>      > set queue short resources_max.nodes = 4
>      > set queue short resources_default.nodes = 1
>      > set server resources_default.neednodes = 1
>      > set server resources_default.nodect = 1
>      > set server resources_default.nodes = 1
>      >
>      > I also tried adding these parameters to Maui, as advised in some old
>      > thread, but it does not help:
>      > JOBNODEMATCHPOLICY     EXACTNODE
>      > ENABLEMULTINODEJOBS   TRUE
>      > NODEACCESSPOLICY         SHARED
>      >
>      >
>      > Can someone please advise if I am missing anything here?
>      >
>      > Regards
>      > Govind
>      >


