[torqueusers] job run on single node when request for multiple nodes
Gus Correa
gus at ldeo.columbia.edu
Tue Jun 15 11:50:13 MDT 2010
Hi Govind
Please read what Garrick said about your server configuration.
To unset those server defaults, do:
qmgr -c 'unset server resources_default.neednodes'
and so on ...
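
In case "and so on" is too terse, here is a minimal sketch covering the three
resources_default entries that appear in the server part of your posted config
(neednodes, nodect, nodes). Whether all three should really be removed is an
assumption on my part, so check it against Garrick's advice. The helper name
print_unset_cmds is made up for illustration. It only prints the qmgr commands,
so you can review them before running anything as a Torque manager.

```shell
#!/bin/sh
# Print (do not run) one qmgr "unset" command per server default.
# The attribute names are the ones shown in the posted "torque config";
# which of them actually need to go is an assumption.
print_unset_cmds() {
    for attr in neednodes nodect nodes; do
        printf "qmgr -c 'unset server resources_default.%s'\n" "$attr"
    done
}

print_unset_cmds
# After reviewing the output, it could be applied with:
#   print_unset_cmds | sh
```

Printing first and piping to sh afterwards keeps the change reviewable, which
is worth it for server-wide settings.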
Besides, your *Torque* (not OpenMPI) nodes file should list
*all* nodes on your cluster with the respective
number of processors/cores.
If I understood you correctly,
you only have one node listed there.
Something like this:
node01 np=4
node02 np=4
... (and so on ...)
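
A quick way to see what the server has to work with is to count the entries in
that file. This is only a sketch: the function name summarize_nodes is
hypothetical, the default path is the usual location
(/var/torque/server_priv/nodes) and may differ on your install, and treating a
missing np= as 1 is an assumption.

```shell
#!/bin/sh
# Summarize a Torque nodes file: number of node entries and total np.
summarize_nodes() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }    # skip blank lines and comments
        {
            nodes++
            np = 1                       # assumed default when np= is absent
            for (i = 2; i <= NF; i++)
                if ($i ~ /^np=[0-9]+$/) np = substr($i, 4)
            total += np
        }
        END { printf "nodes=%d total_np=%d\n", nodes, total }
    ' "$1"
}

# Path is an assumption; pass your own nodes file as the first argument.
nodes_file=${1:-/var/torque/server_priv/nodes}
if [ -r "$nodes_file" ]; then
    summarize_nodes "$nodes_file"
fi
```

For a file containing only the line "node01 np=4" this prints
"nodes=1 total_np=4", which would mean the server knows about a single node.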
I hope this helps,
Gus Correa
Govind Songara wrote:
> Hi Gus,
>
> Thanks for your reply. Following your advice, I built Open MPI.
> I have a standard nodes list:
> node01 np=4
>
> Thanks
> Govind
>
> On 15 June 2010 17:20, Gus Correa <gus at ldeo.columbia.edu> wrote:
>
> Hi Govind
>
> What is the content of your Torque nodes file?
> (located at /var/torque/server_priv/nodes or equivalent)
> It should list all nodes and the number of processors/cores on each,
> something like this:
>
> node01 np=2
> node02 np=8
> ...
> node47 np=4
> ...
>
> Maybe I asked you this before in the OpenMPI list, not sure.
>
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
>
> Govind wrote:
> > Hi,
> >
> > I have an Open MPI build with tm support.
> > When I run an MPI job requesting two nodes and 4 cores, it runs only
> > on a single node.
> >
> > >cat mpipbs-script.sh
> > #PBS -N mpipbs-script
> > #PBS -q short
> > ### Number of nodes: resources per node
> > ### (4 cores/node, so ppn=4 is ALL resources on the node)
> > #PBS -l nodes=2:ppn=4
> > echo `cat $PBS_NODEFILE`
> > NPROCS=`wc -l < $PBS_NODEFILE`
> > echo This job has allocated $NPROCS processors
> > /opt/openmpi-1.4.2/bin/mpirun /scratch0/gsongara/mpitest/hello
> >
> > It shows only one node. Here is the output:
> > ===============
> > node47.beowulf.cluster node47.beowulf.cluster node47.beowulf.cluster
> > node47.beowulf.cluster
> > This job has allocated 4 processors
> > Hello World! from process 1 out of 4 on node47.beowulf.cluster
> > Hello World! from process 2 out of 4 on node47.beowulf.cluster
> > Hello World! from process 3 out of 4 on node47.beowulf.cluster
> > Hello World! from process 0 out of 4 on node47.beowulf.cluster
> > ===============
> >
> > torque config
> > set queue short resources_max.nodes = 4
> > set queue short resources_default.nodes = 1
> > set server resources_default.neednodes = 1
> > set server resources_default.nodect = 1
> > set server resources_default.nodes = 1
> >
> > I also tried adding these parameters to Maui, as advised in some old
> > thread, but it does not help:
> > JOBNODEMATCHPOLICY EXACTNODE
> > ENABLEMULTINODEJOBS TRUE
> > NODEACCESSPOLICY SHARED
> >
> >
> > Can someone please advise if I am missing anything here?
> >
> > Regards
> > Govind
> >
> >
> >
> ------------------------------------------------------------------------
> >
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org <mailto:torqueusers at supercluster.org>
> > http://www.supercluster.org/mailman/listinfo/torqueusers
>