[torqueusers] job run on single node when request for multiple nodes

Govind Songara govind.songara at rhul.ac.uk
Tue Jun 15 12:07:45 MDT 2010


Hi Gus,

I have unset
> set server resources_default.neednodes = 1
> set server resources_default.nodect = 1

I have 30 nodes listed in the cluster and each one has 4 cores,
but the job is still running on only one node.
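(For what it's worth, the commands I would use to double-check the current
server and node state are something like the following, assuming the
standard Torque client tools are installed:

qmgr -c 'print server'
pbsnodes -a

The first should confirm the resources_default entries are really gone,
and the second should list every node with np=4.)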

Thanks
Govind


On 15 June 2010 18:50, Gus Correa <gus at ldeo.columbia.edu> wrote:

> Hi Govind
>
> Please, read what Garrick said about your server configuration.
> To unset stuff do:
> qmgr -c 'unset server resources_default.neednodes'
> and so on ...
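> From the settings you posted, that would be something like:
>
> qmgr -c 'unset server resources_default.nodect'
> qmgr -c 'unset server resources_default.nodes'
>
> (run with qmgr as a Torque administrator on the server host).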
>
> Besides, your *Torque* (not OpenMPI) nodes file should list
> *all* nodes on your cluster with the respective
> number of processors/cores.
> If I understood what you said right,
> you only have one node listed there.
>
> Something like this:
>
> node01 np=4
> node02 np=4
> ... (and so on ...)
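>
> (If you edit the nodes file by hand, usually /var/torque/server_priv/nodes
> or equivalent, pbs_server normally has to be restarted to pick up the
> change; after that,
>
> pbsnodes -a
>
> should list every node with its np count.)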
>
> I hope this helps,
> Gus Correa
>
>
> Govind Songara wrote:
> > Hi Gus,
> >
> > Thanks for your reply; following your advice I built OpenMPI.
> > I have a standard nodes list with entries like:
> > node01 np=4
> >
> > Thanks
> > Govind
> >
> > On 15 June 2010 17:20, Gus Correa <gus at ldeo.columbia.edu
> > <mailto:gus at ldeo.columbia.edu>> wrote:
> >
> >     Hi Govind
> >
> >     What is the content of your Torque nodes file?
> >     (located at /var/torque/server_priv/nodes or equivalent)
> >     It should list all nodes and the number of processors/cores on each.
> >     Something like this:
> >
> >     node01 np=2
> >     node02 np=8
> >     ...
> >     node47 np=4
> >     ...
> >
> >     Maybe I asked you this before in the OpenMPI list, not sure.
> >
> >     I hope this helps,
> >     Gus Correa
> >     ---------------------------------------------------------------------
> >     Gustavo Correa
> >     Lamont-Doherty Earth Observatory - Columbia University
> >     Palisades, NY, 10964-8000 - USA
> >     ---------------------------------------------------------------------
> >
> >
> >     Govind wrote:
> >      > Hi,
> >      >
> >      > I have an OpenMPI build with TM support.
> >      > When I run an MPI job requesting two nodes with 4 cores each,
> >      > it runs only on a single node.
> >      >
> >      >  >cat mpipbs-script.sh
> >      > #PBS -N mpipbs-script
> >      > #PBS -q short
> >      > ### Number of nodes: resources per node
> >      > ### (4 cores/node, so ppn=4 is ALL resources on the node)
> >      > #PBS -l nodes=2:ppn=4
> >      > echo `cat $PBS_NODEFILE`
> >      > NPROCS=`wc -l < $PBS_NODEFILE`
> >      > echo This job has allocated $NPROCS processors
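> >      > # with a TM-enabled OpenMPI, mpirun is expected to take the host list
> >      > # and process count from the Torque allocation (no -np/-hostfile needed)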
> >      > /opt/openmpi-1.4.2/bin/mpirun /scratch0/gsongara/mpitest/hello
> >      >
> >      > It shows only one node; here is the output:
> >      > ===============
> >      > node47.beowulf.cluster node47.beowulf.cluster node47.beowulf.cluster
> >      > node47.beowulf.cluster
> >      > This job has allocated 4 processors
> >      > Hello World! from process 1 out of 4 on node47.beowulf.cluster
> >      > Hello World! from process 2 out of 4 on node47.beowulf.cluster
> >      > Hello World! from process 3 out of 4 on node47.beowulf.cluster
> >      > Hello World! from process 0 out of 4 on node47.beowulf.cluster
> >      > ===============
> >      >
> >      > torque config
> >      > set queue short resources_max.nodes = 4
> >      > set queue short resources_default.nodes = 1
> >      > set server resources_default.neednodes = 1
> >      > set server resources_default.nodect = 1
> >      > set server resources_default.nodes = 1
> >      >
> >      > I also tried adding these parameters to Maui, as advised in some
> >      > old threads, but it does not help:
> >      > JOBNODEMATCHPOLICY     EXACTNODE
> >      > ENABLEMULTINODEJOBS   TRUE
> >      > NODEACCESSPOLICY         SHARED
> >      >
> >      >
> >      > Can someone please advise if I am missing anything here?
> >      >
> >      > Regards
> >      > Govind
> >      >
> >      >
> >      >
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>