[torqueusers] strange behaviour of ppn

Govind govind.rhul at googlemail.com
Tue Nov 23 09:22:15 MST 2010


Thanks Bryant for the details about tpn; I could not find anything about tpn in
the Torque manual.
Using tpn I can start tasks on different nodes, but it still does not behave as
expected.

1. If my job script looks like this:
   #PBS -l nodes=4,tpn=4
   mpirun -np 16 hostname
   it prints 16 hostnames in total, i.e. it runs 4 tasks on each of the 4
nodes.
2. If the job script runs sleep instead:
   #PBS -l nodes=4,tpn=4
   mpirun -np 16 sleep 200
   it shows only 4 tasks running, one on each of the 4 nodes.

So it looks to me like the tasks are using only a single slot out of the 4
slots on each node.
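
For what it's worth, here is a rough check I would run to count how many sleep
tasks actually land on each node (the 10-second settle time and the ssh loop
are only assumptions for illustration; it needs passwordless ssh between the
nodes):

------------------------------------
#!/bin/bash
#PBS -l nodes=4,tpn=4

# start the 16 sleeps in the background so their placement can be inspected
mpirun -np 16 sleep 200 &

# give mpirun a moment to spawn the tasks
sleep 10

# count sleep processes owned by this user on each allocated node
for host in $(sort -u $PBS_NODEFILE); do
    echo -n "$host: "
    ssh $host "pgrep -c -u $USER sleep"
done

wait
------------------------------------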

Please advise if anything needs to be changed in the config.

Regards
Govind


On Wed, Nov 17, 2010 at 9:50 PM, <pat.o'bryant at exxonmobil.com> wrote:

>
>
> Govind,
>     Go to the Adaptive Resource web page and get the Torque manual in PDF
> format. Next search for "tpn" which stands for "task-per-node". There is an
> explanation of how "ppn" and "tpn" are different.
> From your test cases it is as though your jobs are being interpreted in a
> task fashion and not a node fashion. So, when the statement "nodes=3:ppn=1"
> is made, this is a request for (3 x 1) tasks instead of what you intended.
> Try this instead: "nodes=3,tpn=1". The request says "3 nodes" with "1 task
> per node". Note that there is a "comma" after the "nodes" values and not a
> ":". Hopefully you will get a better result. The use of ppn can be
> confusing.
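> Side by side the two requests look like this (illustrative only; the comments
> just restate the behaviour described above):
>
>    # interpreted as 3 x 1 tasks, which may all be placed on one node:
>    #PBS -l nodes=3:ppn=1
>
>    # 3 distinct nodes with 1 task on each (note the comma, not a colon):
>    #PBS -l nodes=3,tpn=1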
>       Thanks,
>        Pat
>
>
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
>
>
>
>
>
>             From:       Govind <govind.rhul at googlemail.com>
>             Sent by:    torqueusers-bounces at supercluster.org
>             To:         Torque Users Mailing List <torqueusers at supercluster.org>
>             Date:       11/17/10 09:51 AM
>             Subject:    Re: [torqueusers] strange behaviour of ppn
>             Respond to: Torque Users Mailing List <torqueusers at supercluster.org>
>
>
> Hi Brian,
>
> I don't want to block a complete node for a single job.
> My requirement is to request multiple processors spread across different
> nodes, which is not working at the moment.
>
> Thanks
> Govind
>
>
> On Mon, Nov 15, 2010 at 5:53 PM, Andrus, Brian Contractor <bdandrus at nps.edu> wrote:
>  Govind,
>
>  You may want to add:
>
>  #PBS -l naccesspolicy=singlejob
>
>  This will cause allocation to be a single job per node.
>  Your resource request does not specify that you need exclusive use of the
>  node, so as far as Torque is concerned there are still processors available
>  on it, and it assigns them to other jobs.
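>
>  For example, combined with the request from your script it would look
>  something like this (just a sketch of the two directives together):
>
>     #PBS -l nodes=2:ppn=1
>     #PBS -l naccesspolicy=singlejob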
>
>  Brian Andrus
>
>
>
>  ________________________________
>
>  From: torqueusers-bounces at supercluster.org on behalf of Govind Songara
>  Sent: Fri 11/12/2010 8:25 AM
>  To: Torque Users Mailing List
>  Subject: [torqueusers] strange behaviour of ppn
>
>
>  Hi,
>
>
>  I am no expert on Torque configuration, so there might be something wrong
>  with my setup.
>  I am seeing strange behaviour of the ppn variable.
>  My nodes config is something like
>  node01 np=4
>  node02 np=4
>
>  snippet of maui config
>  JOBNODEMATCHPOLICY     EXACTNODE
>  ENABLEMULTINODEJOBS   TRUE
>  NODEACCESSPOLICY         SHARED
>
>
>  snippet of queue config
>  resources_available.nodect = 65
>  resources_assigned.nodect = 5
>  resources_default.nodes = 1
>
>  sample script
>  ------------------------------------
>  #PBS -q long
>  #PBS -l nodes=2:ppn=1
>
>  echo This job runs on the following processors:
>  echo `cat $PBS_NODEFILE`
>  NPROCS=`wc -l < $PBS_NODEFILE`
>  echo This job has allocated $NPROCS processors
>  hostname
>  ------------------------------------
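>
>  For reference, with nodes=2:ppn=1 honoured I would expect $PBS_NODEFILE to
>  hold one line per allocated processor, on two different hosts, e.g. with the
>  nodes above:
>
>  node01
>  node02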
>
>  Below are my results:
>
>    nodes | ppn | processes run (hostname) | processors allocated
>    ------+-----+--------------------------+----------------------
>      3   |  1  |            1             |           3
>      3   |  2  |            1             |           2
>      3   |  3  |            1             |           3
>      3   |  4  |            1             |           4
>
>  In case 1 it allocates 3 processors on the same node, which is incorrect; it
>  should allocate 1 processor on each of 3 different nodes.
>  In case 2 it allocates only 2 processors on the same node, when it should
>  allocate 2 processors on each of 3 different nodes (6 processors in total),
>  and the behaviour is similar for the last two cases.
>  In all cases the hostname command runs only once, whereas it should run at
>  least once per allocated processor.
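>
>  (One way to check where tasks actually land is Torque's pbsdsh, which runs
>  the given command once on every allocated slot; with nodes=3:ppn=2 honoured
>  it should print 6 host names, two per node:
>
>  pbsdsh hostname
>  )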
>
>
>  Due to this strange behaviour I cannot run MPI jobs correctly; kindly
>  advise on this problem.
>
>  TIA
>
>  Regards
>  Govind
>
>
>
>
>
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>