[torqueusers] qsub on several nodes

Felix Werner ff.werner at gmail.com
Tue Jun 1 12:56:37 MDT 2010


2010/6/1 Glen Beane <glen.beane at gmail.com>

> On Tue, Jun 1, 2010 at 12:16 PM, Ken Nielson
> <knielson at adaptivecomputing.com> wrote:
> > On 06/01/2010 10:08 AM, Felix Werner wrote:
> >> Dear all,
> >>
> >> Suppose I want to run a job on 40 CPUs (with MPI),
> >> and there are
> >> 10 CPUs available on the node "node1"
> >> 10 on "node2"
> >> 20 on "node3".
> >>
> >> What I do is:
> >>
> >> qsub -l nodes=node1:ppn=10+node2:ppn=10+node3:ppn=20 shell_name.sh
> >>
> >> This is tedious because I need to look manually how many CPUs are
> >> available on which node.
> >>
> >> So is there a way to just tell the queuing system "I want to run on 40
> >> CPUs, on whatever nodes"?
> >>
> >> Many thanks!
> >>
> >> Felix Werner
> >>
> >>
> > Felix,
> >
> > If you use a scheduler like Moab you can simply use qsub -l nodes=40 and
> > it will take care of where they are going to run. But if you are going
> > to run jobs manually this is how it has to be done.
> >
>
> one thing to note is that if you have fewer than 40 nodes you need to
> trick TORQUE into thinking it has more nodes than it really does so
> that it doesn't reject a request like -l nodes=40.  Moab treats a
> nodes=X request without a ppn=Y component as a request stating "I just
> need X processors".
>
> You can also do -l procs=40, which doesn't require configuring TORQUE
> to think it has more nodes than it actually has. This is only
> supported with Moab.
>
>
>
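On the original complaint that building the node spec by hand is tedious: the '+'-joined `nodes=` string can be assembled with a few lines of shell. This is only an illustrative sketch; the node names and free-CPU counts are hard-coded here, whereas in practice they would come from parsing `pbsnodes -a` output (or whatever your site uses to report free slots).

```shell
# Illustrative sketch: build "node1:ppn=10+node2:ppn=10+node3:ppn=20"
# from a list of "<node> <free-cpus>" pairs instead of typing it by hand.
# The pairs below are hard-coded examples, not real cluster state.
spec=""
while read -r node free; do
    [ -z "$spec" ] && spec="${node}:ppn=${free}" \
                   || spec="${spec}+${node}:ppn=${free}"
done <<'EOF'
node1 10
node2 10
node3 20
EOF
echo "$spec"
# Then submit with, e.g.:  qsub -l nodes="$spec" shell_name.sh
```

This only automates the typing; it does not solve the underlying race (another job may grab those slots between the query and the submit), which is exactly what a scheduler like Moab handles for you.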
   Many thanks guys!
   I am not yet sure that it works perfectly, though:

   After executing:
=====================
[werner at cm64N TEST]$ qsub -l procs=3 shell_mc.sh
4237.cm64n.physics.umass.edu
=====================
   I get:

====================
[werner at cm64N TEST]$ qstat -n1

cm64n.physics.umass.edu:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
4229.cm64n.physi     werner   batch    shell_mc.sh       21133     9  --     --    --  R 02:51
cm33/15+cm33/14+cm33/13+cm33/12+cm33/11+cm33/10+cm33/9+cm33/8+cm33/7+cm33/6+cm33/5+cm33/4+cm33/3+cm33/2+cm33/1+cm33/0+cm34/15+cm34/14+cm34/13+cm34/12+cm34/11+cm34/10+cm34/9+cm34/8+cm34/7+cm34/6+cm34/5+cm34/4+cm34/3+cm34/2+cm34/1+cm34/0+cm36/15+cm36/14+cm36/13+cm36/12+cm36/11+cm36/10+cm36/9+cm36/8+cm36/7+cm36/6+cm36/5+cm36/4+cm36/3+cm36/2+cm36/1+cm36/0+cm37/15+cm37/14+cm37/13+cm37/12+cm37/11+cm37/10+cm37/9+cm37/8+cm37/7+cm37/6+cm37/5+cm37/4+cm37/3+cm37/2+cm37/1+cm37/0+cm39/15+cm39/14+cm39/13+cm39/12+cm39/11+cm39/10+cm39/9+cm39/8+cm39/7+cm39/6+cm39/5+cm39/4+cm39/3+cm39/2+cm39/1+cm39/0+cm40/15+cm40/14+cm40/13+cm40/12+cm40/11+cm40/10+cm40/9+cm40/8+cm40/7+cm40/6+cm40/5+cm40/4+cm40/3+cm40/2+cm40/1+cm40/0+cm41/15+cm41/14+cm41/13+cm41/12+cm41/11+cm41/10+cm41/9+cm41/8+cm41/7+cm41/6+cm41/5+cm41/4+cm41/3+cm41/2+cm41/1+cm41/0+cm43/15+cm43/14+cm43/13+cm43/12+cm43/11+cm43/10+cm43/9+cm43/8+cm43/7+cm43/6+cm43/5+cm43/4+cm43/3+cm43/2+cm43/1+cm43/0+cm46/15+cm46/14+cm46/13+cm46/12+cm46/11+cm46/10+cm46/9+cm46/8+cm46/7+cm46/6+cm46/5+cm46/4+cm46/3+cm46/2+cm46/1+cm46/0
4237.cm64n.physi     werner   batch    shell_mc.sh       21516     1  --     --    --  R   --
cm47/0
=======================

So I guess this means that our sys admin indeed installed Moab.

Now, as you can see, the job which I submitted on 3 processors using your
trick shows up (on the last line above) as if it were running on only one
processor.

According to our "cluster report" website, it is actually running on 3
processors, but I am wondering whether everything is OK (e.g., whether the
queuing system will prevent other jobs from running on the same processors,
as it should)?
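One way to double-check what was actually reserved is to look at the job's exec_host attribute (`qstat -f <jobid>` prints it): it lists one <node>/<core> entry per allocated processor, joined by '+'. A small sketch, counting the entries in a made-up exec_host string for a 3-processor job (the string below is an illustrative example, not real output):

```shell
# Hypothetical check: count the '+'-separated <node>/<core> entries
# in an exec_host string to see how many CPUs were reserved.
# This example string is made up for illustration; in practice it
# would come from `qstat -f <jobid>`.
exec_host="cm47/0+cm47/1+cm47/2"
ncpus=$(echo "$exec_host" | tr '+' '\n' | wc -l)
echo "allocated CPUs: $ncpus"
```

If that count matches the procs= request, the scheduler has reserved all the slots, even if the short `qstat -n1` display only shows one of them on the job's line.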

Thanks again!
Felix

