[torqueusers] qsub on several nodes

Glen Beane glen.beane at gmail.com
Tue Jun 1 13:50:21 MDT 2010


On Tue, Jun 1, 2010 at 2:56 PM, Felix Werner <ff.werner at gmail.com> wrote:
>
>
> 2010/6/1 Glen Beane <glen.beane at gmail.com>
>>
>> On Tue, Jun 1, 2010 at 12:16 PM, Ken Nielson
>> <knielson at adaptivecomputing.com> wrote:
>> > On 06/01/2010 10:08 AM, Felix Werner wrote:
>> >> Dear all,
>> >>
>> >> Suppose I want to run a job on 40 CPUs (with MPI),
>> >> and there are
>> >> 10 CPUs available on the node "node1"
>> >> 10 on "node2"
>> >> 20 on "node3".
>> >>
>> >> What I do is:
>> >>
>> >> qsub -l nodes=node1:ppn=10+node2:ppn=10+node3:ppn=20 shell_name.sh
>> >>
>> >> This is tedious because I need to look manually how many CPUs are
>> >> available on which node.
>> >>
>> >> So is there a way to just tell the queuing system "I want to run on 40
>> >> CPUs, on whatever nodes"?
>> >>
>> >> Many thanks!
>> >>
>> >> Felix Werner
>> >>
>> >>
>> > Felix,
>> >
>> > If you use a scheduler like Moab you can simply use qsub -l nodes=40 and
>> > it will take care of where they are going to run. But if you are going
>> > to run jobs manually this is how it has to be done.
>> >
>>
>> one thing to note is that if you have fewer than 40 nodes you need to
>> trick TORQUE into thinking you have more nodes than it really has so
>> that it doesn't reject a request like -l nodes=40.  Moab treats a
>> nodes=X request without a ppn=Y component as a request stating "I just
>> need X processors".
>>
>> You can also do -l procs=40, which doesn't require configuring TORQUE
>> to think it has more nodes than it actually has. This is only
>> supported with Moab.
>>
>>
>
>    Many thanks guys!
>    I am not sure yet that it works perfectly though:
>
>    After executing:
> =====================
> [werner@cm64N TEST]$ qsub -l procs=3 shell_mc.sh
> 4237.cm64n.physics.umass.edu
> =====================
>    I get:
>
> ====================
> [werner@cm64N TEST]$ qstat -n1
>
> cm64n.physics.umass.edu:
>
>                                                                          Req'd  Req'd   Elap
> Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
> -------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
> 4229.cm64n.physi     werner   batch    shell_mc.sh       21133     9  --     --    -- R 02:51
> cm33/15+cm33/14+cm33/13+cm33/12+cm33/11+cm33/10+cm33/9+cm33/8+cm33/7+cm33/6+cm33/5+cm33/4+cm33/3+cm33/2+cm33/1+cm33/0+cm34/15+cm34/14+cm34/13+cm34/12+cm34/11+cm34/10+cm34/9+cm34/8+cm34/7+cm34/6+cm34/5+cm34/4+cm34/3+cm34/2+cm34/1+cm34/0+cm36/15+cm36/14+cm36/13+cm36/12+cm36/11+cm36/10+cm36/9+cm36/8+cm36/7+cm36/6+cm36/5+cm36/4+cm36/3+cm36/2+cm36/1+cm36/0+cm37/15+cm37/14+cm37/13+cm37/12+cm37/11+cm37/10+cm37/9+cm37/8+cm37/7+cm37/6+cm37/5+cm37/4+cm37/3+cm37/2+cm37/1+cm37/0+cm39/15+cm39/14+cm39/13+cm39/12+cm39/11+cm39/10+cm39/9+cm39/8+cm39/7+cm39/6+cm39/5+cm39/4+cm39/3+cm39/2+cm39/1+cm39/0+cm40/15+cm40/14+cm40/13+cm40/12+cm40/11+cm40/10+cm40/9+cm40/8+cm40/7+cm40/6+cm40/5+cm40/4+cm40/3+cm40/2+cm40/1+cm40/0+cm41/15+cm41/14+cm41/13+cm41/12+cm41/11+cm41/10+cm41/9+cm41/8+cm41/7+cm41/6+cm41/5+cm41/4+cm41/3+cm41/2+cm41/1+cm41/0+cm43/15+cm43/14+cm43/13+cm43/12+cm43/11+cm43/10+cm43/9+cm43/8+cm43/7+cm43/6+cm43/5+cm43/4+cm43/3+cm43/2+cm43/1+cm43/0+cm46/15+cm46/14+cm46/13+cm46/12+cm46/11+cm46/10+cm46/9+cm46/8+cm46/7+cm46/6+cm46/5+cm46/4+cm46/3+cm46/2+cm46/1+cm46/0
> 4237.cm64n.physi     werner   batch    shell_mc.sh       21516     1  --     --    -- R    --    cm47/0
> =======================
>
> So I guess this means that our sys admin indeed installed Moab.


To find out whether your system has Moab, try running one of the Moab
commands, such as "mdiag -n".
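
A minimal sketch of that check, assuming the Moab client tools would be
on your PATH on the submit host (they may be installed elsewhere on your
cluster, so a miss here does not prove Moab is absent). have_cmd is a
hypothetical helper name, not part of TORQUE or Moab:

```shell
# Hypothetical helper: succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Only reports whether the mdiag binary is reachable; it does not run it.
if have_cmd mdiag; then
    echo "mdiag found: Moab client tools appear to be installed"
else
    echo "mdiag not found on PATH"
fi
```

If it is found, "mdiag -n" should then show the nodes as Moab sees them
(on some sites the command may be restricted to admins).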

For your job 4237 I would expect to see something like
cm47/0+cm47/1+cm47/2, not the single cm47/0 shown above. I think you
may be using a scheduler that does not support -l procs, and whatever
scheduler it is ran your job on a single processor.
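
To compare what you asked for with what you got, you can count the
entries in the node list that qstat prints. This is only a sketch:
count_slots is a hypothetical helper, and it assumes the list is the
"+"-separated node/slot string shown by "qstat -n1", passed in as a
single argument:

```shell
# Hypothetical helper: count the '+'-separated node/slot entries
# in an exec host list such as "cm47/0+cm47/1+cm47/2".
count_slots() {
    printf '%s\n' "$1" | tr '+' '\n' | grep -c .
}

count_slots 'cm47/0'                  # what job 4237 actually got: prints 1
count_slots 'cm47/0+cm47/1+cm47/2'    # what procs=3 should look like: prints 3
```

If the count is lower than the procs value you requested, the scheduler
most likely ignored the procs request.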

