[torqueusers] Problem with ppn and routing : Possible way to get the routing you want

Coyle, James J [ITACD] jjc at iastate.edu
Thu Dec 2 08:22:26 MST 2010


J.A. Magallon,

   I have a suggestion for this case.

   Create a submit filter (or modify pbs_sched) so that
whenever nodes=N:ppn=P is used, the product C=N*P is
computed and ncpus=C is added to the resource request.

   Then issue
qmgr -c 'set queue batch resources_max.ncpus = 1'

   Now a request of

qsub -lnodes=1:ppn=2
would be changed to
qsub -lnodes=1:ppn=2,ncpus=2

which would be rejected by batch (because of ncpus).
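
   As a rough illustration of the rewrite described above, here is a
minimal Python sketch. The function names (total_cores, add_ncpus) are
made up for this example and are not part of TORQUE. It only transforms
the -l resource string; how it would be wired into a real submit filter
is an assumption left out here.

```python
def total_cores(nodes_spec: str) -> int:
    """Return the sum of N*P over every '+'-separated chunk of a nodes spec."""
    total = 0
    for chunk in nodes_spec.split("+"):
        parts = chunk.split(":")
        # A leading integer is the node count; anything else (for example
        # a host name) is treated as a single node.
        count = int(parts[0]) if parts[0].isdigit() else 1
        ppn = 1  # assumed default when ppn is not given
        for prop in parts[1:]:
            if prop.startswith("ppn="):
                ppn = int(prop[len("ppn="):])
        total += count * ppn
    return total


def add_ncpus(resource_list: str) -> str:
    """Append ncpus=C to a comma-separated resource list that has nodes=..."""
    items = [item for item in resource_list.split(",") if item]
    # Leave the request alone if the user already supplied ncpus.
    if any(item.startswith("ncpus=") for item in items):
        return resource_list
    for item in items:
        if item.startswith("nodes="):
            cores = total_cores(item[len("nodes="):])
            return f"{resource_list},ncpus={cores}"
    # No nodes= request, so there is nothing to compute.
    return resource_list


if __name__ == "__main__":
    print(add_ncpus("nodes=1:ppn=2"))
```

For the request above, add_ncpus("nodes=1:ppn=2") gives
"nodes=1:ppn=2,ncpus=2", which is the string the batch queue limit
would then be checked against.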

   I am running 2.3.6, and it appears that nodes=N:ppn=P
takes precedence over ncpus, so you will still get the sort
of node packing you want; ncpus here just serves to aid the
routing queue.

- Jim Coyle


 James Coyle, PhD
 High Performance Computing Group     
 115 Durham Center            
 Iowa State Univ.           phone: (515)-294-2099
 Ames, Iowa 50011           web: http://www.public.iastate.edu/~jjc



>-----Original Message-----
>From: torqueusers-bounces at supercluster.org [mailto:torqueusers-
>bounces at supercluster.org] On Behalf Of J.A. Magallón
>Sent: Tuesday, November 30, 2010 7:29 PM
>To: torqueusers at supercluster.org
>Subject: Re: [torqueusers] Problem with ppn and routing
>
>On Tue, 30 Nov 2010 09:44:08 -0700 (MST), David Beer
><dbeer at adaptivecomputing.com> wrote:
>
>>
>>
>> ----- Original Message -----
>> > -snip-
>> > > set queue fast resources_max.nodes = 2:ppn=2
>> > -snip-
>> > > set queue batch resources_max.nodes = 1:ppn=1
>> >
>> > My understanding is that torque can/will only do useful comparisons on
>> > numeric fields so the above settings are not meaningful. You might be
>> > OK with resources_max.nodect (though that might not be numeric either)
>> > but could only filter on the number of nodes not the number of
>> > processes requested (and you would need a default nodes=1 which I
>> > would prefer not to set so we can use procs as an option...). I don't
>> > think this solves your problem but might point you (or others) in the
>> > right direction.
>> >
>> > -- Gareth
>>
>> At some point (I believe 2.5) we added the ability to use
>> resources_max.nodes in queue limitations, but it only sorts based on
>> the number of nodes, not ppn. We couldn't sort based on ppn because
>> of the inherent ambiguities - which is larger, nodes=1:ppn=2 or
>> nodes=2:ppn=1 - so we only sort based on the first number there.
>> This means that a job requesting nodes=1:ppn=2 will be accepted by
>> the batch queue.
>>
>> Additionally, if you would like to have jobs that request
>> nodes=2:ppn=2 and need more walltime than allowed by the fast queue,
>> you will have to create a new queue or modify the limits for fast.
>>
>
>OK, thanks. My idea was that a job would fit into a queue if it passed
>all conditions, nodes and ppn and walltime ....
>
>What do you mean with sorting ? What do you sort ?
>You could go probing if a job fits (wrt to ppn and nodes) in a queue
>until you find a good one.
>
>My problem is that I don't depend on wall/cpu time, but I want to do
>something like:
>- If you ask for many cores per node, you can only get X nodes and
>  your time is limited to *:*:* (go to queue fast)
>- If you ask for single-core processes, you can get more nodes and
>  live longer (go to queue batch)
>
>How could I do that ? I use pbs_sched, no MAUI/MOAB...
>
>--
>J.A. Magallon <jamagallon()ono!com>     \               Software is
>like sex:
>                                         \         It's better when
>it's free
>_______________________________________________
>torqueusers mailing list
>torqueusers at supercluster.org
>http://www.supercluster.org/mailman/listinfo/torqueusers

