[torquedev] nodes, procs, tpn and ncpus

"Mgr. Šimon Tóth" SimonT at mail.muni.cz
Thu Jun 10 12:55:58 MDT 2010


>>>>> I know I'm getting in on this conversation late, but here is my fantasy:
>>>>>
>>>>> nodes=X gives X number of cpus. Packed. Your job is CPU bound and you don't
>>>>> care how they are packed.
>>>>
>>>> blah.  that is overloading the meaning of nodes.  I like the new
>>>> procs=X instead. It basically means the same thing,  you get X
>>>> processors, moab seems to pack them on as few nodes as possible.
>>>> TORQUE doesn't do anything with procs yet...
>>>
>>> Nothing is overloaded. "nodes" has always translated to "vnodes" inside of
>>> torque. If you don't specify ppn, then you don't care about where your
>>> processors land. Perfectly logical. This case also covers the vast majority of
>>> jobs.
>>
>> I am with Glen: nodes=X is just an abbreviation for nodes=X:ppn=1 - it always
>> has been that way. That ppn=1 means "packed" is totally counterintuitive
>> - none of our users ever understood it this way. We were actually forced
>> to set EXACTNODE because that is the syntax users expect from specifying
>> processors-per-node. This is not about what we like, but about a sensible
>> user interface that is intuitive for users. Giving a user 5 processors on
>> the same node when specifying ppn=1 is not what users expect.
> 
> Other than "nodes=X is just an abbreviation for nodes=X:ppn=1", you just
> agreed with me. nodes=X:ppn=Y should not be packed.
> 
> 
> 
>  
>>>>> nodes=X:ppn=Y gives you X unique nodes with Y cpus per machine. Not-packed.
>>
>> This has not been that way: nodes=X:ppn=Y gave you any multiple of Y cpus
>> on a node, i.e., packed (and this includes the nodes=X (= nodes=X:ppn=1)
>> case).
> 
> You agree with me again! You and your users want it the way I said, e.g.
> EXACTNODE.
> 
> 
>  
>>>>> This lets you spread IO around because you know you need it.
>>>>
>>>>
>>>>
>>>> here is what I want
>>>>
>>>> procs=X gives you X processors, user doesn't care about layout (hack that
>>>> works with Moab, should be made to work properly with pbs_sched/qrun)
>>>> nodes=X:ppn=Y gives you exactly X unique nodes with Y processors per node
>>>> nodes=X - I'm not sure about this one, but to preserve historic behavior I
>>>> think TORQUE should give you X nodes with one processor on each node (Moab
>>>> can have an option to treat it like procs=X, which is the current behavior)
>>
>> Agreed. This is what we want as well.
> 
> So we are in agreement that "nodes=X:ppn=Y" should not be packed. Great.
> 
>  
>>> Getting torque to jibe procs with nodes is a lot more work.
>>>
>>> My plan is easy, simple, and I think covers everyone's use cases.
>>
>> It does not cover our use cases. Furthermore, having ppn not mean
>> processors-per-node results in a never ending support problem.
> 
> Here you lost me. You kept agreeing with me that "nodes=X:ppn=Y" should not be
> packed, but this doesn't cover your uses?
> 
>  
>>> Everyone has always wanted "gimme X cores, anywhere". The solution is to not
>>> use EXACTNODE and "nodes=X" does what you want. But EXACTNODE breaks the
>>> "nodes=X:ppn=y" case. If we just change maui/moab to not pack jobs with ppn,
>>> then we are done.
>>
>> That is not a solution. If we do not set EXACTNODE, then users who need
>> nodes=N:ppn=1 (in its very meaning, namely exactly one processor per
>> node) cannot be satisfied. And if we do set EXACTNODE, there is no way
>> (other than procs) to request N processors anywhere. This is the reason
>> why procs was introduced in the first place: so that we can set EXACTNODE
>> and satisfy both types of requests.
> 
> It is a *proposed* solution. It doesn't exist today. Code in maui/moab would have
> to be written.
> 
> EXACTNODE behaviour for "nodes=X:ppn=Y", but not for "nodes".
> 
> My proposal requires no changes in torque, very minor changes in maui/moab, and
> little user re-education because they already know the word "nodes".
> 
> The only place where we disagree is that you want to use "procs=X" where I want
> to use "nodes=X". I see 2 major downsides: lots of coding work in torque, and
> more confusing semantics for mixed requests (what does "-l nodes=X,procs=Y" mean?).

Semantics of stuff like -l nodes=X,procs=Y should be defined in the
documentation, but seriously when ANYONE writes -l nodes=4, he means
exactly what he wrote: "I want 4 nodes".
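
To make the two readings of "nodes=4" concrete, here is a toy sketch in
Python. It is only an illustration: the free-processor list and both
function names are made up, and none of this is TORQUE, Maui or Moab code.
place_exact is the "I want 4 nodes" reading, place_packed is the
"4 processors anywhere" reading.

```python
# Toy model only: free[i] is the number of idle processors on node i.
# Both functions are hypothetical illustrations, not scheduler code.

def place_exact(free, nodes, ppn=1):
    """Exactly `nodes` distinct nodes, `ppn` processors on each."""
    chosen = [i for i, f in enumerate(free) if f >= ppn][:nodes]
    if len(chosen) < nodes:
        return None  # request cannot be satisfied right now
    return {i: ppn for i in chosen}


def place_packed(free, procs):
    """`procs` processors anywhere, on as few nodes as possible."""
    placement = {}
    remaining = procs
    # Visit the nodes with the most idle processors first.
    for i in sorted(range(len(free)), key=lambda i: -free[i]):
        if remaining == 0:
            break
        take = min(free[i], remaining)
        if take:
            placement[i] = take
            remaining -= take
    return placement if remaining == 0 else None


free = [4, 1, 2, 8]
print(place_exact(free, 4))   # {0: 1, 1: 1, 2: 1, 3: 1}
print(place_packed(free, 4))  # {3: 4}
```

The same request lands on four machines in one reading and on a single
machine in the other, which is the whole disagreement.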

Btw. I still don't think that -l procs=X is a good idea. I would much
rather see something like #packed, or #can_pack supported in the nodespec.

Even better would be disjunctive nodespec support (but that becomes
NP-complete even for determining whether the nodespec can be satisfied).

"-l nodes=4:ppn=2:vmem=4G+2:ppn=4:vmem=16G|8:ppn=3:vmem=8G"

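Parsing such a spec would be the easy part. A rough sketch in Python,
assuming "|" separates alternatives, "+" separates node groups inside an
alternative, and ":" separates the properties of a group. The precedence of
"|" over "+" is an assumption of this sketch, and parse_nodespec is a
hypothetical helper, not an existing TORQUE function.

```python
def parse_nodespec(spec):
    """Return a list of alternatives; each alternative is a list of node groups."""
    # Accept the spec with or without the leading "nodes=".
    if spec.startswith("nodes="):
        spec = spec[len("nodes="):]
    alternatives = []
    for alt in spec.split("|"):          # assumed: "|" binds loosest
        groups = []
        for chunk in alt.split("+"):     # node groups within one alternative
            count, *props = chunk.split(":")
            group = {"nodes": int(count)}
            for prop in props:
                key, _, value = prop.partition("=")
                # Numeric values become ints; bare properties become True.
                group[key] = int(value) if value.isdigit() else (value or True)
            groups.append(group)
        alternatives.append(groups)
    return alternatives


spec = "nodes=4:ppn=2:vmem=4G+2:ppn=4:vmem=16G|8:ppn=3:vmem=8G"
for alt in parse_nodespec(spec):
    print(alt)
# [{'nodes': 4, 'ppn': 2, 'vmem': '4G'}, {'nodes': 2, 'ppn': 4, 'vmem': '16G'}]
# [{'nodes': 8, 'ppn': 3, 'vmem': '8G'}]
```

The hard part is deciding whether any of the alternatives fits the cluster,
which this sketch does not attempt.
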
-- 
Mgr. Šimon Tóth
