[torqueusers] Fwd: ncpus anyone?

kamil Marcinkowski kamil at ualberta.ca
Tue Mar 2 16:02:52 MST 2010


Hello

Currently we are lacking the ability to request appropriate 
resources for a shared-memory, multi-threaded program.

We would like to be able to request one process with n
(cores/threads/cpus), with any mem, pmem, vmem, or pvmem
specification applying to that one process.

ex) nodes=1:ppn=8:vmempn=16gb:single_process

1 node, 8 processors per node, 16gb of virtual memory per node, applied to a single process
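
For example, an 8-thread job script might then look like this (vmempn
and single_process are the hypothetical attributes proposed above; the
program name is illustrative):

    #!/bin/bash
    ## Proposed syntax: one node, 8 cores, 16gb of virtual memory,
    ## all applied to a single multi-threaded process.
    #PBS -l nodes=1:ppn=8:vmempn=16gb:single_process
    #PBS -l walltime=12:00:00
    cd $PBS_O_WORKDIR
    export OMP_NUM_THREADS=8
    ./my_threaded_program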

Thanks 

Kamil


Kamil Marcinkowski                   Westgrid System Administrator
kamil at ualberta.ca                    University of Alberta site
Tel. 780 492-0354                    Research Computing Support
Fax. 780 492-1729                    Academic ICT
Edmonton, Alberta, CANADA            University of Alberta


"This communication is intended for the use of the recipient to which it is
addressed, and may contain confidential, personal, and/or privileged
information.  Please contact us immediately if you are not the intended
recipient of this communication.  If you are not the intended recipient of
this communication, do not copy, distribute, or take action on it. Any
communication received in error, or subsequent reply, should be deleted or
destroyed."



On 2010-03-02, at 12:12 PM, David Beer wrote:

> The whole procs vs. ncpus thing is one of the ambiguities we're hoping to get rid of. At least to me, the names don't convey the distinction: I don't read procs and think it can be spread out, or read ncpus and think it has to be on one node.
> 
> We are also looking at moving toward some kind of node-agnostic form of specification, similar to chunks, where you would specify a desired number of cores and an amount of memory that are advantageously close together, without worrying about nodes. This seems like a more flexible way to specify things, and it would be more compatible with SMP and similar systems.
> 
> Another enhancement we're considering is locking memory close to the CPUs (where relevant). I know that on some systems (NUMA, for example) this is necessary in order to run jobs efficiently.
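> 
> A node-agnostic request might look something like this (purely
> illustrative; none of this syntax is settled):
> 
>     ## Hypothetical chunk-style request: 16 cores and 32gb of memory,
>     ## placed advantageously close together, with no mention of nodes.
>     qsub -l cores=16,mem=32gb job.sh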
> 
> David
> 
> 
> ----- "Dr. Stephan Raub" <raub at uni-duesseldorf.de> wrote:
> 
>> Hi,
>> 
>> our institution had been using PBSPro for a while. We dumped it and
>> are now using Torque/Maui for several reasons that don't belong in
>> this forum (no, it was not because of money), BUT: I liked the idea
>> of their "chunks". For example: as a quantum chemist I'm using
>> TurboMole a lot. For parallel runs with n compute processes it
>> requires an additional master process on exactly the same node as
>> the first compute process. This master process doesn't need a lot
>> of memory, though. So I used a statement like
>> select=1:ncpus=2:mem=15gb+15:ncpus=1:mem=15gb. Up to now I haven't
>> figured out an equivalent statement for Torque/Maui.
>> nodes=1:ppn=2+15:ppn=1 with pmem=8gb is not the same, as I am not
>> able to allocate ALL the memory of a node for the job.
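>> 
>> Spelled out, that request means: one chunk with 2 cpus and 15gb of
>> memory (the master process plus the first compute process on one
>> node), and 15 further chunks with 1 cpu and 15gb each. As a PBSPro
>> directive:
>> 
>>     #PBS -l select=1:ncpus=2:mem=15gb+15:ncpus=1:mem=15gb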
>> 
>> The same goes for jobs using heterogeneous MPI topologies (e.g.
>> Itanium and Xeon in the same MPI topology).
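>> 
>> With chunks that was expressible along these lines (illustrative
>> only; the resource names depend on the site's configuration):
>> 
>>     #PBS -l select=8:arch=linux:ncpus=1+8:arch=ia64:ncpus=1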
>> 
>> I don't want to say that PBSPro is better than Torque/Maui (as we
>> found out the opposite the hard way), but the purely theoretical
>> concept of these resource chunks was quite useful.
>> 
>> Stephan
>> 
>> --
>> ---------------------------------------------------------
>> | | Dr. rer. nat. Stephan Raub
>> | | Dipl. Chem.
>> | | Lehrstuhl für IT-Management / ZIM
>> | | Heinrich-Heine-Universität Düsseldorf
>> | | Universitätsstr. 1 / 25.41.O2.25-2
>> | | 40225 Düsseldorf / Germany
>> | |
>> | | Tel: +49-211-811-3911
>> ---------------------------------------------------------
>> 
>> Important Note: This e-mail may contain trade secrets or privileged,
>> undisclosed or otherwise confidential information. If you have
>> received this e-mail in error, you are hereby notified that any
>> review, copying or distribution of it is strictly prohibited. Please
>> inform us immediately and destroy the original transmittal. Thank
>> you for your cooperation.
>> 
>> From: torqueusers-bounces at supercluster.org
>> [mailto:torqueusers-bounces at supercluster.org] On Behalf Of kamil
>> Marcinkowski
>> Sent: Tuesday, March 2, 2010 19:26
>> To: Josh Bernstein
>> Cc: torqueusers
>> Subject: Re: [torqueusers] Fwd: ncpus anyone?
>> 
>> Hello Josh
>> 
>> You should use the (procs=32) specification for parallel jobs
>> that don't care where they run.
>> 
>> ncpus used to have two different and opposite meanings on
>> SMPs (nodes=1:ppn=32) and clusters (nodes=32:ppn=1).
>> 
>> I vote for defining -lncpus=32 to mean -lnodes=1:ppn=32.
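>> 
>> For example (the job script name is illustrative):
>> 
>>     qsub -l procs=32 job.sh          # 32 processors, anywhere
>>     qsub -l nodes=1:ppn=32 job.sh    # 32 processors on one node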
>> 
>> Cheers,
>> 
>> Kamil
>> 
>> Kamil Marcinkowski                   Westgrid System Administrator
>> kamil at ualberta.ca                    University of Alberta site
>> Tel. 780 492-0354                    Research Computing Support
>> Fax. 780 492-1729                    Academic ICT
>> Edmonton, Alberta, CANADA            University of Alberta
>> 
>> On 2010-03-02, at 10:58 AM, Josh Bernstein wrote:
>> 
>> I vote for maintaining ncpus. It's very helpful for embarrassingly
>> parallel jobs that just need 32 CPUs but don't care where they come
>> from.
>> 
>> -Josh
>> 
>> On Mar 2, 2010, at 9:53 AM, "David Beer"
>> <dbeer at adaptivecomputing.com> wrote:
>> 
>> Just to let everyone know, the qstat -a output has been changed to
>> read both the value stored in nodes and ncpus, using nodes when
>> both are specified.
>> 
>> > Changing the code so that qstat -a correctly displays the number
>> > of tasks with -lnodes=1:ppn=32 would be great. Then you could
>> > also make sure that -lncpus=32 is completely synonymous with
>> > -lnodes=1:ppn=32.
>> 
>> Is this the behavior that everyone expects/hopes for? If so, we can
>> look at working on it. At the same time, TORQUE 3.0 is likely to
>> include a much superior specification for how we request resources,
>> which may or may not end up including ncpus. We're looking to
>> remove a lot of ambiguity and enhance capability. By the way, we're
>> still open to input as to how all that will work, but maybe we'll
>> send out some ideas shortly if nobody has any input yet.
>> 
>> Cheers,
>> 
>> David
>> 
>> ----- "Michel Béland" <michel.beland at rqchp.qc.ca> wrote:
>> 
>> David Beer wrote:
>> 
>> > So, if I understand correctly, ncpus really only works for people
>> > that are running SMP or similar systems? It seems like we
>> > definitely need to update our documentation, as I feel it is
>> > misleading on the matter. Among other things, it seems that a
>> > clarification needs to be made that ncpus isn't compatible with
>> > the nodes attribute.
>> 
>> It is possible to specify both. In fact, at our site we have a qsub
>> wrapper script that makes sure, among other things, that everybody
>> specifies both on our Altix systems.
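>> 
>> A minimal sketch of what such a wrapper can do (ours is more
>> involved; the path and details here are illustrative):
>> 
>>     #!/bin/sh
>>     ## Illustrative qsub wrapper: if the command line requests
>>     ## ncpus without nodes:ppn (or vice versa), add the matching
>>     ## request so that both are always present.
>>     REAL_QSUB=/opt/torque/bin/qsub   # site-specific path
>>     ncpus=$(echo "$*" | sed -n 's/.*ncpus=\([0-9][0-9]*\).*/\1/p')
>>     ppn=$(echo "$*" | sed -n 's/.*ppn=\([0-9][0-9]*\).*/\1/p')
>>     if [ -n "$ncpus" ] && [ -z "$ppn" ]; then
>>         # ncpus only: add a nodes spec so cpusets come out right
>>         exec "$REAL_QSUB" -l nodes=1:ppn="$ncpus" "$@"
>>     elif [ -n "$ppn" ] && [ -z "$ncpus" ]; then
>>         # nodes:ppn only: add ncpus so qstat -a shows the core count
>>         exec "$REAL_QSUB" -l ncpus="$ppn" "$@"
>>     fi
>>     exec "$REAL_QSUB" "$@"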
>> 
>> > On a related note, in the qstat -a output we have the TSK field,
>> > which I believe is meant to mean "task" (I couldn't find anything
>> > about it in the man page; the variable in the code is named
>> > tasks). I noticed that in the implementation we're just writing
>> > whatever value is stored in ncpus for this field. It seems like
>> > this could be made more accurate by checking the nodes attribute
>> > as well and using that value where it is defined, since it seems
>> > to override ncpus when both are present. What are your thoughts
>> > on this?
>> 
>> I agree. This is exactly why we make sure that all the jobs have
>> both resource requests. If one specifies -lnodes=1:ppn=32, the
>> output of qstat -a does not show how many cores you really use. On
>> the other hand, if one specifies -lncpus=32, Torque does not create
>> cpusets correctly (they always contain only processor 0). So if I
>> specify -lncpus=32 -lnodes=1:ppn=32, cpusets are created correctly
>> and qstat -a shows correctly how many cores the job is using. Maui
>> does not have any problem dealing with this job.
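>> 
>> As a single command line (the script name is illustrative):
>> 
>>     qsub -l ncpus=32 -l nodes=1:ppn=32 job.sh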
>> 
>> 
>> --
>> Michel Béland, scientific computing analyst
>> michel.beland at rqchp.qc.ca
>> office S-250, pavillon Roger-Gaudry (main building), Université de
>> Montréal
>> telephone: 514 343-6111 ext. 3892   fax: 514 343-2155
>> RQCHP (Réseau québécois de calcul de haute performance)
>> www.rqchp.qc.ca
>> 
>> --
>> David Beer | Senior Software Engineer
>> Adaptive Computing
>> 
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers
> 
> -- 
> David Beer | Senior Software Engineer
> Adaptive Computing
> 
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
> 


