[torqueusers] Fwd: ncpus anyone?

Dr. Stephan Raub raub at uni-duesseldorf.de
Tue Mar 2 11:52:05 MST 2010


Hi,

Our institution used PBSPro for some time. We dropped it and now use
Torque/Maui, for several reasons that don’t belong in this forum (no, it was
not because of money). BUT: I liked the idea of PBSPro’s “chunks”.
For example: as a quantum chemist I use TurboMole a lot. For parallel
runs with n compute processes it requires an additional master process, which
must run on exactly the same node as the first compute process. However, this
master process doesn’t need much memory. So I used a statement like
select=1:ncpus=2:mem=15gb+15:ncpus=1:mem=15gb. So far I haven’t figured
out an equivalent statement for Torque/Maui. nodes=1:ppn=2+15:ppn=1 with
pmem=8gb is not the same, because it does not let me allocate ALL the memory
of a node for the job.
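
To make the difference concrete, here is a small Python sketch (not part of the original mail) that totals the cores and memory implied by each of the two requests. The helper names parse_select and parse_nodes are made up for illustration and only handle the simple forms quoted above; they are not part of PBSPro or Torque. The sketch assumes that pmem is a per-process limit, so a chunk with ppn processes gets ppn times pmem.

```python
def _split_chunk(part):
    """Split '15:ncpus=1:mem=15gb' into (15, {'ncpus': '1', 'mem': '15gb'})."""
    fields = part.split(":")
    count = int(fields[0])
    resources = dict(f.split("=", 1) for f in fields[1:])
    return count, resources


def parse_select(spec):
    """Expand a PBSPro-style select string into one dict per chunk.

    Hypothetical helper: only understands ncpus=N and mem=<int>gb.
    """
    chunks = []
    for part in spec.split("+"):
        count, res = _split_chunk(part)
        mem = res["mem"].lower()
        if not mem.endswith("gb"):
            raise ValueError("this sketch only understands mem in gb")
        chunk = {"ncpus": int(res.get("ncpus", 1)), "mem_gb": int(mem[:-2])}
        chunks.extend(dict(chunk) for _ in range(count))
    return chunks


def parse_nodes(spec, pmem_gb):
    """Expand a Torque-style nodes string; memory is assumed to be pmem per process."""
    chunks = []
    for part in spec.split("+"):
        count, res = _split_chunk(part)
        ppn = int(res.get("ppn", 1))
        chunk = {"ncpus": ppn, "mem_gb": ppn * pmem_gb}
        chunks.extend(dict(chunk) for _ in range(count))
    return chunks


pbspro = parse_select("1:ncpus=2:mem=15gb+15:ncpus=1:mem=15gb")
torque = parse_nodes("1:ppn=2+15:ppn=1", pmem_gb=8)

for name, chunks in (("PBSPro select", pbspro), ("Torque nodes+pmem", torque)):
    cpus = sum(c["ncpus"] for c in chunks)
    mem = sum(c["mem_gb"] for c in chunks)
    sizes = sorted({c["mem_gb"] for c in chunks})
    print(f"{name}: {len(chunks)} chunks, {cpus} cpus, {mem} GB, per-chunk GB {sizes}")
```

Under these assumptions both requests ask for 17 cores in 16 chunks, but the select form asks for 15 GB in every chunk (240 GB in total), while the nodes form with pmem=8gb ends up with 16 GB for the first chunk and only 8 GB for each of the others (136 GB in total).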

 

The same applies to jobs using heterogeneous MPI topologies (e.g. Itanium and
Xeon in the same MPI topology).

I don’t want to say that PBSPro is better than Torque/Maui (we found out
the opposite the hard way), but the purely theoretical concept of these
resource chunks was quite useful.

 

Stephan

--
---------------------------------------------------------
| | Dr. rer. nat. Stephan Raub
| | Dipl. Chem.
| | Lehrstuhl für IT-Management / ZIM
| | Heinrich-Heine-Universität Düsseldorf
| | Universitätsstr. 1 / 25.41.O2.25-2
| | 40225 Düsseldorf / Germany
| |
| | Tel: +49-211-811-3911
---------------------------------------------------------

 


 

From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Kamil
Marcinkowski
Sent: Tuesday, 2 March 2010 19:26
To: Josh Bernstein
Cc: torqueusers
Subject: Re: [torqueusers] Fwd: ncpus anyone?

 

Hello Josh,

You should use the procs=32 specification for parallel jobs that don't care
where they run.

ncpus used to have two different, opposite meanings on SMPs
(nodes=1:ppn=32) and on clusters (nodes=32:ppn=1).

I vote for defining -lncpus=32 as equivalent to -lnodes=1:ppn=32.

Cheers,

Kamil

 

 

Kamil Marcinkowski                   Westgrid System Administrator 

kamil at ualberta.ca                     University of Alberta site


 Tel.780 492-0354                     Research Computing Support   

Fax.780 492-1729                     Academic ICT  

Edmonton, Alberta, CANADA    University of Alberta           

 

 


 





 

On 2010-03-02, at 10:58 AM, Josh Bernstein wrote:





I vote for maintaining ncpus. It's very helpful for embarrassingly
parallel jobs that just need 32 CPUs but don't care where they come
from.

-Josh

On Mar 2, 2010, at 9:53 AM, "David Beer" <dbeer at adaptivecomputing.com> wrote:




Just to let everyone know, the qstat -a output has been changed to read
both the values stored in nodes and ncpus, using nodes when both are
specified.
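
The rule described here can be sketched in a few lines of Python. This is not the actual qstat code (which is C and is not shown in this thread); the function names tasks_from_nodes and tsk_value are hypothetical, and counting a nodes request as the sum of count times ppn over its parts is an assumption about what the TSK column is meant to show.

```python
def tasks_from_nodes(nodes_spec):
    """Count tasks in a nodes string such as '1:ppn=32' or '1:ppn=2+15:ppn=1'.

    Assumption: each '+'-separated part contributes count * ppn tasks,
    with ppn defaulting to 1 and a hostname counting as one node.
    """
    total = 0
    for part in nodes_spec.split("+"):
        fields = part.split(":")
        count = int(fields[0]) if fields[0].isdigit() else 1
        ppn = 1
        for field in fields[1:]:
            if field.startswith("ppn="):
                ppn = int(field[len("ppn="):])
        total += count * ppn
    return total


def tsk_value(nodes_spec=None, ncpus=None):
    """Prefer the nodes request when present, otherwise fall back to ncpus."""
    if nodes_spec:
        return tasks_from_nodes(nodes_spec)
    return ncpus


print(tsk_value(nodes_spec="1:ppn=32"))            # nodes only
print(tsk_value(ncpus=32))                         # ncpus only
print(tsk_value(nodes_spec="32:ppn=1", ncpus=1))   # both: nodes wins
```

With this rule the SMP form (nodes=1:ppn=32), the cluster form (nodes=32:ppn=1) and a plain ncpus=32 request would all be reported as 32 tasks.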

 

Changing the code so that qstat -a correctly displays the number of
tasks with -lnodes=1:ppn=32 would be great. Then, you could also make
sure that -lncpus=32 is a complete synonym for -lnodes=1:ppn=32.

 

Is this the behavior that everyone expects/hopes for? If so, we can
look at working on it. At the same time, TORQUE 3.0 is likely to
include a much better specification for how we request resources,
which may or may not end up including ncpus. We're looking to remove
a lot of ambiguity and enhance capability. By the way, we're still
open to input as to how all that will work, but maybe we'll send out
some ideas shortly if nobody has any input yet.

 

Cheers,

 

David

 

----- "Michel Béland" <michel.beland at rqchp.qc.ca> wrote:

 

David Beer wrote:

 

So, if I understand correctly, ncpus really only works for people
that are running SMP or similar systems? It seems like we definitely
need to update our documentation as I feel it is misleading on the
matter. Among other things, it seems that a clarification needs to be
made that ncpus isn't compatible with the nodes attribute.

 

It is possible to specify both. In fact, at our site we have a qsub
wrapper script that makes sure, among other things, that everybody
specifies both on our Altix systems.

 

On a related note, in the qstat -a output we have the TSK field,
which I believe is meant to mean task (I couldn't find anything about
it in the man page, the variable in the code is named tasks). I
noticed that in the implementation we're just writing whatever value
is stored in ncpus for this field. It seems like this could be made
more accurate by checking the nodes attribute as well and using that
value where it is defined, since it seems to override ncpus when both
are present. What are your thoughts on this?

 

I agree. This is exactly why we make sure that all the jobs have both
resource requests. If one specifies -lnodes=1:ppn=32, the output of
qstat -a does not show how many cores you really use. On the other
hand, if one specifies -lncpus=32, Torque does not create cpusets
correctly (they always contain only processor 0). So if I specify
-lncpus=32 -lnodes=1:ppn=32, cpusets are created correctly and
qstat -a correctly shows how many cores the job is using. Maui does
not have any problem dealing with this job.
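
The site's actual qsub wrapper is not shown in this thread. As a rough illustration of the idea only, the following Python sketch fills in whichever of the two requests is missing. The names complete_resources and to_qsub_args are hypothetical, and the mapping of ncpus=N to nodes=1:ppn=N assumes a single-SMP system such as the Altix machines mentioned above.

```python
def complete_resources(resources):
    """Return a copy of a resource request with both 'nodes' and 'ncpus' set.

    Assumptions: resources is a dict such as {'ncpus': '32'}, and on a
    single-SMP system ncpus=N is taken to mean nodes=1:ppn=N.
    """
    out = dict(resources)
    if "nodes" in out and "ncpus" not in out:
        total = 0
        for part in out["nodes"].split("+"):
            fields = part.split(":")
            count = int(fields[0]) if fields[0].isdigit() else 1
            ppn = next(
                (int(f[len("ppn="):]) for f in fields[1:] if f.startswith("ppn=")),
                1,
            )
            total += count * ppn
        out["ncpus"] = str(total)
    elif "ncpus" in out and "nodes" not in out:
        out["nodes"] = f"1:ppn={out['ncpus']}"
    return out


def to_qsub_args(resources):
    """Turn the resource dict into a flat list of '-l key=value' arguments."""
    args = []
    for key, value in sorted(resources.items()):
        args.extend(["-l", f"{key}={value}"])
    return args


print(to_qsub_args(complete_resources({"ncpus": "32"})))
print(to_qsub_args(complete_resources({"nodes": "1:ppn=32"})))
```

Both calls yield the same pair of requests, ncpus=32 and nodes=1:ppn=32, which matches the combination described above as producing correct cpusets and a correct qstat -a display.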

 

 

 

 

-- 

Michel Béland, scientific computing analyst
michel.beland at rqchp.qc.ca
Office S-250, Pavillon Roger-Gaudry (main), Université de Montréal
Telephone: 514 343-6111 ext. 3892     Fax: 514 343-2155
RQCHP (Réseau québécois de calcul de haute performance)
www.rqchp.qc.ca

 

-- 

David Beer | Senior Software Engineer

Adaptive Computing

 

 


 

_______________________________________________

torqueusers mailing list

torqueusers at supercluster.org

http://www.supercluster.org/mailman/listinfo/torqueusers


 
