[torqueusers] Fwd: ncpus anyone?

Josh Bernstein jbernstein at penguincomputing.com
Tue Mar 2 10:58:41 MST 2010


I vote for maintaining ncpus. It's very helpful for embarrassingly
parallel jobs that just need 32 CPUs but don't care where they come
from.

-Josh

On Mar 2, 2010, at 9:53 AM, "David Beer" <dbeer at adaptivecomputing.com>  
wrote:

> Just to let everyone know, the qstat -a output has been changed to  
> read both the value stored in nodes and ncpus, using nodes when both  
> are specified.
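
To make the rule above concrete, here is a minimal sketch of the selection
logic in Python. It is not the actual qstat code. The function names are
hypothetical, and it assumes that the task count for a nodes request is the
node count times ppn, summed over '+'-separated parts, with ppn defaulting
to 1.

```python
def tasks_from_nodes(spec):
    """Count tasks in a nodes spec such as '2:ppn=4+1:ppn=8' (assumed rule)."""
    total = 0
    for chunk in spec.split("+"):
        fields = chunk.split(":")
        # First field: a node count, or a hostname (counted as one node).
        count = int(fields[0]) if fields[0].isdigit() else 1
        ppn = 1  # assumption: one processor per node when ppn is absent
        for field in fields[1:]:
            if field.startswith("ppn="):
                ppn = int(field[len("ppn="):])
        total += count * ppn
    return total


def tsk_value(resources):
    """Pick the value to display: nodes wins when both are present."""
    if "nodes" in resources:
        return tasks_from_nodes(resources["nodes"])
    if "ncpus" in resources:
        return int(resources["ncpus"])
    return None  # neither resource was requested
```

With this sketch, a job submitted with -lnodes=1:ppn=32 and one submitted
with -lncpus=32 would both be counted as 32 tasks.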
>
>> Changing the code so that qstat -a correctly displays the number of
>> tasks with -lnodes=1:ppn=32 would be great. Then, you could also make
>> sure that -lncpus=32 is a complete synonym for -lnodes=1:ppn=32.
>
> Is this the behavior that everyone expects/hopes for? If so, we can
> look at working on it. At the same time, TORQUE 3.0 is likely to
> include a much better way of specifying resource requests, which may
> or may not end up including ncpus. We're looking to remove a lot of
> ambiguity and enhance capability. By the way, we're still open to
> input on how all that will work, but we may send out some ideas
> shortly if nobody has any input yet.
>
> Cheers,
>
> David
>
> ----- "Michel Béland" <michel.beland at rqchp.qc.ca> wrote:
>
>> David Beer wrote:
>>
>>> So, if I understand correctly, ncpus really only works for people
>>> that are running SMP or similar systems? It seems like we definitely
>>> need to update our documentation as I feel it is misleading on the
>>> matter. Among other things, it seems that a clarification needs to be
>>> made that ncpus isn't compatible with the nodes attribute.
>>
>> It is possible to specify both. In fact, at our site we have a qsub
>> wrapper script that makes sure, among other things, that everybody
>> specifies both on our Altix systems.
>>
>>> On a related note, in the qstat -a output we have the TSK field,
>>> which I believe is meant to mean task (I couldn't find anything about
>>> it in the man page, the variable in the code is named tasks). I
>>> noticed that in the implementation we're just writing whatever value
>>> is stored in ncpus for this field. It seems like this could be made
>>> more accurate by checking the nodes attribute as well and using that
>>> value where it is defined, since it seems to override ncpus when both
>>> are present. What are your thoughts on this?
>>
>> I agree. This is exactly why we make sure that all the jobs have both
>> resource requests. If one specifies -lnodes=1:ppn=32, the output of
>> qstat -a does not show how many cores you really use. On the other
>> hand, if one specifies -lncpus=32, Torque does not create cpusets
>> correctly (they always contain only processor 0). So if I specify
>> -lncpus=32 -lnodes=1:ppn=32, cpusets are created correctly and
>> qstat -a correctly shows how many cores the job is using. Maui does
>> not have any problem dealing with this job.
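
A qsub wrapper of the kind described above could be sketched roughly as
follows. This is not the RQCHP script, which is not shown in this thread.
The function names are hypothetical, and the sketch only handles the
single-node form nodes=1:ppn=N, which is an assumption about what makes
sense on an SMP machine. Any other nodes request is left untouched.

```python
import re


def parse_resources(args):
    """Collect name=value pairs from -l options ('-l x=1' or '-lx=1')."""
    resources = {}
    i = 0
    while i < len(args):
        arg = args[i]
        value = None
        if arg == "-l" and i + 1 < len(args):
            value = args[i + 1]
            i += 1  # skip the option's value
        elif arg.startswith("-l") and len(arg) > 2:
            value = arg[2:]
        if value is not None:
            for item in value.split(","):
                # Split on the first '=' only, so 'nodes=1:ppn=32' stays whole.
                name, _, val = item.partition("=")
                resources[name] = val
        i += 1
    return resources


def ensure_both(args):
    """Return qsub arguments with the missing one of ncpus / nodes added."""
    res = parse_resources(args)
    if "ncpus" in res and "nodes" not in res:
        return ["-l", "nodes=1:ppn=" + res["ncpus"]] + list(args)
    if "nodes" in res and "ncpus" not in res:
        match = re.fullmatch(r"1:ppn=(\d+)", res["nodes"])
        if match:
            return ["-l", "ncpus=" + match.group(1)] + list(args)
    return list(args)
```

The extra option is placed in front of the original arguments so that it
stays ahead of the job script name.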
>>
>>
>>
>>
>> -- 
>> Michel Béland, analyste en calcul scientifique
>> michel.beland at rqchp.qc.ca
>> bureau S-250, pavillon Roger-Gaudry (principal), Université de
>> Montréal
>> téléphone : 514 343-6111 poste 3892     télécopieur : 514 343-2155
>> RQCHP (Réseau québécois de calcul de haute performance)
>> www.rqchp.qc.ca
>
> -- 
> David Beer | Senior Software Engineer
> Adaptive Computing
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers

