[torquedev] 3.0-alpha branch added to TORQUE subversion tree

David Singleton David.Singleton at anu.edu.au
Mon Apr 26 18:12:46 MDT 2010


On 04/27/2010 09:07 AM, David Beer wrote:
> ----- Original Message -----
>> On 04/27/2010 03:28 AM, Ken Nielson wrote:
>>> The last part of my last response was not as clear as I wanted.
>>>
>>> We definitely want to get user response about the Multi-MOM, but what
>>> I would really like input on is how people are using their NUMA
>>> systems. How do they lock down nodes, memory, etc.?
>>>
>>> Ken Nielson
>>>
>>
>> Isn't a MOM per node board or any other subset of an SMP a restriction
>> on shared memory job sizes? Why does it help in the NUMA case to have
>> a MOM per node board? Do these MOMs segregate NUMA node memory as
>> well?
>>
>
> No, our implementation does not restrict the job size of a shared-memory job. This is only restricted by the amount of memory in the system.
>

So I'm confused.  If I run a 32-CPU shared memory job, do multiple MOMs
get the job?  Hopefully just one?  And does that one allocate a cpuset
"underneath" other MOMs on other node boards?  If MOMs can overlap in
the CPUs they "manage", what is the value of multiple MOMs?

David

