[torquedev] 3.0-alpha branch added to TORQUE subversion tree

Ken Nielson knielson at adaptivecomputing.com
Mon Apr 26 11:20:45 MDT 2010


Christopher Samuel wrote:
>
> On 22/04/10 11:17, Ken Nielson wrote:
>
> > Currently the two main new features are multi-mom which
> > allows more than one copy of pbs_mom to run from the same
> > node and in the same cluster.
>
> Interesting, what's the idea behind this ?
>
> Looking at the NUMA branch the two appear to be related, is
> it so that you can partition the NUMA nodes on a large SMP
> system between the different MOMs ?
>
>
The original purpose of Multi-MOM was testing: it makes a cluster look
larger than the available hardware allows. With 10 machines I can still
run a 100 node cluster (or more). However, I believe other uses of
Multi-MOM will come to light as people start to use it. The NUMA branch
is one example. There we have partitioned the node boards of the SGI
4700 into individual MOMs, so a single machine with 38 node boards looks
like 38 nodes. Each MOM can allocate cpusets on its node board and lock
the memory of the node board as well. This is still a work in progress.
We are finding that different sites have different ways of using their
resources, so any input from users on this is more than welcome.
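
To make the two layouts concrete, here is a rough Python sketch of the
bookkeeping involved. It is only an illustration, not TORQUE code or
configuration syntax: the MomInstance type, the alias scheme, the base
port and the host names are all made up for the example. The one real
constraint it assumes is that MOMs sharing a machine need distinct
names and distinct ports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MomInstance:
    """One pbs_mom instance as the server would see it (illustrative only)."""
    host: str   # physical machine the MOM runs on
    alias: str  # hypothetical per-MOM node name
    port: int   # hypothetical per-MOM listening port


def plan_moms(hosts, moms_per_host, base_port=30000):
    """Give every MOM a unique alias and a port that is unique on its host."""
    plan = []
    for host in hosts:
        for i in range(moms_per_host):
            plan.append(MomInstance(host, f"{host}-mom{i}", base_port + i))
    return plan


# Testing case: 10 machines presented as a 100 node cluster.
test_cluster = plan_moms([f"host{n:02d}" for n in range(10)], moms_per_host=10)

# NUMA case: one machine, one MOM per node board (38 boards).
numa_machine = plan_moms(["sgi4700"], moms_per_host=38)

print(len(test_cluster), len(numa_machine))
```

In both cases the server just sees a flat list of nodes; the only
difference is how many MOMs share a physical host.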

Ken Nielson
Adaptive Computing


