[torqueusers] random node selection using maui?
wyckoff at yahoo-inc.com
Thu Feb 8 19:07:06 MST 2007
OK, one quick question. Say I have a 10,000-node cluster (so 250 racks)
and I want 1,000 machines, preferably 4 from each rack. Will Torque and
Maui perform OK with a scheduling constraint this big?
And if I tested with 25 machines but with 250 artificial properties to
constrain against, would that be a reasonably representative test at scale?
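To make the size of that constraint concrete, here is a minimal sketch in Python of the request it implies. It assumes each rack is exposed as a node property named rackid1 through rackid250 (the naming used in the quoted nodes file below, not something fixed by Torque) and that the job is submitted with the usual "count:property" node spec joined by "+". The function name rack_spread_nodespec is made up for illustration.

```python
def rack_spread_nodespec(num_racks, per_rack, prefix="rackid"):
    """Build a Torque-style nodes= value asking for per_rack nodes per rack.

    Assumes rack properties are named prefix1..prefixN (hypothetical naming).
    """
    # One "count:property" term per rack, joined with "+".
    parts = [f"{per_rack}:{prefix}{rack}" for rack in range(1, num_racks + 1)]
    return "nodes=" + "+".join(parts)


# 250 racks, 4 nodes from each, i.e. 1,000 nodes in total.
spec = rack_spread_nodespec(250, 4)
print(spec[:36] + "...")
print("terms in request:", spec.count("+") + 1)
```

The resulting string would be passed as something like qsub -l "$SPEC" job.sh. This only shows how large the request gets (250 terms); whether Maui handles a spec of that size gracefully is exactly the open question.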
Garrick Staples wrote:
> On Thu, Feb 08, 2007 at 02:29:13PM -0800, Peter Wyckoff alleged:
>> Hi Lennart,
>> the randomization goal is to have the computation on as many different
>> physical racks as possible. This is for data locality (i.e., latency)
>> and IO bandwidth scaling for a distributed file system.
>> The other way we are looking at is, just like you proposed, having a
>> node property called rackid-XXXX where XXXX = the subnet and then when
>> doing qsubs, requesting N/#racks of each rack type as a scheduling
>> constraint.
>> I was thinking that if there were a way to be completely random (pseudo
>> random :)), we could get the same effect.
> The 'ordered' scheduling in maui is largely based on the order in
> pbs_server's nodes file. You could just presort them there and then let
> maui work as normal.
> So the nodes file would basically be something like this (assuming 3 racks
> with 40 nodes per rack):
> node001 rackid1
> node041 rackid2
> node081 rackid3
> node002 rackid1
> node042 rackid2
> node082 rackid3
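The interleaved ordering quoted above does not have to be written by hand. A minimal sketch in Python, assuming nodes are numbered consecutively within each rack (node001 to node040 in rack 1, node041 to node080 in rack 2, and so on) and that the rack property is named rackidN as in the example; the function name interleaved_nodes is hypothetical, and a real cluster would substitute its own hostnames and properties.

```python
def interleaved_nodes(num_racks, nodes_per_rack):
    """Return nodes-file lines ordered round-robin across racks.

    Assumes consecutive node numbering within each rack and a
    rackidN property per rack (both are illustrative assumptions).
    """
    lines = []
    # Take the first node of every rack, then the second of every rack, etc.
    for slot in range(nodes_per_rack):
        for rack in range(num_racks):
            node_number = rack * nodes_per_rack + slot + 1
            lines.append(f"node{node_number:03d} rackid{rack + 1}")
    return lines


# 3 racks with 40 nodes per rack, as in the quoted example.
nodes_file_lines = interleaved_nodes(3, 40)
print("\n".join(nodes_file_lines[:6]))
```

For a 10,000-node cluster the zero-padding width would need to grow (for example :05d), and the output would be written to pbs_server's nodes file in place of the hand-sorted list.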