[torqueusers] altix cpu set support
georg.hager at rrze.uni-erlangen.de
Fri Jan 26 13:13:37 MST 2007
> Yes, our test is a collection of codes; the system we are evaluating
> has 16 usable cores. For example, if I run an 8-CPU job without
> placement (no dplace) vs. with dplace, I see no runtime change over
> the few-hour-long run. I will try and force the system to push processes

This will change once the machine is full of different users' jobs.
Imagine a parallel job ending up on all the even-numbered CPUs while
other jobs occupy the odd-numbered CPUs. Since two CPUs usually share
a common memory bus, performance becomes unpredictable, at least for
memory-bound problems (what else would you run on an Altix?). PBSPro
has the nice "shared cpuset" feature that tackles this problem by
assigning only full nodes. That way, a parallel job always has all
the CPUs of each of its locality domains to itself, and performance
stays predictable.
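
To make the "only full nodes" idea concrete, here is a rough sketch in
Python. It is not how PBSPro actually implements shared cpusets; the
function name allocate_full_nodes, the two-CPUs-per-node layout and the
"node n owns CPUs 2n and 2n+1" numbering are assumptions made purely
for illustration.

```python
def allocate_full_nodes(free_nodes, ncpus, cpus_per_node=2):
    """Return the physical CPU ids of whole nodes covering `ncpus` CPUs.

    free_nodes: iterable of node indices that are completely idle.
    Raises ValueError if there are not enough completely idle nodes.
    Hypothetical helper; assumes node n owns CPUs
    n*cpus_per_node .. n*cpus_per_node + cpus_per_node - 1.
    """
    nodes_needed = -(-ncpus // cpus_per_node)  # ceiling division
    chosen = sorted(free_nodes)[:nodes_needed]
    if len(chosen) < nodes_needed:
        raise ValueError("not enough completely free nodes")
    # Hand out every CPU of each chosen node, never half a node.
    return [n * cpus_per_node + i for n in chosen for i in range(cpus_per_node)]


# An 8-node, 16-CPU machine on which nodes 0 and 1 are already busy:
cpus = allocate_full_nodes(free_nodes=range(2, 8), ncpus=8)
print(cpus)  # [4, 5, 6, 7, 8, 9, 10, 11]
```

Note that a request for an odd number of CPUs is rounded up to the next
full node, so no other job can end up sharing a memory bus with it.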

Another point is that you effectively can't use dplace without
cpusets, because without a cpuset there are no logical CPU numbers.
Inside a cpuset that is exclusive to your job, dplace works fine
with logical CPU numbers starting from zero.
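
As a small illustration of what "logical CPU numbers starting from
zero" means, the following Python sketch maps logical numbers to the
physical CPUs of a cpuset. The helper logical_to_physical is made up
for this example, and it assumes that logical numbering simply follows
ascending physical CPU order inside the cpuset; it does not call
dplace or any cpuset interface.

```python
def logical_to_physical(cpuset_cpus):
    """Map logical CPU numbers (0, 1, ...) to the physical CPUs of a cpuset.

    Hypothetical helper; assumes logical numbers are assigned in
    ascending order of physical CPU id.
    """
    return dict(enumerate(sorted(cpuset_cpus)))


# A job whose exclusive cpuset consists of physical CPUs 8..15:
mapping = logical_to_physical([8, 9, 10, 11, 12, 13, 14, 15])
print(mapping[0], mapping[7])  # 8 15
```

Under that assumption, the job can always ask for logical CPUs 0-7 and
does not need to know which physical CPUs the batch system gave it.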
Dr. Georg Hager | Email:
Regionales Rechenzentrum | Georg.Hager at rrze.uni-erlangen.de
Erlangen, HPC Services | Tel.: (+49)9131/85-28973
Martensstr. 1, 1.020 | Fax: (+49)9131/302941
D-91058 Erlangen, Germany | http://www.rrze.uni-erlangen.de/hpc