[torqueusers] NUMA general use
dbeer at adaptivecomputing.com
Mon Apr 26 11:44:46 MDT 2010
We are currently developing a temporary branch (NUMA) to support NUMA (Non-Uniform Memory Access) systems. We intend to merge this branch back into the main tree and release it as part of 3.0, when 3.0 comes out.
In preparation for this, we want to make sure that it is of general use for people running on NUMA systems. The site that is currently running the branch doesn't run MPI jobs on their NUMA system, and they are of the opinion that running an MPI job on this kind of system defeats the purpose of the system. This makes sense to me and is consistent with the research I have done, but we want to ask how people are actually using these systems, so we can make sure NUMA support is a useful part of TORQUE.
Right now, because this site isn't interested in MPI jobs, a job that requests 5 nodeboards is launched from the first nodeboard and gets a cpuset that includes all of the processors and memory of the nodeboards where it is set to run. Are there TORQUE users out there who are using NUMA systems differently?
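To make the current behavior concrete, here is a rough sketch in Python. It is not TORQUE code: the Nodeboard class, the build_cpuset function, and the layout of 4 processors and one memory node per nodeboard are all hypothetical, chosen only to illustrate the idea of one cpuset spanning every requested nodeboard with the job started on the first one.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Nodeboard:
    """Hypothetical model of one nodeboard: its CPUs and its memory node."""
    index: int
    cpus: List[int]
    mem_node: int


def build_cpuset(boards: List[Nodeboard], requested: int) -> Dict[str, object]:
    """Illustrative only: one cpuset covering every requested nodeboard.

    The job is launched from the first chosen nodeboard, and the cpuset
    contains the CPUs and memory nodes of all chosen nodeboards.
    """
    if requested <= 0 or requested > len(boards):
        raise ValueError("requested nodeboard count is out of range")
    chosen = boards[:requested]
    return {
        "launch_board": chosen[0].index,
        "cpus": sorted(cpu for board in chosen for cpu in board.cpus),
        "mems": sorted(board.mem_node for board in chosen),
    }


# Assumed layout: 8 nodeboards, 4 CPUs each, one memory node per nodeboard.
boards = [Nodeboard(i, list(range(4 * i, 4 * i + 4)), i) for i in range(8)]
cpuset = build_cpuset(boards, 5)
print(cpuset["launch_board"], len(cpuset["cpus"]), cpuset["mems"])
```

With this assumed layout, a request for 5 nodeboards yields a single cpuset holding 20 processors and 5 memory nodes, and the job starts on nodeboard 0. An MPI-style layout would presumably want one task and one cpuset per nodeboard instead, which is the kind of difference we are asking about.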
If my questions don't make sense, please feel free to just describe how you are using your system in as much detail (or as little) as you like. We want to make sure this is a feature that is generally useful, not just for a specific case or two.
David Beer | Senior Software Engineer