[torquedev] [torqueusers] TORQUE 3.0.0 Released
glen.beane at gmail.com
Wed Dec 8 10:45:00 MST 2010
On Wed, Dec 8, 2010 at 12:34 PM, Ken Nielson
<knielson at adaptivecomputing.com> wrote:
> On 12/08/2010 08:14 AM, Lloyd Brown wrote:
> On 12/6/10 4:38 PM, Ken Nielson wrote:
> This version of TORQUE can be built to work the same as other versions
> of TORQUE without the NUMA option. However, we would recommend the use
> of TORQUE 2.5.3 if you do not require NUMA capability.
> Ken, et al.,
> Can you be more specific on this recommendation? We're looking at
> upgrading from 2.4.x to 2.5.x during our downtime in early January, and
> due to the communication protocol change from 2.x to 3.x, we're
> wondering about upgrading all the way to 3.0.x, to make future rolling
> upgrades easier. We don't have a big Altix or anything, but just a few
> hundred x86_64 Linux servers. Is there a specific concern about 3.0.x
> on non-NUMA clusters? Are there outstanding, known issues?
> I would recommend going to 2.5.x if you do not need the NUMA support. As with
> any .0 release, there are several code changes that carry an inherent
> risk of problems. The 2.5.x code is more stable and better tested than
> the 3.0.x branch for non-NUMA functionality. While all of the 2.5.x
> capability is in 3.0.x, you will probably find more stability with 2.5.x.
> The best way to answer the rest of your questions is to address the TORQUE
> road map.
> The next release of 2.4-fixes will be 2.4.12. 2.4-fixes will continue to be
> the stable branch for TORQUE. By stable we mean that no new features will be
> added to 2.4-fixes, only bug fixes.
> 2.5-fixes is becoming more stable and we would recommend moving to the
> latest 2.5 release when you are ready. 2.5-fixes will continue to receive
> new features along with bug fixes. We do not call this the stable branch
> simply because we may add feature changes. However, the code base itself
> has proven to be pretty reliable.
> We will be adding GPU support starting with 2.5.4, which we hope to release
> this month along with Moab 6. Moab 6 also has GPU support that will work
> with TORQUE.
> In March we hope to release TORQUE 3.1.
> The major thrust of TORQUE 3.1 is scalability. Some of the things we will be
> doing to improve scalability are as follows:
> - Create a multi-threaded TORQUE
> - Use a hierarchical job launch (job radix)
> - Improve mom-to-mom and mom-to-server communications to reduce traffic
>   needed to keep the server and moms up to date on the state of the cluster
> Because of the chattiness of TORQUE in doing updates from the moms to the
> server, we may need to change how this is done. This may make 3.1
> INCOMPATIBLE with all previous versions of TORQUE. So moving to 3.0.0 to
> make upgrading to 3.1 easier may not make a difference.
Hey Ken, thanks for laying out some of the road map. My only comment
right now is that if TORQUE 3.1 is incompatible with 3.0 then
traditionally we would call it 4.0. Currently our standard is that
any incompatible change in protocol results in a major version bump.
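
As a rough illustration of that convention, here is a minimal sketch. The
function name `next_version` and the two-part (major, minor) tuple are made
up for this example and are not part of TORQUE or its build tooling.

```python
def next_version(current, protocol_compatible):
    """Return the next release number as a (major, minor) tuple.

    Hypothetical helper: an incompatible protocol change bumps the
    major version and resets the minor; anything else bumps the minor.
    """
    major, minor = current
    if protocol_compatible:
        return (major, minor + 1)
    return (major + 1, 0)


# Starting from 3.0: a compatible release would be 3.1,
# an incompatible one would be 4.0.
print(next_version((3, 0), protocol_compatible=True))
print(next_version((3, 0), protocol_compatible=False))
```

So if the release planned for March does break protocol compatibility with
3.0, this rule would number it 4.0 rather than 3.1.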