[torquedev] [torqueusers] TORQUE 3.0.0 Released
A.Kaliazin at damtp.cam.ac.uk
Wed Dec 8 17:24:20 MST 2010
Hi Ken & Co
Now you've got me confused as well. As you know, we run the NUMA-enabled
Torque-2.6.0-snapshot branch, which I assumed became the Torque-3.0 release. Please
correct me if I am wrong here.
Or was 2.6.0 perhaps a one-customer branch all this time?
I have never seen anyone else refer to it on this list.
So is 3.0 a direct descendant of 2.6, so that I can upgrade to 3.0 right away,
or is 2.6 a precursor to 3.1 (or 4.0?), in which case I should wait until March?
COSMOS System Manager
University of Cambridge, UK
Ken Nielson wrote:
> On 12/08/2010 08:14 AM, Lloyd Brown wrote:
>> On 12/6/10 4:38 PM, Ken Nielson wrote:
>>> This version of TORQUE can be built to work the same as other versions
>>> of TORQUE without the NUMA option. However, we would recommend the use
>>> of TORQUE 2.5.3 if you do not require NUMA capability.
>> Ken, et al.,
>> Can you be more specific on this recommendation? We're looking at
>> upgrading from 2.4.x to 2.5.x during our downtime in early January, and
>> due to the communication protocol change from 2.x to 3.x, we're
>> wondering about upgrading all the way to 3.0.x, to make future rolling
>> upgrades easier. We don't have a big Altix or anything, but just a few
>> hundred x86_64 Linux servers. Is there a specific concern about 3.0.x
>> on non-NUMA clusters? Are there outstanding, known issues?
> I would recommend going to 2.5.x if you do not need the NUMA support.
> Like any .0 release, 3.0.0 contains several code changes that carry an
> inherent risk of problems. The 2.5.x code is more stable and better tested
> than the 3.0.x branch for non-NUMA functionality. While all of the
> 2.5.x capability is in 3.0.x, you will probably find more stability with 2.5.x.
> The best way to answer the rest of your questions is to address the
> TORQUE road map.
> The next release of 2.4-fixes will be 2.4.12. 2.4-fixes will continue to
> be the stable branch for TORQUE. By stable we mean that no new features
> will be added to 2.4-fixes, only bug fixes.
> 2.5-fixes is becoming more stable and we would recommend moving to the
> latest 2.5 release when you are ready. 2.5-fixes will continue to
> receive new features along with bug fixes. We do not call this the
> stable branch simply because we may add feature changes. However, the
> code base itself has proven to be pretty reliable.
> We will be adding GPU support starting with 2.5.4, which we hope to
> release this month along with Moab 6. Moab 6 also has GPU support that
> will work with TORQUE.
> In March we hope to release TORQUE 3.1.
> The major thrust of TORQUE 3.1 is scalability. Some of the things we
> will be doing to improve scalability are as follows:
> * Create a multi-threaded TORQUE
> * Use a hierarchical job launch (job radix)
> * Improve mom-to-mom and mom-to-server communications to reduce
> traffic needed to keep the server and moms up to date on the state
> of the cluster
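To illustrate the hierarchical launch idea in the list above: instead of one node contacting every other node in a job directly, each node contacts at most a fixed number of others (the radix), and those pass the launch on in turn. The following is only a minimal Python sketch of that general technique. The function names, the node names and the breadth-first assignment are assumptions made for illustration; the message above does not describe how TORQUE 3.1 will actually implement it.

```python
from collections import deque


def build_launch_tree(nodes, radix):
    """Give every node a parent so that no node has more than `radix` children.

    nodes[0] is treated as the root, i.e. the node that starts the launch.
    Node names are assumed to be unique.
    Returns a dict mapping each node to the list of nodes it contacts.
    """
    if radix < 1:
        raise ValueError("radix must be at least 1")
    children = {node: [] for node in nodes}
    if not nodes:
        return children
    waiting = deque([nodes[0]])  # nodes that can still accept children
    for node in nodes[1:]:
        parent = waiting[0]
        children[parent].append(node)
        waiting.append(node)
        if len(children[parent]) == radix:
            waiting.popleft()  # this parent is full, move on to the next one
    return children


def tree_depth(children, root):
    """Number of hops from the root to the farthest node."""
    depth = 0
    level = [root]
    while True:
        next_level = [c for n in level for c in children[n]]
        if not next_level:
            return depth
        depth += 1
        level = next_level


if __name__ == "__main__":
    # Hypothetical 13-node job: node00 starts the launch.
    job_nodes = [f"node{i:02d}" for i in range(13)]
    tree = build_launch_tree(job_nodes, radix=3)
    for parent, kids in tree.items():
        if kids:
            print(parent, "->", ", ".join(kids))
    print("hops to farthest node:", tree_depth(tree, job_nodes[0]))
```

With a radix of 3 the first node contacts three nodes instead of twelve, at the cost of one extra hop before the farthest nodes are reached. That is the trade-off a job radix is meant to tune.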
> Because TORQUE is chatty when sending updates from the moms to the
> server, we may need to change how this is done. This may make 3.1
> INCOMPATIBLE with all previous versions of TORQUE, so moving to 3.0.0 to
> make the upgrade to 3.1 easier may not make a difference.
> Trunk currently has changes for multi-threading if anyone wants to check
> it out. Any recommendations for improvements or reports of problems are
> welcomed and encouraged.
> To summarize, unless you need NUMA support we recommend that you
> continue to use or upgrade to version 2.5.x. Once version 3.1.0 is
> released, we will start encouraging all users to upgrade as it becomes
> more stable.
> Let me know if you have more questions.