[torqueusers] TORQUE 4.0 Officially Announced

Ken Nielson knielson at adaptivecomputing.com
Mon Mar 19 09:26:40 MDT 2012


On Fri, Mar 16, 2012 at 8:26 PM, DuChene, StevenX A <
stevenx.a.duchene at intel.com> wrote:

> It is unclear from this announcement text where hwloc has to be
> installed.
>
> Is it just on the server or on the nodes only?
>
> I looked in the various README files and the Release_Notes file packaged
> with the sources and there is no mention of hwloc in those at all. There is
> only the one short mention in the CHANGELOG file that is even less than
> what is in the announcement below.
>
> More documentation about this would be greatly appreciated.
>
> --
>
> Steven DuChene
>

Steve,

Consider it done. It will be part of 4.0.1.

Ken

> From: torqueusers-bounces at supercluster.org [mailto:
> torqueusers-bounces at supercluster.org] On Behalf Of David Beer
> Sent: Tuesday, March 13, 2012 12:43 PM
> To: Torque Users Mailing List; Torque Developers mailing list
> Subject: [torqueusers] TORQUE 4.0 Officially Announced
>
> All,
>
> TORQUE 4.0 is officially here! Please check out Adaptive Computing's
> official announcement here:
> http://www.adaptivecomputing.com/adaptive-computing-offers-the-next-generation-of-high-performance-computing-with-moab-hpc-suite-7-0/
>
> The tarball can be downloaded from here:
> http://www.adaptivecomputing.com/resources/downloads/torque/torque-4.0.0.tar.gz
>
> We have several sites currently using 4.0 and feedback has been positive.
> These warnings are posted on the download site, but I am copying them here:
>
> 1. Make sure that you have openssl-devel (RedHat based) / libssl-dev
> (Debian based) installed (the name may differ for different operating
> systems) in order to be able to build TORQUE 4.0.
>
> 2. Make sure that you run the daemon trqauthd on machines that will be
> running client commands (this includes Moab). NOTE: there is an init.d
> script for it in contrib/init.d/ but it needs customization. One problem
> is that it has the wrong path for PBS_DAEMON - it should be
> /usr/local/sbin/trqauthd by default, not /usr/local/bin/trqauthd.
>
> 3. Moab needs to be started or restarted after installing TORQUE 4.0 (if
> you are using Moab).
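
The PBS_DAEMON correction from item 2 could be applied with a small script.
This is only a sketch: it assumes the init script sets the path on a line
beginning with PBS_DAEMON=, which the announcement does not spell out, and
the function name fix_pbs_daemon is made up for illustration.

```python
import re

# Paths taken from the announcement: the shipped default is wrong, sbin is right.
WRONG_PATH = "/usr/local/bin/trqauthd"
RIGHT_PATH = "/usr/local/sbin/trqauthd"


def fix_pbs_daemon(script_text):
    """Point PBS_DAEMON assignments at the sbin path; leave other lines alone.

    Assumption: the script assigns the daemon path on a line that starts
    with PBS_DAEMON= (optionally indented).
    """
    fixed_lines = []
    for line in script_text.splitlines(keepends=True):
        if re.match(r"\s*PBS_DAEMON=", line) and WRONG_PATH in line:
            line = line.replace(WRONG_PATH, RIGHT_PATH)
        fixed_lines.append(line)
    return "".join(fixed_lines)
```

Read the init script into a string, pass it through fix_pbs_daemon, and
review the result before installing it. A script that already uses the sbin
path comes back unchanged.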
>
> Please make sure to take all normal precautions for upgrading. Another
> advisory (not on the website) is that TORQUE now uses hwloc to manage
> cpusets, meaning you will need to install hwloc on your system if it isn't
> already there and you wish to use cpusets. It needs to be version 1.1 or
> higher.
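
A quick way to check an installed hwloc against the 1.1 minimum might look
like the sketch below. It assumes pkg-config is available and that the hwloc
development package ships a hwloc.pc file. The function names are made up
for illustration and are not part of TORQUE or hwloc.

```python
import re
import shutil
import subprocess

MIN_HWLOC = (1, 1)  # minimum version stated in the announcement


def parse_version(text):
    """Return the first dotted number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def hwloc_is_new_enough(version_text, minimum=MIN_HWLOC):
    """True if the version in version_text is at least `minimum`."""
    return parse_version(version_text) >= minimum


def installed_hwloc_version():
    """Ask pkg-config for the hwloc version; None if it cannot tell.

    Assumption: pkg-config is installed and can find hwloc.pc.
    """
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--modversion", "hwloc"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None
```

For instance, hwloc_is_new_enough("1.0.3") is False while
hwloc_is_new_enough("1.4.1") is True. installed_hwloc_version() returns None
when pkg-config is missing or does not know about hwloc, so handle that case
before comparing.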
>
> The major features of the release are briefly described in the release
> announcement, and the CHANGELOG for 4.0 is copied at the end of this email.
>
> This release has undergone more testing than any previous release of
> TORQUE; to be fair, it also has more changes than any previous version of
> TORQUE. Overall, we saw very good results in our beta program and most of
> the sites using it have had good experiences. We are proud of the quality
> of this release and hope that you'll try it out and let us know how it
> works for you.
>
> --
>
> David Beer | Software Engineer
> Adaptive Computing
>
> 4.0.0
>
>   e - make a threadpool for TORQUE server. The number of threads is
>       customizable using min_threads and max_threads, and idle time before
>       exiting can be set using thread_idle_seconds.
>   e - make pbs_server multi-threaded in order to increase responsiveness
>       and scalability.
>   e - remove the forking from pbs_server running a job, the thread
>       handling the request just waits until the job is run.
>   e - change qdel to simply send qdel all - previously this was executed
>       by a qstat and a qdel of every individual job
>   e - no longer fork to send mail, just use a thread
>   e - use hwloc as the backbone for cpuset support in TORQUE (contributed
>       by Dr. Bernd Kallies)
>   e - add the boolean variable $use_smt to mom config. If set to false,
>       this skips logical cores and uses only physical cores for the job.
>       It is true by default. (contributed by Dr. Bernd Kallies)
>   n - with the multi-threading the pbs_server -t create and -t cold
>       commands could no longer ask for user input from the command line.
>       The call to ask if the user wants to continue was moved higher in
>       the initialization process and some of the wording changed to
>       reflect what is now happening.
>   e - if cpusets are configured but aren't found and cannot be mounted,
>       pbs_mom will now fail to start instead of failing silently.
>   e - Change node_spec from an N^2 (but average 5N) algorithm to an N
>       algorithm with respect to nodes. We only loop over each node once
>       at a maximum.
>   e - Abandon pbs_iff in favor of trqauthd. trqauthd is a daemon to be
>       started once that can perform pbs_iff's functionality, increasing
>       speed and enabling future security enhancements
>   e - add mom_hierarchy functionality for reporting. The file is located
>       in <TORQUE_HOME>/server_priv/mom_hierarchy, and can be written to
>       tell moms to send updates to other moms who will pass them on to
>       pbs_server. See docs for details
>   e - add a unit testing framework (check). It is compiled with
>       --with-check and tests are executed using make check. The framework
>       is complete but not many tests have been written as of yet.
>   e - Mom rejection messages are now passed back to qrun when possible
>   e - Added the option -c for startup. By default, the server attempts to
>       send the mom hierarchy file to all moms on startup, and all moms
>       update the server and request the hierarchy file. If both are
>       trying to do this at once, it can cause a lot of traffic. -c tells
>       pbs_server to wait 10 minutes to attempt to contact moms that
>       haven't contacted it, reducing this traffic.
>   e - Added mom parameter -w to reduce start times. With this parameter
>       the mom waits to send its first update until the server sends it
>       the mom hierarchy file, or until 10 minutes have passed. This
>       should reduce large cluster startup times.
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>