[Mauiusers] Re: [torqueusers] TORQUE 2.0.0p1 Release

Steve Traylen s.traylen at rl.ac.uk
Thu Nov 10 02:24:15 MST 2005


On Wed, Nov 09, 2005 at 09:25:12AM -0700 or thereabouts, jonathan ryskamp wrote:
> Greetings,
> 
>   The next patch release of TORQUE, TORQUE 2.0.0 patch 1, is now
> available.  While this comes right on the heels of a previous release,
> this latest distribution contains many significant improvements
> including the following:
> 
>   qstat modifications for massive job queue support (>50,000 jobs)
>   enhanced momctl control and diagnostics

Hi

 Do you have some details on 

>   multi-server support allowing moms to communicate with multiple
>     server daemons simultaneously

 which is presumably related to 

 add initial multi-server support for HA (CRI)

 in the CHANGELOG?

  Steve

>   faster job submission
>   fixes for resource availability, data staging, and job management
>   support for transient tmpdirs
>   improved usability and documentation
> 
>  Also, be sure to try the EXPERIMENTAL features and provide feedback.
> See pbs_server_attributes(7B) for "down_on_error", "job_nanny", and
> "mom_job_sync".  These are well-tested, production-ready features that
> simply require more conceptual vetting.  They are indicative of future
> directions of TORQUE development.  In essence, these features do the
> following:
> 
>   - mark compute nodes down when various system failures are detected
>   - address job deletion when compute nodes are non-responsive
>   - synchronize mom and server job state to remove stale jobs
> 
> For more detailed information, see the CHANGELOG at
> 
>  http://clusterresources.com/torquedocs/changelog.shtml 
> 
>   Work has already begun on the next release.  Currently, the
> following enhancements are under development:
> 
>  - improved high availability support
>  - job array support
>  - queue-based scalability enhancements
>  - qstat-based job completion reporting
>  - simplified installation for distributed systems
>  - data staging diagnostics
>  - queue hostlists for direct queue to node mapping
>  - import of user umask for TM* module (FNAL)
>  - the 'long-awaited' TORQUE documentation WIKI
> 
>   TORQUE is moving forward at an amazing pace in terms of both
> development and adoption.  Again, thanks go out to all the
> contributing sites.  Please continue to offer us your feedback.  Let us
> know how TORQUE can be made more scalable, more stable, more capable,
> and more user-friendly.
> 
> Regards,
> Jonathan

-- 
Steve Traylen
s.traylen at rl.ac.uk
http://www.gridpp.ac.uk/

