[Mauiusers] TORQUE 2.0.0p1 Release

jonathan ryskamp jryskamp at clusterresources.com
Wed Nov 9 09:25:12 MST 2005


Greetings,

  The next patch release of TORQUE, TORQUE 2.0.0 patch 1, is now
available.  While this comes right on the heels of a previous release,
this latest distribution contains many significant improvements,
including the following:

  qstat modifications for massive job queue support (>50,000 jobs)
  enhanced momctl control and diagnostics
  multi-server support allowing moms to communicate with multiple
    server daemons simultaneously
  faster job submission
  fixes for resource availability, data staging, and job management
  support for transient tmpdirs
  improved usability and documentation

 Also, be sure to try the EXPERIMENTAL features and provide feedback.
See pbs_server_attributes(7B) for "down_on_error", "job_nanny", and
"mom_job_sync".  These are well-tested, production-ready features that
simply require more conceptual vetting.  They are indicative of future
directions of TORQUE development.  In essence, these features do the
following:

  - mark compute nodes down when various system failures are detected
  - address job deletion when compute nodes are non-responsive
  - synchronize mom and server job state to remove stale jobs

For more detailed information, see the CHANGELOG at

 http://clusterresources.com/torquedocs/changelog.shtml 

  Work has already begun on the next release.  Currently, the
following enhancements are under development:

 - improved high availability support
 - job array support
 - queue-based scalability enhancements
 - qstat-based job completion reporting
 - simplified installation for distributed systems
 - data staging diagnostics
 - queue hostlists for direct queue to node mapping
 - import of user umask for TM* module (FNAL)
 - the 'long-awaited' TORQUE documentation WIKI

  TORQUE is moving forward at an amazing pace in terms of both
development and adoption.  Again, thanks go out to all the
contributing sites.  Please continue to offer us your feedback.  Let
us know how TORQUE can be made more scalable, more stable, more
capable, and more user-friendly.

Regards,
Jonathan




