[torquedev] New job_must_report pbs_server feature

Josh Butikofer josh at clusterresources.com
Wed May 6 10:48:22 MDT 2009


Chris Samuel wrote:
> ----- "Al Taufer" <ataufer at clusterresources.com> wrote:
> 
>> We would like to add a new feature into the 2.3 and 2.4 Torque
>> branches.
> 
> Umm, do new features need to go into 2.3.x ?

Yes, we've decided that some features, if they do not change default behavior 
and are only activated via explicit configuration, can be added into stable 
branches (for the time being, this means 2.3.x). The reason is that some 
users/customers need a stable branch but also need enhancements without 
making the leap to 2.4 or waiting for its release.

We are trying to get 2.4 ready for release, and once that is done we will try 
to do a better job of releasing minor revisions more regularly, allowing us to 
add new features only to the next 2.x release.

Over the past year or two, it seems to me that minor revisions have been 
treated like major revisions, which is something we want to move away from.

>> It would allow the server to be configured so that jobs
>> must report to the scheduler and be confirmed by the
>> scheduler before they are cleaned out. 
> 
> Maybe it's just because it's late here and I've had
> a long day, but I'm not quite sure what this addresses
> or what the effect is ?   Could you illustrate it with
> an example or two please ?

There are use cases where the scheduler needs to know about the completion 
status of a job. In some system-critical environments Moab needs to perform some 
action based on the completion status. If Moab crashes or is shut down, however, 
and keep_completed is set to 0 (or a small number), the job could be purged 
from TORQUE's memory before the scheduler can determine how the job ran, so 
the scheduler can never act on that completion status. There are actually 
several customers/users who require this kind of tight monitoring of jobs from 
their scheduler.
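
To illustrate the idea, here is a rough sketch of the purge decision in Python. 
This is not the pbs_server code, only a model of the behavior described above. 
The names CompletedJob, confirmed_by_scheduler and can_purge are made up for 
the example; only keep_completed and job_must_report come from this thread.

```python
from dataclasses import dataclass


@dataclass
class CompletedJob:
    """Hypothetical record of a finished job held in server memory."""
    job_id: str
    exit_status: int
    seconds_since_completion: int
    # Assumed flag: set once the scheduler has seen the completion status.
    confirmed_by_scheduler: bool = False


def can_purge(job: CompletedJob, keep_completed: int, job_must_report: bool) -> bool:
    """Return True if the server may drop the job from memory."""
    # The job is always kept for at least keep_completed seconds.
    if job.seconds_since_completion < keep_completed:
        return False
    # With job_must_report enabled, also wait for the scheduler's confirmation.
    if job_must_report and not job.confirmed_by_scheduler:
        return False
    return True


# Scheduler was down while the job finished; keep_completed is 0.
job = CompletedJob("123.server", exit_status=1, seconds_since_completion=30)

purged_without_feature = can_purge(job, keep_completed=0, job_must_report=False)
purged_with_feature = can_purge(job, keep_completed=0, job_must_report=True)

# Scheduler comes back, reads the exit status, and confirms the job.
job.confirmed_by_scheduler = True
purged_after_confirm = can_purge(job, keep_completed=0, job_must_report=True)
```

Without the feature the job is eligible for purging right away and the exit 
status is lost to the scheduler. With it, the job stays until the scheduler 
has confirmed it.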

--Josh B.

