[torquedev] New job_must_report pbs_server feature
Josh Butikofer
josh at clusterresources.com
Wed May 6 10:48:22 MDT 2009
Chris Samuel wrote:
> ----- "Al Taufer" <ataufer at clusterresources.com> wrote:
>
>> We would like to add a new feature into the 2.3 and 2.4 Torque
>> branches.
>
> Umm, do new features need to go into 2.3.x ?
Yes, we've decided that some features, if they do not change default behavior
and are only activated via explicit configuration, can be added into stable
branches (for the time being, this means 2.3.x). The reason is that some
users/customers need a stable branch but also require enhancements without
making the leap to 2.4 or waiting for its release.

We are trying to get 2.4 ready for release, and once that is done we will try to
do a better job of releasing minor revisions more regularly, allowing us to add
new features only to the next 2.x release.

Over the past year or two it seems to me that minor revisions have been
treated like major revisions, which is something we want to move away from.
>> It would allow the server to be configured so that jobs
>> must report to the scheduler and be confirmed by the
>> scheduler before they are cleaned out.
>
> Maybe it's just because it's late here and I've had
> a long day, but I'm not quite sure what this addresses
> or what the effect is ? Could you illustrate it with
> an example or two please ?
There are use cases where the scheduler needs to know the completion
status of a job. In some system-critical environments Moab needs to perform some
action based on that status. If Moab crashes or is shut down, however,
and keep_completed is set to 0 (or a small number), the job could be purged
from TORQUE's memory before the scheduler can determine how the job ran. When
that happens, the scheduler has no completion status to act on. There are
actually several customers/users who require this kind of tight monitoring of
jobs from their scheduler.
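
To illustrate, here is a rough sketch of the purge decision described above. It is only a toy model in Python, not the pbs_server code: the CompletedJob class, the can_purge function, and the "reported" flag are hypothetical names made up for this example, and the real attribute names and mechanics may differ.

```python
from dataclasses import dataclass


@dataclass
class CompletedJob:
    """Hypothetical stand-in for a completed job held in server memory."""
    job_id: str
    exit_status: int
    completed_at: float      # completion time in seconds on an arbitrary clock
    reported: bool = False   # assumed flag: scheduler has confirmed the job


def can_purge(job, now, keep_completed, job_must_report):
    """Return True if the server may drop the completed job from memory."""
    if job_must_report and not job.reported:
        # Hold the job until the scheduler has confirmed it.
        return False
    # Otherwise the job goes once it has been kept for keep_completed seconds.
    return now - job.completed_at >= keep_completed


job = CompletedJob("123.server", exit_status=1, completed_at=100.0)

# Feature off, keep_completed = 0: the job can be purged right away, even if
# the scheduler was down and never saw the exit status.
print(can_purge(job, now=100.0, keep_completed=0, job_must_report=False))

# Feature on: the job is held, however long the scheduler stays down.
print(can_purge(job, now=5000.0, keep_completed=0, job_must_report=True))

# Once the scheduler confirms the job, it can be cleaned out as usual.
job.reported = True
print(can_purge(job, now=5000.0, keep_completed=0, job_must_report=True))
```

The point of the sketch is that with the feature enabled, confirmation by the scheduler comes first and keep_completed only applies afterwards; with it disabled, behavior is unchanged.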
--Josh B.