[torqueusers] Enable checkpoint/restart on a per-queue basis?

Glen Beane glen.beane at gmail.com
Wed Jun 10 17:15:49 MDT 2009


On Mon, Jun 8, 2009 at 12:55 PM, Troy Baer <tbaer at utk.edu> wrote:
> Hello all,
>
> I've been experimenting with BLCR-based checkpoint/restart on a couple
> of different systems, and I was surprised that I could not find any way
> to set things up in the queue attributes so that jobs in a particular
> queue are checkpointable by default.  I had expected there to be a qmgr
> keyword to do this, like:
>
> set queue foo checkpoints_enabled = True
>
> Alas, I can find nothing of the sort.
>
> Now admittedly I *could* do this in the submit filter by prepending a
> "#PBS -c enable" line to the headers of any jobs that fit certain
> parameters, but it does seem a bit silly that I can set the checkpoint
> directory on a per-queue basis but not whether jobs in a particular
> queue default to being checkpointable.
>
> Am I alone in this regard, or would others find this useful?


Hi Troy,

I think I could add this feature for you without too much trouble.
I'm going to file it in the new TORQUE Bugzilla as an enhancement
request.
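
In the meantime, the submit-filter workaround you mention could look
roughly like the sketch below. It is untested against a real TORQUE
install and makes several assumptions: that the filter receives the job
script on stdin and the qsub arguments on its command line, that it
writes the (possibly modified) script to stdout, that the queue is
requested either with "-q" on the command line or a "#PBS -q" line, and
that the queue is named "foo" as in your qmgr example. The function
names are made up for illustration.

```python
#!/usr/bin/env python
"""Sketch of a submit filter that makes jobs in one queue checkpointable."""
import sys

TARGET_QUEUE = "foo"          # assumption: queue name from the qmgr example
DIRECTIVE = "#PBS -c enable"  # the directive quoted in the original message


def requested_queue(lines, argv):
    """Return the queue named by '-q' in argv or a '#PBS -q' line, else None."""
    for i, arg in enumerate(argv):
        if arg == "-q" and i + 1 < len(argv):
            return argv[i + 1]
    for line in lines:
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "#PBS" and parts[1] == "-q":
            return parts[2]
    return None


def has_checkpoint_directive(lines):
    """True if the script already carries its own '#PBS -c' setting."""
    for line in lines:
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "#PBS" and parts[1].startswith("-c"):
            return True
    return False


def filter_script(text, argv):
    """Return the script text, with DIRECTIVE added for TARGET_QUEUE jobs."""
    lines = text.splitlines()
    if requested_queue(lines, argv) != TARGET_QUEUE:
        return text
    if has_checkpoint_directive(lines):
        return text  # respect an explicit user choice
    # Keep the shebang first; the directive goes right after it so that it
    # still precedes the first executable line of the script.
    insert_at = 1 if lines and lines[0].startswith("#!") else 0
    lines.insert(insert_at, DIRECTIVE)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sys.stdout.write(filter_script(sys.stdin.read(), sys.argv[1:]))
```

This deliberately ignores a few cases: a queue given as "foo@server", a
"-c" option passed on the qsub command line, and jobs that land in the
queue only through routing or a server default.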

