[torquedev] proposed change in directory structure

Glen Beane glen.beane at gmail.com
Fri Jul 25 21:05:37 MDT 2008


On Wed, Jul 16, 2008 at 2:58 AM, Chris Samuel <csamuel at vpac.org> wrote:

>
> ----- "Glen Beane" <glen.beane at gmail.com> wrote:
>
> > I've been working on some changes in trunk that transfer
> > the .OU and .ER spool files from pbs_mom back to pbs_server.
> [...]
> > Are there any objections to this change in trunk?
>
> I'm happy as long as there's a way to ensure that
> this never happens on a particular cluster (maybe
> by a server configuration setting or by being a
> configure option).
>
> Actually, I'm presuming this is predicated on the
> blcr configure option being enabled ?   That might
> be enough for us.


It only happens if the job has its checkpoint_file attribute set to the name
of a checkpoint file, so if you aren't going to use blcr checkpoint/restart
then you don't need to worry.  We could still add either a qmgr setting
or a compile-time setting to turn off this feature (so that .OU and .ER files
aren't kept for completed jobs, even if the job still has the potential to
be restarted).
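
As a rough illustration of that condition, here is a minimal sketch. It is
not the code in trunk: struct sketch_job, keep_spool_for_completed_job() and
the site_allows_return flag are hypothetical names, with the flag standing in
for whatever qmgr or compile-time setting might be added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified job record; NOT TORQUE's real job structure. */
struct sketch_job {
    const char *checkpoint_file;   /* NULL or "" when the attribute is unset */
};

/*
 * Decide whether the .OU/.ER spool files should be kept for a completed job.
 * site_allows_return is an assumed site-wide switch (the proposed qmgr or
 * compile-time setting); passing false disables the feature entirely.
 */
static bool keep_spool_for_completed_job(const struct sketch_job *job,
                                         bool site_allows_return)
{
    assert(job != NULL);

    if (!site_allows_return)
        return false;              /* feature turned off for this cluster */

    /* Only jobs with a checkpoint file name set are affected. */
    return job->checkpoint_file != NULL && job->checkpoint_file[0] != '\0';
}
```

With this shape, a job that never sets checkpoint_file is untouched whatever
the site setting is, and a site that turns the switch off never keeps the
files.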


And an update: I'm about to check in the rest of the code to do this (most
of it has already been checked in).  It assumes that if pbs_server and the
pbs_mom involved are on the same host, then they share a spool directory.
I've noticed quite a few places where that assumption is already made, so
getting everything working with separate spool directories for mom and
server on the same host would be a lot more work than getting it working
with a shared spool directory.  So for now, if you use blcr and you have a
mom and server running on the same host, they need to share a spool
directory in order for the .OU and .ER return to work properly on that
host.
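
To make the same-host assumption concrete, here is a small sketch. The names
plan_spool_return() and enum spool_return_action are hypothetical, and the
idea that nothing needs to be copied in the same-host case is my reading of
the assumption above, not a description of the checked-in code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of deciding how the .OU/.ER files get to the server. */
enum spool_return_action {
    SPOOL_ALREADY_IN_PLACE,   /* same host: shared spool directory assumed */
    SPOOL_COPY_TO_SERVER      /* different hosts: files must be transferred */
};

/*
 * Sketch only: compares host names literally. Real code would need to
 * canonicalize names (aliases, short vs. fully qualified) before comparing.
 */
static enum spool_return_action plan_spool_return(const char *mom_host,
                                                  const char *server_host)
{
    assert(mom_host != NULL && server_host != NULL);

    if (strcmp(mom_host, server_host) == 0)
        return SPOOL_ALREADY_IN_PLACE;

    return SPOOL_COPY_TO_SERVER;
}
```

If the mom and server on one host had separate spool directories, the first
branch would be wrong, which is why the shared directory is required for now.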