[torqueusers] Re: Epilogue script

Glen Beane glen.beane+torque at gmail.com
Mon Aug 28 16:49:46 MDT 2006


I think I agree with Garrick on this one.
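
For illustration only, a site-side cleanup of the kind discussed in the quoted messages below could be sketched roughly like this. It is not code from TORQUE or from anyone on this thread. It assumes a Linux /proc filesystem, assumes the job id is visible to job processes in an environment variable named PBS_JOBID, and assumes the epilogue is called with the job id as its first argument and the user name as its second. The helper names (parse_environ, find_job_pids) are made up for this sketch.

```python
#!/usr/bin/env python3
"""Sketch: epilogue-side cleanup of stray job processes (Linux /proc only)."""
import os
import pwd
import signal
import sys


def parse_environ(raw):
    """Turn the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep:  # ignore empty or malformed entries without '='
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def find_job_pids(jobid, uid, proc_root="/proc", var="PBS_JOBID"):
    """Return sorted PIDs owned by uid whose environment has var == jobid."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        path = os.path.join(proc_root, name)
        try:
            if os.stat(path).st_uid != uid:
                continue  # not owned by the job's user
            with open(os.path.join(path, "environ"), "rb") as fh:
                env = parse_environ(fh.read())
        except OSError:
            continue  # process exited or is unreadable; ignore it
        if env.get(var) == jobid:
            pids.append(int(name))
    return sorted(pids)


def main(argv):
    jobid, user = argv[1], argv[2]  # assumed epilogue argument order
    uid = pwd.getpwnam(user).pw_uid
    for pid in find_job_pids(jobid, uid):
        if pid == os.getpid():
            continue  # never signal the epilogue itself
        try:
            os.kill(pid, signal.SIGTERM)
        except OSError:
            pass  # process already gone
    return 0


# Only act when called with at least a job id and a user name.
if __name__ == "__main__" and len(sys.argv) >= 3:
    sys.exit(main(sys.argv))
```

A process that was started with a scrubbed environment will not be found this way, which is part of why this stays a site-specific assumption rather than something pbs_mom should rely on.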

On 8/28/06, Garrick Staples <garrick at clusterresources.com> wrote:
> I'm really uncomfortable with pbs_mom killing off processes that aren't
> under its control.  Even though looking for a jobid env var seems like a
> reasonable assumption, I'm sure it will break someone somewhere.
>
> This sounds like a site-specific assumption that is easily, and sanely,
> handled in epilogue.
>
> Perhaps this just belongs in the Wiki.
>
>
> On Mon, Aug 28, 2006 at 11:43:15AM -0400, Andrew Keen alleged:
> > Dave,
> >
> > This feature would be very useful to us as we often have this problem
> > (although not as often since we've migrated to using OSU's mpiexec
> > instead of mpirun).
> >
> > -Andy
> >
> > torqueusers-request at supercluster.org wrote:
> > >
> > >   1. Re: Epilogue script (Dave Jackson)
> > >   2. Re: Epilogue script (Diego M. Vadell)
> > >
> > >
> > >----------------------------------------------------------------------
> > >
> > >Message: 1
> > >Date: Fri, 25 Aug 2006 13:13:49 -0600
> > >From: Dave Jackson <jacksond at clusterresources.com>
> > >Subject: Re: [torqueusers] Epilogue script
> > >To: "Diego M. Vadell" <dvadell at linuxclusters.com.ar>
> > >Cc: torquedev at supercluster.org, torqueusers at supercluster.org
> > >Message-ID: <1156533229.10669.77.camel at koa.icluster.org>
> > >Content-Type: text/plain
> > >
> > >Diego,
> > >
> > >  What would be the negatives of enabling this feature in a much more
> > >integrated manner?  i.e., both mother superior and sister moms have a
> > >config option 'cleanup_procs = true' which if true will search the
> > >process tree for processes owned by user X with a matching job id in
> > >the environment.  pbs_mom could then terminate all of these processes
> > >directly.  This would make this feature much easier for most sites to
> > >activate.  No epilog/prolog creation, no compiling, simply set a
> > >parameter.  And as you mention, it would work in both dedicated and
> > >shared node operation.
> > >
> > >  Thoughts?
> > >
> > >Dave
> > >
> >
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers

