[torqueusers] Re: Epilogue script

Glen Beane glen.beane+torque at gmail.com
Tue Aug 29 09:45:14 MDT 2006


I do not believe there is a cross-platform way to search the env of a
process.  I am about to download the Darwin ps source to check how
they do it for ps -e on OS X / Darwin, since the usual Linux method of
searching /proc obviously will not work there.

My only concern is that the processes most likely to be left hanging
around are the ones spawned outside of TM, and those are also the most
likely to *not* include the PBS ID in their ENV.  So when something
actually needs to be cleaned up, there is a good chance this method
won't work anyway.
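
For reference, the Linux /proc method mentioned above could look roughly
like the sketch below.  It is Linux-only and not what pbs_mom does today.
The function names, the proc_root parameter, and the choice of PBS_JOBID
as the variable to match are my assumptions for illustration.  Note that
/proc/<pid>/environ holds only the environment the process was started
with, and is normally readable only by the process owner or root.

```python
import os


def parse_environ(raw):
    """Split the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep:  # skip empty or malformed entries that have no '='
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def find_job_pids(jobid, uid=None, proc_root="/proc", var="PBS_JOBID"):
    """Return sorted PIDs whose initial environment has var == jobid.

    Hypothetical helper: 'var' and 'proc_root' are assumptions, and
    'uid' optionally restricts the search to one user's processes.
    """
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # not a process directory (e.g. 'self', 'meminfo')
        path = os.path.join(proc_root, name)
        try:
            if uid is not None and os.stat(path).st_uid != uid:
                continue
            with open(os.path.join(path, "environ"), "rb") as f:
                env = parse_environ(f.read())
        except OSError:
            continue  # process exited, or we lack permission to read it
        if env.get(var) == jobid:
            pids.append(int(name))
    return sorted(pids)
```

An epilogue could call find_job_pids() with the job id and the job
owner's uid and then signal whatever it returns.  A process that never
inherited the variable will simply not show up in the result.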




On 8/28/06, Garrick Staples <garrick at clusterresources.com> wrote:
> Is there a cross-platform way to search the env of processes?  It seems
> like this will have to be implemented separately for each MOM arch.
>
> On Mon, Aug 28, 2006 at 05:07:11PM -0600, Dave Jackson alleged:
> > Glen,
> >
> >   I believe there is the possibility of negative side effects but the
> > likelihood of this is immensely small.  A user would need to
> > inadvertently set a specific environment variable to a specific value to
> > have an issue.  This does not happen in the real world and if it does,
> > this feature is configurable and is off by default.
> >
> >   I also believe there are exceptional cases in which it would not work.
> > But these are not the majority.  I think we have a capability which
> > would easily and immediately benefit many sites.  While this capability
> > does not cover 100% of cases, it definitely makes things better for
> > most.  Weighing pros and cons, I think this feature is clearly worth it.
> >
> > Dave
> >
> > On Mon, 2006-08-28 at 18:49 -0400, Glen Beane wrote:
> > > I think I agree with Garrick on this one.
> > >
> > > On 8/28/06, Garrick Staples <garrick at clusterresources.com> wrote:
> > > > I'm really uncomfortable with pbs_mom killing off processes that aren't
> > > > under its control.  Even though looking for a jobid env var seems like a
> > > > reasonable assumption, I'm sure it will break someone somewhere.
> > > >
> > > > This sounds like a site-specific assumption that is easily, and sanely,
> > > > handled in epilogue.
> > > >
> > > > Perhaps this just belongs in the Wiki.
> > > >
> > > >
> > > > On Mon, Aug 28, 2006 at 11:43:15AM -0400, Andrew Keen alleged:
> > > > > Dave,
> > > > >
> > > > > This feature would be very useful to us as we often have this problem
> > > > > (although not as often since we've migrated to using OSU's mpiexec
> > > > > instead of mpirun).
> > > > >
> > > > > -Andy
> > > > >
> > > > > torqueusers-request at supercluster.org wrote:
> > > > > >
> > > > > >   1. Re: Epilogue script (Dave Jackson)
> > > > > >   2. Re: Epilogue script (Diego M. Vadell)
> > > > > >
> > > > > >
> > > > > >----------------------------------------------------------------------
> > > > > >
> > > > > >Message: 1
> > > > > >Date: Fri, 25 Aug 2006 13:13:49 -0600
> > > > > >From: Dave Jackson <jacksond at clusterresources.com>
> > > > > >Subject: Re: [torqueusers] Epilogue script
> > > > > >To: "Diego M. Vadell" <dvadell at linuxclusters.com.ar>
> > > > > >Cc: torquedev at supercluster.org, torqueusers at supercluster.org
> > > > > >Message-ID: <1156533229.10669.77.camel at koa.icluster.org>
> > > > > >Content-Type: text/plain
> > > > > >
> > > > > >Diego,
> > > > > >
> > > > > >  What would be the negatives of enabling this feature in a much more
> > > > > >integrated manner?  ie, both mother superior and sister moms have a
> > > > > >config option 'cleanup_procs = true' which if true will search the
> > > > > > >process tree for processes owned by user X with a matching job id in
> > > > > >the environment.  pbs_mom could then terminate all of these processes
> > > > > >directly.  This would make this feature much easier for most sites to
> > > > > >activate.  No epilog/prolog creation, no compiling, simply set a
> > > > > >parameter.  And as you mention, it would work in both dedicated and
> > > > > >shared node operation.
> > > > > >
> > > > > >  Thoughts?
> > > > > >
> > > > > >Dave
> > > > > >
> > > > >
> > > > > _______________________________________________
> > > > > torqueusers mailing list
> > > > > torqueusers at supercluster.org
> > > > > http://www.supercluster.org/mailman/listinfo/torqueusers
> > > >
> >
>

