[torquedev] Re: [torqueusers] is there an epilogue.parallel script?

Garrick Staples garrick at usc.edu
Tue Mar 21 13:17:55 MST 2006


Since no one else seems to find these changes useful, I'm just going to
apply the original epilogue.parallel and add epilogue.user.parallel with
no other changes.
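
For anyone who still wants the parallel scripts to run on MS, the
workaround mentioned in the quoted message (having prologue run
prologue.parallel) might look roughly like the sketch below.  It is
untested against a real pbs_mom: the mom_priv path and the helper name
run_if_present are assumptions for illustration, not anything TORQUE
defines.

```shell
#!/bin/sh
# Sketch of a prologue (which runs on mother superior) that also runs
# prologue.parallel, so MS gets the same setup as the sister nodes.
# ASSUMPTION: the mom_priv location below; adjust for your installation.
MOM_PRIV="${MOM_PRIV:-/var/spool/torque/mom_priv}"

# Hypothetical helper: run the given script with the remaining arguments
# if it exists and is executable; otherwise do nothing and succeed.
run_if_present() {
    script="$1"
    shift
    if [ -x "$script" ]; then
        "$script" "$@"
    fi
}

# Pass the prologue's own arguments through unchanged.  The exit status
# of prologue.parallel (or 0 if it is absent) becomes this script's status.
run_if_present "$MOM_PRIV/prologue.parallel" "$@"
```

The same pattern would apply to epilogue and epilogue.parallel.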

On Mon, Mar 20, 2006 at 02:57:17PM -0800, Garrick Staples alleged:
> On Tue, Mar 14, 2006 at 05:46:57PM -0800, Garrick Staples alleged:
> > > I say we make parallel scripts run on all nodes, add the "prerun"
> > > script, and make sure the scripts can identify which node they are
> > > running on ($PBS_NODENUM == 0 on MS)
> > 
> > Replying to myself as usual, here's a patch that does the above.  It
> > adds prologue.prerun, adds epilogue.parallel, adds
> > epilogue.user.parallel (we forgot about that one), has MS run all
> > parallel scripts, and adds $PBS_NODENUM to all pelog scripts.
> > 
> > For job launch and exiting, note that MS' parallel scripts run _after_
> > the sisters'.
> 
> Turns out, I'm not finding these changes to be all that useful.  The
> lack of $PBS_NODEFILE on sisters and during prologue.prerun, and the
> fact that prologue.parallel has no way of knowing the hostname of MS,
> make these worthless for my purposes.
> 
> To make these useful for *me*, we'd need to add a $PBS_MSHOST for
> parallel scripts and create $PBS_NODEFILE much earlier in the process.
> But at the end of the day, it doesn't really get me anything more than I
> currently have.
> 
> I know multiple people have asked for epilogue.parallel, so that will go
> in.  epilogue.user.parallel is documented, so it should go in.
> 
> 
> But I have some questions... 
> 
> Is having parallel scripts executed on MS actually useful to anyone?  Or
> is this non-backwards-compatible change just a "makes sense to me"
> thing?  You could easily duplicate it by having prologue run
> prologue.parallel.
> 
> Would prologue.prerun actually be useful to anyone?  No one has asked for
> it, so I'm inclined to drop that idea.
> 
> How are parallel and user scripts currently used?  I can't come up with any
> good reasons for them (without the other changes I mentioned above).
> 
> -- 
> Garrick Staples, Linux/HPCC Administrator
> University of Southern California

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California