[torqueusers] OpenPBS to Torque upgrade
Chris Samuel
csamuel at vpac.org
Mon Mar 7 16:11:47 MST 2005
On Tue, 8 Mar 2005 02:30 am, Steve Traylen wrote:
> Another migration question if moving from a torque (torque-1.0.1p6) with
> the default '--enable-rpp' to the (torque-1.2.0p1) with '--disable-rpp'
> then what are the potential pitfalls with that. I'm less worried about
> the torque upgrade itself but changing the protocol would I assume
> be significant.
Hmm, that's something I'm not sure about. We've been upgrading from time to
time and sometimes that's been with the system running and sometimes that's
been when we've had electrical power work about to occur and have had to shut
the cluster down. I can't place when that change happened in the grand
scheme of things so I'm not in a position to comment authoritatively.
> Should I drain running, queued jobs or both? Can I just upgrade everything
> and restart everything and will everything be happy?
I would have thought that the change in the protocol between the MOMs and the
server will require restarting all the components, and if you have some users
using Pete Wyckoff's mpiexec to launch parallel MPI jobs (as we do) then
those would certainly be adversely affected by this.
However, we've always upgraded with queued jobs waiting, and the only time
that this has bitten us was with the change to the length of the PBS job ID.
Of course, I have to disclaim all liability for this information, caveat
emptor, batteries not included, if it breaks you get to keep both pieces,
don't blame me if you lose all your queued jobs or your cluster develops
emergent behaviour and takes over the world...
In short, the SuperCluster developers would be more helpful than me on
this. ;-)
Good luck!
Chris
--
Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia