[torqueusers] OpenPBS to Torque upgrade

Chris Samuel csamuel at vpac.org
Mon Mar 7 16:11:47 MST 2005


On Tue, 8 Mar 2005 02:30 am, Steve Traylen wrote:

> Another migration question: if moving from a torque (torque-1.0.1p6) with
> the default '--enable-rpp' to torque-1.2.0p1 with '--disable-rpp', what
> are the potential pitfalls? I'm less worried about the torque upgrade
> itself, but changing the protocol would, I assume, be significant.

Hmm, that's something I'm not sure about. We've upgraded from time to time,
sometimes with the system running and sometimes when electrical power work
was about to occur and we had to shut the cluster down anyway. I can't place
when that protocol change happened relative to those upgrades, so I'm not in
a position to comment authoritatively.

> Should I drain running, queued jobs or both? Can I just upgrade everything
> and restart everything and will everything be happy?

I would have thought that a change in the protocol between the MOMs and the
server will require restarting all the components, and if you have users
launching parallel MPI jobs with Pete Wyckoff's mpiexec (as we do) then
those running jobs would certainly be adversely affected by this.

However, we've always upgraded with queued jobs waiting, and the only time 
that this has bitten us was with the change to the length of the PBS job ID.
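
To make the running-versus-queued distinction above concrete, here is a
minimal sketch (not anything from the Torque distribution) that counts jobs
by state from saved `qstat` output before a restart. It assumes the default
six-column `qstat` layout with the state letter in the second-to-last
column; the function name and the sample text are hypothetical.

```python
from collections import Counter

# Hypothetical sample, assuming the default six-column `qstat` layout.
SAMPLE_QSTAT = """\
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
101.head         mpi_run          alice            01:02:03 R batch
102.head         serial           bob              0        Q batch
103.head         mpi_run2         carol            00:10:00 R batch
"""


def count_job_states(qstat_text):
    """Count jobs per state letter (R = running, Q = queued, ...)."""
    counts = Counter()
    for line in qstat_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator.
        if len(fields) < 6 or fields[0] == "Job" or fields[0].startswith("-"):
            continue
        # Assumption: the state letter is the second-to-last column.
        counts[fields[-2]] += 1
    return counts


if __name__ == "__main__":
    states = count_job_states(SAMPLE_QSTAT)
    print("running:", states["R"], "queued:", states["Q"])
```

A non-zero running count would be the cue to wait (or warn users) before
restarting the server and MOMs; queued jobs alone have not stopped us.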

Of course, I have to disclaim all liability for this information, caveat 
emptor, batteries not included, if it breaks you get to keep both pieces, 
don't blame me if you lose all your queued jobs or your cluster develops 
emergent behaviour and takes over the world...

In short, the SuperCluster developers would be more helpful than me on 
this. ;-)

Good luck!
Chris
-- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
