[torqueusers] need help with 1.2.0p1 snapshot testing

Chris Samuel csamuel at vpac.org
Tue Feb 15 15:47:12 MST 2005


On Tue, 15 Feb 2005 05:29 pm, Garrick Staples wrote:

> Queued, yes.  Running, no.  Earlier versions don't save the necessary info
> to properly preserve the tm state.  The fixes in the new code have as much
> to do with _saving_ as _recovery_.

Ah, no, I understood that. It's just that (at the moment) only a couple of our
users are running with mpiexec, and I thought it would be great if I could
upgrade while they weren't running jobs without affecting the non-mpiexec
jobs (many are uniprocessor jobs, and some happily run for over 3 months).

So I was wondering: if I restarted a mom for one of those jobs, or for someone
using the MPICH mpirun (using ssh instead of rsh), would it affect those jobs?

cheers!
Chris
-- 
 Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
