[Mauiusers] OpenPBS/MAUI Checkpointing with linux guide
csamuel at vpac.org
Sat Feb 23 20:56:59 MST 2008
----- "Itay M" <itaym.tau at gmail.com> wrote:
> Is there any guide explaining how to implement checkpointing in an
> OpenPBS / MAUI environment with Linux on the compute nodes?
I believe you can find posts in the archives about getting
suspend/resume working (though we've never used it here,
so I can't vouch for whether it still works).
Some codes (like NAMD) implement checkpointing themselves.
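To give a rough idea of what application-level checkpointing
looks like, here is a minimal sketch. It is not how NAMD does
it; the function names, the JSON state file and the toy
summation loop are all made up for illustration. The idea is
that the code periodically writes its own state to disk and,
when restarted, picks up from the last saved state.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace is atomic on POSIX, so a crash never leaves a half-written file
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def run(path, n_steps, checkpoint_every=10, stop_after=None):
    """Toy computation: sum the step indices, checkpointing as it goes.

    stop_after is a hypothetical knob that simulates the job being
    killed partway through, without a final checkpoint.
    """
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return state  # simulated kill: work since last checkpoint is lost
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

If the job is killed and resubmitted with the same checkpoint
path, run() redoes only the steps since the last checkpoint,
so the scheduler doesn't need to know anything about it.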
Torque's trunk currently contains preliminary code for
supporting the BLCR (Berkeley Lab Checkpoint/Restart)
checkpointing kernel module, though it's likely to work
only for single-CPU jobs at the moment (you'll probably
need one of the MPIs that supports BLCR to get further).
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency