[torqueusers] Question about checkpoint for MPI
samuel at unimelb.edu.au
Wed Dec 5 19:37:02 MST 2012
On 06/12/12 08:31, Andrus, Brian Contractor wrote:
> Well, That is sad news.
> What are the options out there for checkpoint/restart of a job
It's worth noting that the kernel community is following a completely
different checkpoint/restart path, that of the OpenVZ developers'
"checkpoint/restore in user space" project (CRIU).
You can read more about it here:
The CRIU website is here:
It will also be up for discussion at LCA2013 in Canberra this year
(though I won't be there).
I'd suggest it's worth bringing up on the openmpi-devel list; I must
just do that now.
Christopher Samuel Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545