[torqueusers] Check point support

Ronny T. Lampert telecaadmin at uni.de
Tue Feb 28 07:35:33 MST 2006


Hi,

> I am currently investigating whether it is possible to have check
> pointing of jobs in torque. I do not know anything about check pointing
> in Linux. (Our SGI machine supports it). Can someone please give me some
> ideas about what is supported in torque and what software to use.
> 
> I am currently running Torque 1.2.0p6 with Maui 3.2.6p13 on Red Hat
> Enterprise 3. My machines are x86_64 Opterons.

At the moment there is no checkpointing support in torque on Linux, nor in
the (standard) Linux kernel. Checkpointing is an arch-dependent feature in
torque, for SGI only, iirc.

BLCR is a recent and reasonably functional attempt at kernel-level
snapshotting for Linux. However, it won't compile on recent kernels
(e.g. 2.6.14):

http://mantis.lbl.gov/blcr/doc/html/BLCR_Admin_Guide.html


A quick Google search turned up the following, if you are willing to get
your hands on the source and start hacking.
You may have to alter your application code and certainly also torque.

http://www.checkpointing.org/main.html
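
To give an idea of what altering the application code could mean, here is a
minimal Python sketch of application-level checkpointing. It is not part of
torque or BLCR; the function names (save_checkpoint, load_checkpoint, run),
the JSON file format and the stop_after parameter are all made up for
illustration. The job writes its state to a file after every step and picks
up from that file when it is restarted.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: write a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the data is on disk before the rename
    os.replace(tmp, path)     # a crash never leaves a half-written checkpoint


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"step": 0, "total": 0}
    with open(path) as f:
        return json.load(f)


def run(path, steps, stop_after=None):
    """Sum 0..steps-1, checkpointing after every step.

    stop_after is a hypothetical knob that simulates the job being
    interrupted after that many steps in the current run.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    while state["step"] < steps:
        if stop_after is not None and done_this_run >= stop_after:
            return state  # simulated interruption
        state["total"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
        done_this_run += 1
    return state
```

If the job is killed and resubmitted with the same checkpoint path, it
continues from the last saved step instead of starting over.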


Hope that helps a bit. Cheers,
Ronny

