[torquedev] torque+blcr+openmpi

Eric Roman ERoman at lbl.gov
Mon Jun 28 12:08:35 MDT 2010


I worked on this a while ago, but getting everything to work properly has
been a long-standing to-do item.

Can you tell me what your scripts do?

Can you restart the application manually from the context file you created?
(With cr_restart?)

Normally, torque tries to checkpoint the shell it spawned for the job,
using cr_checkpoint (--tree) to capture all of the children, including the
mpirun and the MPI ranks.  Last time I checked, mpirun wouldn't respond to a
cr_checkpoint.  (I think it omitted itself from the checkpoint, but I don't
remember).  openmpi required a user to invoke ompi-checkpoint to checkpoint an
app, and ompi-restart to bring the app back, but torque wants to use
cr_checkpoint and cr_restart on the context file.  So, I needed to wrap
the original openmpi mpirun with another program that would intercept the
checkpoint signals.
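
A rough sketch of that kind of wrapper, in Python for brevity. It is not the
wrapper described above. SIGUSR1 is only a stand-in trigger (BLCR delivers
checkpoint requests through its own mechanism), and run_wrapped and
make_checkpoint_argv are hypothetical names. The idea shown is just: start the
real launcher as a child, catch the request in the wrapper, and run a
checkpoint command against the child.

```python
import signal
import subprocess
import sys


def run_wrapped(launch_argv, make_checkpoint_argv, trigger=signal.SIGUSR1):
    """Run launch_argv as a child; on each trigger signal, run the command
    returned by make_checkpoint_argv(child_pid). Returns the child's exit code."""
    pending = []

    def on_trigger(signum, frame):
        # Only record the request here; the real work happens in the loop.
        pending.append(signum)

    # Install the handler before starting the child so an early signal
    # does not kill the wrapper.
    previous = signal.signal(trigger, on_trigger)
    try:
        child = subprocess.Popen(launch_argv)
        while True:
            try:
                return child.wait(timeout=0.2)
            except subprocess.TimeoutExpired:
                pass
            while pending:
                pending.pop()
                # Assumed invocation for Open MPI would be something like
                # lambda pid: ["ompi-checkpoint", str(pid)]
                subprocess.run(make_checkpoint_argv(child.pid))
    finally:
        signal.signal(trigger, previous)
```

A request that arrives just as the child exits is dropped in this sketch; a
real wrapper would have to decide what to report back in that case.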

Part of the problem is that openmpi puts some of the MPI ranks into the same
process tree (or session) as the mpirun, and this messes everything up.  I
left off at the point where I needed to write startup code to ensure that
the ranks were in a separate process tree from the mpirun.  (The way things
are implemented right now, the checkpoint deadlocks, so we need to break
one of the dependencies to fix it.)
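
For illustration only, this is the generic way to start a process in its own
session so that it does not share a session or process group with its
launcher. spawn_in_new_session is a hypothetical helper, not Open MPI startup
code. Note that the new process is still a child of the launcher, so whether
this alone is enough to keep it out of a --tree checkpoint is an open question
here.

```python
import os
import subprocess
import sys


def spawn_in_new_session(argv, **popen_kwargs):
    """Start argv as a child that calls setsid() before exec, making it the
    leader of a new session and process group."""
    return subprocess.Popen(argv, start_new_session=True, **popen_kwargs)


if __name__ == "__main__":
    # The child prints its session id and pid; after setsid() they match.
    proc = spawn_in_new_session(
        [sys.executable, "-c", "import os; print(os.getsid(0), os.getpid())"]
    )
    proc.wait()
    print("launcher session:", os.getsid(0))
```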

The root issue is a little bit messy.  The checkpoint/restart scripts need
root privileges to open the context file.  So they have to open the context
file as root, then call setuid() to become the job's user, and hand the
already-open context file to cr_checkpoint and cr_restart as a file
descriptor.
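
A minimal sketch of that open-as-root, drop-privileges, pass-the-descriptor
sequence, in Python. run_as_user_with_fd and build_argv are hypothetical
names. The cr_checkpoint command line in the comment assumes a --fd option
taking a descriptor number; check that against the installed BLCR before
relying on it.

```python
import os
import subprocess
import sys


def run_as_user_with_fd(context_path, build_argv, uid=None, gid=None,
                        flags=os.O_WRONLY | os.O_CREAT | os.O_TRUNC):
    """Open context_path with the caller's (privileged) credentials, then run
    build_argv(fd) as uid/gid with that descriptor inherited by the child."""
    fd = os.open(context_path, flags, 0o600)

    def drop_privileges():
        # Runs in the child between fork and exec. Groups and gid must be
        # changed first, while the process is still root.
        if gid is not None:
            os.setgroups([])
            os.setgid(gid)
        if uid is not None:
            os.setuid(uid)

    preexec = drop_privileges if (uid is not None or gid is not None) else None
    try:
        # pass_fds keeps fd open in the child under the same number.
        result = subprocess.run(build_argv(fd), pass_fds=(fd,),
                                preexec_fn=preexec)
        return result.returncode
    finally:
        os.close(fd)


# Assumed checkpoint invocation (job_pid, job_uid, job_gid are placeholders):
#   run_as_user_with_fd(path,
#       lambda fd: ["cr_checkpoint", "--tree", "--fd", str(fd), str(job_pid)],
#       uid=job_uid, gid=job_gid)
# For a restart, pass flags=os.O_RDONLY and build a cr_restart command instead.
```

The file is opened before the privilege drop, so the user-level process never
needs permission on the path itself, only the inherited descriptor.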

I do want to go in and fix all of this.  Right now I'm trying to get BLCR to
work with compressed context files, and chasing a bug with using it on
the 2.6.33 kernel.


On Mon, Jun 28, 2010 at 09:43:14AM +0200, Danny Sternkopf wrote:
> Hi,
> maybe someone here can comments on this.
> Regards,
> Danny
> -------- Original Message --------
> Subject: Re: [torqueusers] torque+blcr+openmpi
> Date: Fri, 25 Jun 2010 16:58:59 +0200
> From: Danny Sternkopf <dsternkopf at hpce.nec.com>
> Reply-To: dsternkopf at hpce.nec.com
> Organization: NEC Deutschland GmbH
> To: torqueusers at supercluster.org
> Hi,
> any news about this? I have the following setup:
> o torque 2.4.8
> o openmpi 1.4.2
> o blcr 0.8.2
> The checkpoint/restart scripts from Torque's contrib/blcr work for
> single node application without MPI. I created new scripts for OpenMPI
> applications. The checkpoint works, but the release does not. The issue
> might be that ompi-checkpoint writes a directory including checkpoint
> files for each process plus metadata and Torque expects one single
> checkpoint file. Any experiences?
> Btw another issue is that the checkpoint/restart scripts run as root.
> ompi-checkpoint doesn't allow that root can checkpoint user jobs. So you
> have to run the ompi-checkpoint as user. The restart script of course
> needs this as well to restart process under the corresponding user id.
> Furthermore any comments to handle MPI and single process applications
> with same checkpoint/restart scripts?
> Regards,
> Danny
> ---
> _______________________________________________
> torquedev mailing list
> torquedev at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torquedev
