[torqueusers] mpiexec jobs got stuck

Troy Baer tbaer at utk.edu
Tue May 12 13:40:35 MDT 2009


On Tue, 2009-05-12 at 15:19 -0400, Abhishek Gupta wrote:
> It's MPICH2.

If you're using the mpiexec included with MPICH2, it's possible that you
are running out of privileged ports for the rsh connections to the other
nodes.  Try using OSC's mpiexec replacement [1] (which uses TORQUE's TM
API to start up the MPI processes), and see if that makes a difference.

[1] http://www.osc.edu/~pw/mpiexec/index.php
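
One rough way to check the privileged-port theory on a Linux node is to
count how many local TCP ports in the reserved range are occupied while a
job is stuck.  The sketch below is only an illustration, not part of
TORQUE or MPICH2.  It assumes the node exposes /proc/net/tcp (IPv4 only)
and that rsh clients take their source ports from 512-1023, the range
rresvport() conventionally uses.  The function name and constants are
hypothetical.

```python
# Assumption: rsh clients bind a reserved source port in 512..1023
# (the range rresvport() conventionally searches).
RSH_PORT_MIN = 512
RSH_PORT_MAX = 1023


def count_reserved_ports(proc_net_tcp_text):
    """Return the number of distinct local ports in 512..1023 that appear
    in text formatted like Linux's /proc/net/tcp."""
    in_use = set()
    # The first row is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        # fields[1] is local_address, written as HEXIP:HEXPORT
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if RSH_PORT_MIN <= local_port <= RSH_PORT_MAX:
            in_use.add(local_port)
    return len(in_use)
```

Passing it the contents of /proc/net/tcp (for example,
count_reserved_ports(open("/proc/net/tcp").read())) gives a count out of
512 possible ports.  Sockets in TIME_WAIT and daemons listening on
reserved ports are counted too, since they also hold a port.  A count
near 512 while jobs hang would be consistent with port exhaustion, though
it does not prove it.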

	--Troy
-- 
Troy Baer, HPC System Administrator
National Institute for Computational Sciences, University of Tennessee
http://www.nics.tennessee.edu/
Phone:  865-241-4233

