[torqueusers] mpiexec jobs got stuck
Troy Baer
tbaer at utk.edu
Tue May 12 13:40:35 MDT 2009
On Tue, 2009-05-12 at 15:19 -0400, Abhishek Gupta wrote:
> It's MPICH2.
If you're using the mpiexec included with MPICH2, it's possible that you
are running out of privileged ports for the rsh connections to the other
nodes. Try using OSC's mpiexec replacement [1] (which uses TORQUE's TM
API to start up the MPI processes), and see if that makes a difference.
[1] http://www.osc.edu/~pw/mpiexec/index.php
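
To put a rough number on the privileged-port limit, here is a back-of-the-envelope sketch in Python. It rests on assumptions that are not stated in this thread: that the rsh client takes its source port from the reserved range 512-1023 (the usual rresvport() behaviour), and that each rsh connection holds two such ports (one for the main channel, one for stderr). The function name and parameters are hypothetical, and ports lingering in TIME_WAIT would reduce the real figure further.

```python
# Rough upper bound on simultaneous rsh connections from one host.
# Assumption: source ports come from the reserved range 512-1023,
# as is typical for rresvport(); adjust if your system differs.
RESERVED_LOW = 512
RESERVED_HIGH = 1023


def max_rsh_connections(ports_per_connection=2, ports_in_use=0):
    """Estimate how many rsh connections fit in the reserved port range.

    ports_per_connection: assumed 2 (main channel plus stderr channel).
    ports_in_use: reserved ports already taken by other services or
                  still unavailable after recent connections.
    """
    total = RESERVED_HIGH - RESERVED_LOW + 1  # size of the reserved range
    available = max(total - ports_in_use, 0)
    return available // ports_per_connection


if __name__ == "__main__":
    print(max_rsh_connections())
    print(max_rsh_connections(ports_in_use=100))
```

Under these assumptions a single launching node tops out at a few hundred concurrent rsh connections at best, which is why a launcher that goes through TM instead of rsh sidesteps the limit.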
--Troy
--
Troy Baer, HPC System Administrator
National Institute for Computational Sciences, University of Tennessee
http://www.nics.tennessee.edu/
Phone: 865-241-4233