[torqueusers] mpiexec jobs got stuck
Abhishek Gupta
abhig at Princeton.EDU
Tue May 12 15:03:40 MDT 2009
It is giving me an error:
mpiexec: Error: get_hosts: pbs_statjob returned neither "ncpus" nor "nodect"
Any suggestions?
Thanks,
Abhi.
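
For anyone hitting the same error: a minimal sketch of how one might check whether the job carries either of the two resources named in the message. It assumes the resources show up in `qstat -f` output as `Resource_List.ncpus` or `Resource_List.nodect` lines; that mapping is an assumption, not something confirmed in this thread. The helper name `find_resources` and the sample text are hypothetical.

```python
import re

# Resource names taken from the error message quoted above.
WANTED = ("ncpus", "nodect")


def find_resources(qstat_text):
    """Return {name: value} for Resource_List.ncpus / Resource_List.nodect lines.

    Assumes `qstat -f`-style lines of the form
    "Resource_List.<name> = <value>" (an assumption, see lead-in).
    """
    found = {}
    for line in qstat_text.splitlines():
        m = re.match(r"\s*Resource_List\.(\w+)\s*=\s*(\S+)", line)
        if m and m.group(1) in WANTED:
            found[m.group(1)] = m.group(2)
    return found


# Hypothetical sample text, not output from a real job.
SAMPLE = """Job Id: 123.example
    Job_Name = test
    Resource_List.nodect = 2
    Resource_List.nodes = 2:ppn=4
"""

if __name__ == "__main__":
    resources = find_resources(SAMPLE)
    if resources:
        print("found:", resources)
    else:
        print('neither "ncpus" nor "nodect" is set on this job')
```

To try it on a real job, the text could come from saving the output of `qstat -f <jobid>` to a file and passing its contents to `find_resources`. An empty result would suggest the job was submitted without a node or CPU request that the server records under those names.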
Troy Baer wrote:
> On Tue, 2009-05-12 at 15:19 -0400, Abhishek Gupta wrote:
>
>> It's MPICH2.
>>
>
> If you're using the mpiexec included with MPICH2, it's possible that you
> are running out of privileged ports for the rsh connections to the other
> nodes. Try using OSC's mpiexec replacement [1] (which uses TORQUE's TM
> API to start up the MPI processes), and see if that makes a difference.
>
> [1] http://www.osc.edu/~pw/mpiexec/index.php
>
> --Troy
>