[torqueusers] Timeout using mpiexec and torque

Eugene van den Hurk e.vandenhurk at bcri.ucc.ie
Tue Oct 31 11:15:24 MST 2006


I recompiled mpiexec against Torque 2.1.6, but I'm still getting
errors along the lines of:

#####################################################################
p17_29269:  p4_error: Timeout in establishing connection to remote process: 0
rm_l_17_29270: (302.817528) net_send: could not write to fd=5, errno = 32
p33_29615:  p4_error: interrupt SIGx: 13
p17_29269: (367.028649) net_send: could not write to fd=5, errno = 32
p33_29615: (431.239899) net_send: could not write to fd=5, errno = 32
p1_5045:  p4_error: interrupt SIGx: 15
rm_l_1_5046: (916.888204) net_send: could not write to fd=5, errno = 32
mpiexec: Warning: task 1 exited oddly---report bug: status 0 done 0.
mpiexec: Warning: tasks 17,33 exited with status 1.
#####################################################################
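
For anyone reading the log above, the numeric codes can be decoded with a few lines of Python. This is only a sketch, and it assumes that the value after "errno =" is a standard POSIX errno and that the value after "SIGx:" is a POSIX signal number, as it would be on a typical Linux node. The helper names describe_errno and describe_signal are made up for this illustration and are not part of mpiexec, MPICH or Torque.

```python
import errno
import os
import signal


def describe_errno(code: int) -> str:
    """Return the symbolic name and system message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"errno {code} = {name} ({os.strerror(code)})"


def describe_signal(num: int) -> str:
    """Return the symbolic name for a signal number on this platform."""
    try:
        return f"signal {num} = {signal.Signals(num).name}"
    except ValueError:
        # The number is not a known signal on this platform.
        return f"signal {num} = UNKNOWN"


if __name__ == "__main__":
    # Values taken from the log: errno = 32, SIGx: 13 and SIGx: 15.
    print(describe_errno(32))
    for num in (13, 15):
        print(describe_signal(num))
```

On Linux these come out as EPIPE, SIGPIPE and SIGTERM, which would fit a picture where the remote end of a connection had already gone away when the processes tried to write to it. That only describes the symptom and does not say why the initial connection timed out.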


At 15:38 31/10/2006, you wrote:
>On Tuesday 31 October 2006 14:53, Eugene van den Hurk wrote:
>
>
> > I am using the following:
> > mpiexec 0.81
> > mpich 1.2.7p1.
> > Torque 2.1.6
> > Maui 3.2.6p16
> >
>
> > Just posting this to the list in case anybody else has had similar
> > issues in the past and might be able to shed some light on this.
> >
>
>No real idea, but just checking: did you [re]compile mpiexec against torque
>2.1.6, or was it statically linked against an older torque's libs?
>
>_______________________________________________
>torqueusers mailing list
>torqueusers at supercluster.org
>http://www.supercluster.org/mailman/listinfo/torqueusers


