[torqueusers] Torque 4.2 and OSC mpiexec

Peter A Ruprecht peter.ruprecht at Colorado.EDU
Mon Jun 17 09:10:53 MDT 2013


Hello Sergio,

This isn't exactly comparable to what you're reporting, but we had to
recompile OpenMPI when moving from Torque 2.5 to 4.x.  There was an
issue with the task manager (TM) interface, and although several list
members suggested workarounds, we ended up recompiling anyway.  Some of
that discussion is in the list archives from June 2012, and it may hold
clues relevant to your situation.

Regards,
Peter Ruprecht

On 4/15/13 10:57 AM, "Sérgio Almeida" <sergio.almeida at ist.utl.pt> wrote:

>Hello,
>
>We had been using Torque 2.x on our local cluster for a year and decided
>to upgrade to Torque 4.2.2 with Maui.
>
>The upgrade went smoothly, but we can't get OSC mpiexec to work. There is
>little information on the web about this combination.
>
>OSC mpiexec fails to connect to the local mom:
>
>reconnect_to_mom: mom died, trying continually to reconnect
>(...)
>reconnect_to_mom: walking existing task list and resubmitting obits
>reconnect_to_mom: new obit for task 0
>reconnect_to_mom: new obit for task 0
>
>mpiexec.hydra works just out of the box and spans jobs across the whole
>cluster.
>
>We are using MPICH2 across the whole cluster, and mpiexec has been
>compiled against Torque 4.2.2.
>
>Is this combination supposed to work? Has anyone else had this
>kind of issue?
>
>Cheers,
>Sérgio


