[torqueusers] Torque 4.2 and OSC mpiexec

Sérgio Almeida sergio.almeida at ist.utl.pt
Mon Apr 15 10:57:14 MDT 2013


Hello,

We had been using Torque 2.x on our local cluster for a year and decided 
to upgrade to Torque 4.2.2 with Maui.

The upgrade went smoothly, but we can't get OSC mpiexec to work. There is 
little information on the web about this combination.

OSC mpiexec fails to connect to the local mom:

reconnect_to_mom: mom died, trying continually to reconnect
(...)
reconnect_to_mom: walking existing task list and resubmitting obits
reconnect_to_mom: new obit for task 0
reconnect_to_mom: new obit for task 0

mpiexec.hydra works out of the box and runs jobs across the whole 
cluster.

We are using MPICH2 across the whole cluster, and mpiexec has been 
compiled against Torque 4.2.2.

Is this combination supposed to work? Has anyone else run into this 
kind of issue?

Cheers,
Sérgio

