[torqueusers] OpenMPI mpirun problem with TORQUE

이정현 bugslayer at gmail.com
Sat Jan 23 16:03:15 MST 2010

Hi all.


I have a small (but serious) problem when submitting a job that uses mpirun.


There’s no problem with just one node (multiple processors), as shown below.


(job script)

#PBS -l nodes=1:ppn=2
#PBS -j oe

mpirun /home/jhlee/test_program
echo "finish : $(date)"



(result) - test_program just prints a message saying whether or not it was
launched by mpirun.


start  : Sun Jan 24 07:46:27 KST 2010
HOSTNAME : simulation01
PBS_NODEFILE = /var/spool/torque/aux//31.simulation00

Detected OpenMPI Runtime Environment
Detected OpenMPI Runtime Environment
finish : Sun Jan 24 07:46:29 KST 2010


But with more than one node, as below, mpirun fails to start test_program.


#PBS -l nodes=2:ppn=2 (other things are same)


I can’t find any test_program process. Only mpirun is running; there is no
‘test_program’. Please check the ‘ps’ output below.


21680 ?        S      0:00 mpirun /home/jhlee/test_program
21684 ?        Ss     0:00 bash -c ps ax | grep test
21712 ?        R      0:00 grep test
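
The ‘ps’ listing above covers a single node. A minimal sketch of how the same check could be repeated on every node in the allocation is below. It assumes passwordless ssh between the nodes and that $PBS_NODEFILE is readable inside the job; the helper names unique_hosts and check_hosts are made up for this illustration.

```shell
#!/bin/sh
# Print each distinct host listed in a TORQUE node file, one per line.
# TORQUE writes one line per allocated processor slot, so with ppn=2
# every host normally appears twice.
unique_hosts() {
    sort -u "$1"
}

# Run "ps ax" on every allocated host and show the lines that mention
# the given program name. Assumes passwordless ssh between the nodes.
check_hosts() {
    nodefile=$1
    pattern=$2
    for host in $(unique_hosts "$nodefile"); do
        echo "== $host =="
        # grep exits non-zero when nothing matches; that is expected here.
        ssh "$host" ps ax | grep -- "$pattern" | grep -v grep
    done
}

# Only run the check when called from inside a TORQUE job.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    check_hosts "$PBS_NODEFILE" test_program
fi
```

Running it from a second shell while the job hangs would show whether test_program was started on any of the allocated nodes, or only mpirun on the first one.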


1.     mpirun (not via TORQUE) works correctly.
2.     OpenMPI was built with the --with-tm option.
3.     iptables and SELinux have already been disabled, and no password is
required to connect to the other nodes using ssh.
4.     OpenMPI 1.4.1, TORQUE 2.4.4
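
One way to double-check item 2 is to look at what ompi_info reports for the mpirun that the job actually picks up. The sketch below assumes that a TORQUE-enabled build lists tm components under both "MCA ras" and "MCA plm", which is how I understand the 1.3/1.4 series to name them; has_tm_support is a hypothetical helper, not part of OpenMPI.

```shell
#!/bin/sh
# Succeeds if the text on stdin lists both tm components:
# "ras" (reads the TORQUE allocation) and "plm" (launches the processes).
# The component names are an assumption based on the 1.3/1.4 series.
has_tm_support() {
    input=$(cat)
    printf '%s\n' "$input" | grep -q 'MCA ras: tm ' &&
    printf '%s\n' "$input" | grep -q 'MCA plm: tm '
}

# Only query ompi_info if it is on the PATH of the current environment.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_tm_support; then
        echo "tm support found"
    else
        echo "tm support NOT found in this ompi_info"
    fi
fi
```

Running this inside the job script as well as in an interactive shell would also show whether both environments find the same OpenMPI installation.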


What should I check to solve this?






Jeong-hyun Lee


Visual Simulation Laboratory 

Department of Computer Science and Engineering 

Dongguk University, Seoul, Korea 

