[torqueusers] How to run mpirun of intel on torque

Gus Correa gus at ldeo.columbia.edu
Fri Dec 14 10:57:50 MST 2012


On 12/14/2012 07:52 AM, David Roman wrote:
> Hello,
>
> I am sorry, but my english is really sad. I thank you for your patience
> for that.
>
> I installed TORQUE 4.1.0 with Maui 3.3.1. I compiled OpenMPI with the option
> --with-tm=/usr/local/torque.
>
> I disallowed ssh connections for my users on executable nodes. In
> /etc/ssh/sshd_config I set
>
> AllowGroups root admin
>
> This works fine.
>
> But now I have installed the Intel MPI Library and ifort.
>
> When I open an interactive PBS session:
>
> qsub -I -l nodes=2:ppn=8
>
> I am connected to a node.
>
> I run an MPI job:
>
> mpirun -genv I_MPI_FABRICS_LIST tmi ./my_program
>
> But it cannot start, because it cannot connect to the other node.
>
> If I add the users' group to AllowGroups in /etc/ssh/sshd_config, it works.
>
> But if I do this, all users can connect to the compute nodes without using
> Torque, and this is bad.
>
> How can I disallow ssh connections outside of Torque, or make Intel's mpirun
> work like OpenMPI's, without allowing ssh connections for users?
>
> Thank you
>
> David
>
>

Hi David

For what it is worth, I_MPI_FABRICS_LIST seems to be an Intel-MPI
environment variable, not OpenMPI.
I am not familiar with Intel-MPI, and I couldn't find out what
exactly a "tmi" fabric/network is.

**

The corresponding OpenMPI way to set runtime parameters is through
the "mca" parameters, and this includes the network fabric to use.
Say, if you want (Ethernet) tcp and intranode shared memory,
then add:
-mca btl tcp,sm,self
If you want InfiniBand and intranode shared memory, add:
-mca btl openib,sm,self
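
As a rough sketch of how those two choices end up on the command line,
the snippet below only prints the resulting mpiexec invocations instead
of running them. The helper name btl_for_fabric and the labels
"ethernet" and "infiniband" are made up for illustration; ./my_program
is the program name from your message.

```shell
#!/bin/sh
# Hypothetical helper: map a fabric label to the OpenMPI btl list
# mentioned above. The labels are illustrative, not OpenMPI keywords.
btl_for_fabric() {
    case "$1" in
        ethernet)   echo "tcp,sm,self" ;;
        infiniband) echo "openib,sm,self" ;;
        *)          echo "unknown fabric: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the command lines, so this is safe to try anywhere.
echo "mpiexec -mca btl $(btl_for_fabric ethernet) ./my_program"
echo "mpiexec -mca btl $(btl_for_fabric infiniband) ./my_program"
```

Inside a Torque job, an OpenMPI built with tm support should pick up the
allocated nodes by itself, so no hostfile appears in the sketch.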

You can get a lot of information about your OpenMPI installation
running:

ompi_info

**

I wonder if the mpiexec you're using is really part of your OpenMPI,
and really the one you built with Torque/tm support,
or perhaps pointing inadvertently to your Intel-MPI mpiexec.
You cannot mix the various MPI implementations.
Would there be a problem with the PATH?
For OpenMPI you need to set both PATH and LD_LIBRARY_PATH properly.
The OpenMPI FAQ explains both the environment setup and the use of
"mca" parameters:

http://www.open-mpi.org/faq/
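
One quick way to check the PATH question is to see which MPI tools the
shell actually finds first, and whether ompi_info lists any "tm"
components. This is only a sketch: the helper name report_tool is made
up, and the exact component names shown by ompi_info may vary between
OpenMPI versions.

```shell
#!/bin/sh
# Hypothetical helper: show where a tool is found in PATH, if at all.
report_tool() {
    if tool_path=$(command -v "$1" 2>/dev/null); then
        echo "$1 -> $tool_path"
    else
        echo "$1 -> not found in PATH"
    fi
}

for tool in mpiexec mpirun mpicc ompi_info; do
    report_tool "$tool"
done

# An OpenMPI built with --with-tm normally lists "tm" components in the
# ompi_info output; this check is skipped if ompi_info is not in PATH.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep -w tm || echo "no tm components listed by ompi_info"
fi
```

If mpiexec and ompi_info resolve to different installation directories,
that would suggest the two MPI implementations are being mixed.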

A very simple test of OpenMPI functionality is:

mpiexec hostname

There are also the connectivity_c.c, ring_c.c, and hello_c.c in
the OpenMPI "examples" directory, which you can compile
with mpicc and run with mpiexec.
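
For instance, the examples could be built and run roughly as below.
EXAMPLES_DIR is a hypothetical location (adjust it to the "examples"
directory of your own OpenMPI source tree), and run_example is just an
illustrative helper that skips cleanly when the tools or sources are
missing.

```shell
#!/bin/sh
# Hypothetical location of the OpenMPI "examples" directory.
EXAMPLES_DIR=${EXAMPLES_DIR:-$HOME/openmpi/examples}

# Compile one example with mpicc and launch it with mpiexec.
run_example() {
    name=$1
    src="$EXAMPLES_DIR/$name.c"
    if ! command -v mpicc >/dev/null 2>&1 ||
       ! command -v mpiexec >/dev/null 2>&1; then
        echo "skip $name: mpicc or mpiexec not found in PATH"
        return 0
    fi
    if [ ! -f "$src" ]; then
        echo "skip $name: $src not found"
        return 0
    fi
    mpicc "$src" -o "./$name" && mpiexec "./$name"
}

for ex in hello_c ring_c connectivity_c; do
    run_example "$ex"
done
```

Run it from inside the interactive Torque session (qsub -I ...), so that
mpiexec has more than one node to launch on.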

**

Also, to prevent users from connecting directly to the nodes
(or more precisely, to the nodes where they don't have Torque jobs
running), you can configure and install Torque with the pam module:

./configure --with-pam  <other Torque configure parameters>
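
After installing, the module still has to be enabled in the PAM
configuration of sshd on the compute nodes. The check below is only a
sketch, and it assumes that the module is named pam_pbssimpleauth.so,
that it is enabled through an "account" line, and that the sshd PAM
file is /etc/pam.d/sshd. Please verify all three against the Torque
documentation and your distribution. The helper name pam_uses_torque is
made up.

```shell
#!/bin/sh
# Assumption: --with-pam builds a module called pam_pbssimpleauth.so,
# enabled with a line such as
#     account  required  pam_pbssimpleauth.so
# in the sshd PAM service file.
pam_uses_torque() {
    grep -Eq '^[[:space:]]*account[[:space:]].*pam_pbssimpleauth\.so' "$1" 2>/dev/null
}

# Usual location on Linux; it may differ on your system.
PAM_FILE=${PAM_FILE:-/etc/pam.d/sshd}
if pam_uses_torque "$PAM_FILE"; then
    echo "$PAM_FILE: Torque PAM module is referenced"
else
    echo "$PAM_FILE: Torque PAM module is not referenced"
fi
```

Note that sshd applies its own AllowGroups check in addition to PAM, so
the AllowGroups line from your message would probably have to be relaxed
for the pam module to be the one making the decision.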

**

I hope it helps,
Gus Correa

