[torqueusers] qsub and mpiexec -f machinefile

Tiago Silva (Cefas) tiago.silva at cefas.co.uk
Fri Feb 21 05:07:18 MST 2014


I am using MPICH2 1.5 with Hydra. Now that I think of it, the behaviour is as I expect: the machinefile maps ranks to nodes. I have also compiled the code with OpenMPI 1.6.5, but I don't remember the behaviour there, and our OpenMPI is not integrated with Torque.
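(I believe one can check whether an OpenMPI build has Torque support by looking for the tm components with ompi_info, something like:)

# should list the tm (Torque/PBS) MCA components if they were built in
ompi_info | grep tm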

Thanks for the suggestion; I will try using $PBS_NODEFILE to generate a machinefile on the fly.
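Something along these lines, I suppose (just a rough sketch; the job name and resource request are placeholders, and any reordering of the node list would replace the plain copy):

#!/bin/bash
#PBS -N mymodel                       # placeholder job name
#PBS -l nodes=1:ppn=2+1:ppn=4         # placeholder: 2 cores on one node + 4 on another

cd $PBS_O_WORKDIR

# $PBS_NODEFILE has one line per requested core, grouped by node, so it
# can be copied (or reordered) into the machinefile that mpiexec expects
cp $PBS_NODEFILE machinefile

NP=$(wc -l < $PBS_NODEFILE)
mpiexec -f machinefile -np $NP ./bin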

Tiago

[hyde@deepgreen PP]$ which mpiexec
/apps/mpich2/1.5/ifort/bin/mpiexec
[hyde@deepgreen PP]$ mpirun -info
HYDRA build details:
    Version:                                 1.5
    Release Date:                            Mon Oct  8 14:00:48 CDT 2012
(...)

-----Original Message-----
From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Gus Correa
Sent: 20 February 2014 19:12
To: Torque Users Mailing List
Subject: Re: [torqueusers] qsub and mpiexec -f machinefile

Hi Tiago

Which MPI and which mpiexec are you using?
I am not familiar with all of them, but the behavior depends primarily on which one you are using.
Most likely, by default you will get the sequential rank-to-node mapping that you mentioned.
Have you tried it?
What result did you get?

You can call the MPI function MPI_Get_processor_name early in your code, say right after MPI_Init, MPI_Comm_size, and MPI_Comm_rank, and then print out the rank / processor-name pairs (the processor names will probably be your nodes' names).

https://www.open-mpi.org/doc/v1.4/man3/MPI_Get_processor_name.3.php
http://www.mcs.anl.gov/research/projects/mpi/www/www3/MPI_Get_processor_name.html

With OpenMPI there are easier ways (through mpiexec) to report this information.
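For example, something like this (from memory, so check "man mpiexec" for your OpenMPI version; the option names may differ slightly):

mpiexec --display-map -np 6 ./bin        # prints the rank-to-node map at startup
mpiexec --report-bindings -np 6 ./bin    # also reports core/socket bindings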

However, there are ways to change the sequential rank-to-node mapping, if that is your goal; again, it depends on which mpiexec you are using.
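For instance, if I remember correctly, the Hydra mpiexec that comes with MPICH2 accepts per-host process counts in the machinefile, which controls how many consecutive ranks go to each host (please double-check against the MPICH2 1.5 documentation):

# "host:nprocs" entries; with this file ranks 0-1 go to n100 and
# ranks 2-5 go to n101 (host names are just examples)
cat > machinefile <<EOF
n100:2
n101:4
EOF

mpiexec -f machinefile -np 6 ./bin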

Anyway, this is more of an MPI question than a Torque question.

I hope this helps,
Gus Correa


On 02/20/2014 04:51 AM, Tiago Silva (Cefas) wrote:
> Thanks, this seems promising. Before I try building with openmpi, if I 
> parse PBS_NODEFILE to produce my own machinefile for mpiexec, for 
> instance following my previous example:
>
> n100
> n100
> n101
> n101
> n101
> n101
>
> won't mpiexec start MPI processes with ranks 0-1 on n100 and ranks 2-5
> on n101? That's what I think it does when I don't use qsub.
>
> Tiago
>
> > -----Original Message-----
> > From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Gus Correa
> > Sent: 19 February 2014 15:11
> > To: Torque Users Mailing List
> > Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
> >
> > Hi Tiago
> >
> > The Torque/PBS node file is available to your job script through the
> > environment variable $PBS_NODEFILE.
> > This file has one line listing the node name for each processor/core
> > that you requested.
> > Just do a "cat $PBS_NODEFILE" inside your job script to see how it looks.
> > Inside your job script, and before the mpiexec command, you can run a
> > brief auxiliary script to create the machinefile you need from the
> > $PBS_NODEFILE.
> > You will need to create this auxiliary script, tailored to your
> > application.
> > Still, this method won't bind the MPI processes to the appropriate
> > hardware components (cores, sockets, etc.), in case this is also part
> > of your goal.
> >
> > Having said that, if you are using OpenMPI, it can be built with Torque
> > support (with the --with-tm=/torque/location configuration option).
> > This would give you a range of options on how to assign different
> > cores, sockets, etc., to different MPI ranks/processes, directly in the
> > mpiexec command or in the OpenMPI runtime configuration files.
> > This method wouldn't require creating the machinefile from the
> > PBS_NODEFILE.
> > This second approach has the advantage of allowing you to bind the
> > processes to cores, sockets, etc.
> >
> > I hope this helps,
> > Gus Correa
> >
> > On 02/19/2014 07:40 AM, Tiago Silva (Cefas) wrote:
> > > Hi,
> > >
> > > My MPI code is normally executed across a set of nodes with
> > > something like:
> > >
> > > mpiexec -f machinefile -np 6 ./bin
> > >
> > > where the machinefile has 6 entries with node names, for instance:
> > >
> > > n01
> > > n01
> > > n02
> > > n02
> > > n02
> > > n02
> > >
> > > Now the issue here is that this list has been optimised to balance
> > > the load between nodes and to reduce internode communication. So for
> > > instance model domain tiles 0 and 1 will run on n01 while tiles 2 to
> > > 5 will run on n02.
> > >
> > > Is there a way to integrate this into qsub, since I don't know which
> > > nodes will be assigned before submission? Or, in other words, can I
> > > control grouping processes on one node?
> > >
> > > In my example I used 6 processes for simplicity, but normally I
> > > parallelise across 4-16 nodes and >100 processes.
> > >
> > > Thanks,
> > >
> > > tiago

_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers
This email and any attachments are intended for the named recipient only. Its unauthorised use, distribution, disclosure, storage or copying is not permitted.
If you have received it in error, please destroy all copies and notify the sender. In messages of a non-business nature, the views and opinions expressed are the author's own
and do not necessarily reflect those of Cefas. 
Communications on Cefas’ computer systems may be monitored and/or recorded to secure the effective operation of the system and for other lawful purposes.