[torqueusers] LAM/MPI + Torque
SCIPIONI.Roberto at nims.go.jp
Fri Jun 29 23:28:18 MDT 2007
How do you set this up when compiling LAM/MPI? Is it with ... or something?
> On Thu, 2007-06-28 at 17:41 -0700, SCIPIONI Roberto wrote:
> > As far as I understand, you need to tell Torque to boot LAM
> > properly inside the script.
> This is not necessary if you use mpiexec. In our setting I configured
> LAM/MPI (our current version is lam-7.1.3) to use the resource manager
> interface. This is an option to the LAM/MPI configure script when you
> compile _LAM_. Then LAM will talk to Torque directly to get the
> nodes, boot LAM, and also distribute the jobs to the nodes directly
> (no need for ...).
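For reference, the resource-manager support described above is selected when LAM itself is built. A minimal build sketch follows; the `--with-boot-tm` flag name and the `/usr/local/pbs` path are assumptions on my part, so check `./configure --help` for your LAM version before using them:

```shell
# Hypothetical build sketch: enable LAM/MPI's Torque/PBS (TM) boot module.
# The flag name and the PBS install path below are assumptions; verify
# against ./configure --help in your lam-7.1.3 source tree.
./configure --prefix=/opt/lam-7.1.3 --with-boot-tm=/usr/local/pbs
make
make install
```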
> OLD command (note you need to manually specify the number of nodes):
> mpiexec -machinefile $PBS_NODEFILE -n ?? [your script]
> NEW command:
> mpiexec -boot [your script]
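Put together, the two styles look like this inside a Torque job script. This is a sketch only: the resource request, working directory, and the program name `./a.out` are made-up examples, and the OLD style lines are shown commented out for contrast:

```shell
#!/bin/sh
#PBS -l nodes=4:ppn=2

cd $PBS_O_WORKDIR

# OLD style: boot LAM by hand and count the processes yourself.
# lamboot $PBS_NODEFILE
# mpiexec -machinefile $PBS_NODEFILE -n 8 ./a.out
# lamhalt

# NEW style: with the resource manager interface compiled in, LAM's
# mpiexec gets the node list from Torque and boots LAM itself.
mpiexec -boot ./a.out
```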
> Apart from being simpler, it has two other big advantages.
> * You get meaningful usage information from qstat etc.; without
> this, a running MPI job will appear to use no CPU.
> * All processes stay under control of the queue system. Manually
> running LAM (OLD command) with rsh/ssh tends to leave orphaned LAM
> daemons on nodes, which have to be manually killed by the system
> administrator logging into each node, checking the daemon is unused,
> and then killing it. I used to do this about once a week.
> Fortunately the pestat command is useful for detecting orphaned
> daemons, as it lists the number of job processes on each node; if
> you have more processes than jobs currently running on a node then
> you usually have an orphan.
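The orphan heuristic above can be sketched as a tiny script. The node names and counts here are hard-coded examples; in practice the numbers would come from pestat or from the scheduler, not from this function:

```shell
#!/bin/sh
# Sketch of the orphan heuristic: more LAM-related processes than
# running jobs on a node suggests a leftover daemon. Inputs are
# hypothetical examples, not real pestat output.
check_node() {
    node=$1
    procs=$2   # processes seen on the node
    jobs=$3    # jobs the queue system thinks are running there
    if [ "$procs" -gt "$jobs" ]; then
        echo "$node: possible orphaned LAM daemon"
    fi
}

check_node node01 3 2   # prints a warning
check_node node02 1 1   # prints nothing
```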
> Currently we allow users to use either method, but I am only teaching
> the new command to new users.
> One point is that you need to use the mpiexec program that comes with
> LAM. There is another mpiexec program
> (http://www.osc.edu/~pw/mpiexec/index.php) which provides similar
> functionality for other MPI implementations but doesn't work with
> LAM.
> Dr Justin Finnerty
> Rm W3-1-218 Ph 49 (441) 798 3726
> Carl von Ossietzky Universität Oldenburg
> torqueusers mailing list
> torqueusers at supercluster.org