[torqueusers] feature request

Gareth.Williams at csiro.au
Sat Feb 27 18:26:05 MST 2010



> -----Original Message-----
> From: Gabe Turner [mailto:gabe at msi.umn.edu]
> Sent: Friday, 26 February 2010 2:40 PM
> To: torqueusers at supercluster.org
> Subject: Re: [torqueusers] feature request
> 
> On Fri, Feb 26, 2010 at 10:01:27AM +1100, Gareth.Williams at csiro.au wrote:
> > We have a similar wrapper, also without much error checking - it assumes very simple ssh syntax only:
> >
> > > cat /tools/ascutils/bin/pbsssh
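> > #!/bin/bash
> > # pbsssh: ssh-like wrapper that runs a command on a node via pbsdsh (TM)
> >
> > usage="usage: $0 <node name> <command>"
> >
> > if [ $# -lt 2 ]
> > then
> >         echo "$usage" >&2
> >         exit 1
> > fi
> >
> > node=$1
> >
> > shift
> >
> > # Quote "$@" so arguments containing spaces reach pbsdsh intact
> > pbsdsh -h "$node" "$@"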
> 
> Gareth, have you gotten this to scale?  I've also been looking for a way to
> trick HP-MPI (now Platform MPI) into using the PBS TM, but I have not been
> able to get pbsdsh to scale beyond a couple of hundred nodes.  The Mother
> Superior basically starts taking up 100% of a core and stops responding.
> I'd like to be able to get it up to at least 1024 pbsdsh calls, and it
> doesn't get even close.
> 
> --
> Gabe Turner                                             gabe at msi.umn.edu
> HPC Systems Administrator,
> University of Minnesota
> Supercomputing Institute                          http://www.msi.umn.edu


Hi Gabe,

We've only used that wrapper for very modest parallelism, so we have no good tests of how it scales, sorry.

It might be possible to get OSC's mpiexec to support HP-MPI, though it is not currently listed as a supported MPI.

Otherwise, a similar wrapper built around Open MPI's mpirun (with tm support) or OSC mpiexec might be possible and might scale better.
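
As a rough, untested sketch of the Open MPI variant: the function below mirrors the pbsssh wrapper but launches through mpirun instead of pbsdsh. It assumes mpirun comes from an Open MPI build with tm support and that the target node belongs to the current job's allocation. The name tm_run is made up for illustration, and I have not checked whether this scales any better than pbsdsh.

```bash
#!/bin/bash
# Sketch only: ssh-like wrapper that launches via Open MPI's mpirun.
# Assumes mpirun comes from an Open MPI build with tm support and that
# the target node is part of the current job's allocation.
# tm_run is a hypothetical name, not part of Torque or Open MPI.

tm_run() {
        if [ $# -lt 2 ]
        then
                echo "usage: tm_run <node name> <command>" >&2
                return 1
        fi

        local node=$1
        shift

        # Start a single copy of the command on the named host.
        mpirun --host "$node" -np 1 "$@"
}
```

A wrapper script built on this would end with tm_run "$@" and could then be called the same way as pbsssh, for example with a node name followed by hostname.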

-- Gareth



