[torqueusers] Torque module for pdsh ?
Brian O'Connor
briano at sgi.com
Thu May 14 17:54:21 MDT 2009
Hmm,
Torque is a batch scheduler, not a cluster administration package.
Check out OSCAR http://svn.oscar.openclustergroup.org/trac/oscar/wiki
or Rocks http://www.rocksclusters.org/
Both are excellent packages that include lots of cluster admin
goodness.
I vote for OSCAR, which includes C3 (Cluster Command and Control)
from ORNL http://www.csm.ornl.gov/torc/C3/index.html
OSCAR and Rocks both support Torque out of the box.
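
For the original question (pointing pdsh at the nodes of one Torque job), a
rough sketch of a wrapper follows. It is not a pdsh module, just a stand-in
for what "-j jobid" does under SLURM. Assumptions: "qstat -f <jobid>" reports
an "exec_host" field of the form node01/1+node01/0+node02/1, long values may
be wrapped onto tab-indented continuation lines, and pdsh accepts a
comma-separated host list via "-w". The names parse_exec_host and
pdsh_command are made up for illustration.

```python
import re
import subprocess


def parse_exec_host(qstat_output):
    """Return the unique hostnames from the exec_host field of `qstat -f` output."""
    # Assumption: wrapped values continue on lines that start with a tab,
    # so join those continuation lines back together first.
    unfolded = re.sub(r"\n\t", "", qstat_output)
    match = re.search(r"^\s*exec_host = (\S+)", unfolded, re.MULTILINE)
    if match is None:
        return []  # job is not running, or the field is absent
    hosts = []
    for entry in match.group(1).split("+"):
        host = entry.split("/", 1)[0]  # drop the "/<cpu index>" suffix
        if host not in hosts:          # keep first-seen order, no duplicates
            hosts.append(host)
    return hosts


def pdsh_command(jobid, command):
    """Build a pdsh argument list that targets the nodes of a Torque job."""
    result = subprocess.run(
        ["qstat", "-f", jobid], capture_output=True, text=True, check=True
    )
    hosts = parse_exec_host(result.stdout)
    if not hosts:
        raise RuntimeError("no exec_host found for job " + jobid)
    return ["pdsh", "-w", ",".join(hosts), command]
```

Something like subprocess.run(pdsh_command("1234", "uptime")) would then run
the command on every node of job 1234 (the job id here is only an example).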
Brian O'Connor
-----------------------------------------------------------------------
SGI Consulting
Email: briano at sgi.com, Mobile +61 417 746 452
Phone: +61 3 9963 1900, Fax: +61 3 9963 1902
357 Camberwell Road, Camberwell, Victoria, 3124
AUSTRALIA
http://www.sgi.com/support/services
-----------------------------------------------------------------------
> -----Original Message-----
> From: torqueusers-bounces at supercluster.org
> [mailto:torqueusers-bounces at supercluster.org] On Behalf Of
> Kamil Kisiel
> Sent: Friday, 15 May 2009 7:30 AM
> To: Ole Holm Nielsen; torqueusers at supercluster.org
> Subject: Re: [torqueusers] Torque module for pdsh ?
>
> On 14/05/09 12:41 , "Ole Holm Nielsen"
> <Ole.H.Nielsen at fysik.dtu.dk> wrote:
>
> > I was recommended to use Parallel Distributed Shell
> > http://sourceforge.net/projects/pdsh/ for parallel commands
> > on our cluster. The pdsh command has a very nice flag
> > that will execute a command on the nodes belonging to
> > a certain jobid, but only if you use the Slurm resource manager.
> > From the pdsh(1) man-page:
> >
> > slurm module options
> >     The slurm module allows pdsh to target nodes based on currently
> >     running SLURM jobs. The slurm module is typically called after all
> >     other node selection options have been processed, and if no nodes
> >     have been selected, the module will attempt to read a running jobid
> >     from the SLURM_JOBID environment variable (which is set when running
> >     under a SLURM allocation). If SLURM_JOBID references an invalid job,
> >     it will be silently ignored.
> >
> >     -j jobid[,jobid,...]
> >         Target list of nodes allocated to the SLURM job jobid. This
> >         option may be used multiple times to target multiple SLURM
> >         jobs. The special argument "all" can be used to target all
> >         nodes running SLURM jobs, e.g. -j all.
> >
> > Question: Did anyone already write a Torque module for pdsh? IMHO this
> > would be a very useful thing to have.
> >
> > Thanks,
> > Ole Holm Nielsen
> > Technical University of Denmark
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
>
> I've been interested in having one for a while as well. I haven't been
> able to set aside any time to look at an implementation, though.
>