[torqueusers] Torque module for pdsh ?

Brian O'Connor briano at sgi.com
Thu May 14 17:54:21 MDT 2009


Hmm,

     Torque is a batch scheduler, not a cluster administration package.

Check out OSCAR: http://svn.oscar.openclustergroup.org/trac/oscar/wiki
or Rocks: http://www.rocksclusters.org/

Both of these are excellent packages that include lots of cluster admin
goodness.

I vote for OSCAR, which includes C3 (Cluster Command and Control)
from ORNL: http://www.csm.ornl.gov/torc/C3/index.html

Both OSCAR and Rocks support Torque out of the box.
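
As for the original question: until someone writes a real Torque module
for pdsh, the effect of the -j flag could be approximated from outside
pdsh. What follows is a minimal Python sketch, not a tested tool. It
assumes that "qstat -f <jobid>" reports an exec_host attribute of the
form node1/0+node2/0 and that long values are wrapped onto tab-indented
continuation lines. The helper names (parse_exec_host, job_hosts,
pdsh_on_job) are made up for illustration. The host list is handed to
pdsh with its -w option.

```python
import subprocess


def parse_exec_host(exec_host):
    """Return the unique hostnames, in order, from a Torque exec_host string.

    Assumed format: "node1/0+node1/1+node2/0" (some versions may print
    core ranges such as "node1/0-3"; only the part before "/" is used).
    """
    hosts = []
    for entry in exec_host.split("+"):
        host = entry.split("/", 1)[0].strip()
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def job_hosts(jobid):
    """Ask qstat for the job's exec_host attribute and parse it."""
    out = subprocess.run(
        ["qstat", "-f", jobid],
        check=True, capture_output=True, text=True,
    ).stdout
    # Assumption: wrapped values continue on lines that start with a tab.
    unwrapped = out.replace("\n\t", "")
    for line in unwrapped.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "exec_host":
            return parse_exec_host(value)
    return []


def pdsh_on_job(jobid, command):
    """Run a command via pdsh on every node allocated to the given job."""
    hosts = job_hosts(jobid)
    if not hosts:
        raise SystemExit("no exec_host found for job %s" % jobid)
    return subprocess.call(["pdsh", "-w", ",".join(hosts), command])
```

For example, pdsh_on_job("1234.server", "uptime") would run uptime on
each node of that job (the job ID here is hypothetical). A job that is
queued but not yet running has no exec_host, so the sketch exits with an
error in that case.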


Brian O'Connor
-----------------------------------------------------------------------
SGI Consulting
Email: briano at sgi.com, Mobile +61 417 746 452
Phone: +61 3 9963 1900, Fax:  +61 3 9963 1902
357 Camberwell Road, Camberwell, Victoria, 3124
AUSTRALIA
http://www.sgi.com/support/services
----------------------------------------------------------------------- 

> -----Original Message-----
> From: torqueusers-bounces at supercluster.org 
> [mailto:torqueusers-bounces at supercluster.org] On Behalf Of 
> Kamil Kisiel
> Sent: Friday, 15 May 2009 7:30 AM
> To: Ole Holm Nielsen; torqueusers at supercluster.org
> Subject: Re: [torqueusers] Torque module for pdsh ?
> 
> On 14/05/09 12:41 , "Ole Holm Nielsen" 
> <Ole.H.Nielsen at fysik.dtu.dk> wrote:
> 
> > I was recommended to use Parallel Distributed Shell
> > http://sourceforge.net/projects/pdsh/ for parallel commands
> > on our cluster.  The pdsh command has a very nice flag
> > that will execute a command on the nodes belonging to
> > a certain jobid, but only if you use the Slurm resource manager.
> >  From the pdsh(1) man-page:
> > 
> > slurm module options
> >         The slurm module allows pdsh to target nodes based on currently
> >         running SLURM jobs. The slurm module is typically called after
> >         all other node selection options have been processed, and if no
> >         nodes have been selected, the module will attempt to read a
> >         running jobid from the SLURM_JOBID environment variable (which
> >         is set when running under a SLURM allocation). If SLURM_JOBID
> >         references an invalid job, it will be silently ignored.
> > 
> >         -j jobid[,jobid,...]
> >                Target list of nodes allocated to the SLURM job jobid.
> >                This option may be used multiple times to target
> >                multiple SLURM jobs. The special argument "all" can be
> >                used to target all nodes running SLURM jobs, e.g. -j all.
> > 
> > Question: Has anyone already written a Torque module for pdsh? IMHO
> > this would be a very useful thing to have.
> > 
> > Thanks,
> > Ole Holm Nielsen
> > Technical University of Denmark
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
> 
> I've been interested in having one for a while as well, but I haven't
> been able to set aside any time to look at an implementation.
> 
> 

