[torqueusers] Torque module for pdsh ?
Si Hammond
simon.hammond at gmail.com
Thu May 14 13:51:44 MDT 2009
Have you looked at the pbsdsh command?
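
A rough sketch of how this might look in practice, not a tested recipe. Inside a running job, something like `pbsdsh hostname` (or `pbsdsh -u hostname` for one task per node) runs a command on the job's own nodes. From outside the job, one workaround is to read the job's exec_host attribute from qstat and hand the hosts to `pdsh -w`. The helper name job_hosts below is hypothetical, and the exec_host layout ("node/cpu+node/cpu+...") is an assumption that may differ between Torque versions.

```shell
#!/bin/sh
# Sketch only: turn a Torque job's exec_host attribute into a pdsh target list.
# Assumes exec_host looks like "node01/0+node01/1+node02/0"; check your version.

# Hypothetical helper: reads `qstat -f` output on stdin and prints the
# unique host names as a comma-separated list.
job_hosts() {
    sed -n 's/^[[:space:]]*exec_host = //p' |   # keep only the exec_host value
        tr '+' '\n' |                           # one node/cpu entry per line
        sed 's|/.*$||' |                        # drop the "/cpu" suffix
        sort -u |                               # one line per host
        paste -sd, -                            # join with commas for pdsh -w
}

# Usage (needs a running Torque server; if your qstat supports -1, it keeps
# long attributes on a single line so the sed match sees the whole value):
#   pdsh -w "$(qstat -f -1 "$jobid" | job_hosts)" uptime
```

Without the unfolded output, a long exec_host value may be wrapped across several lines and the sketch would only see the first one.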
Si Hammond
High Performance Systems Group
University of Warwick
On 14 May 2009, at 20:41, Ole Holm Nielsen wrote:
> I was recommended to use Parallel Distributed Shell
> http://sourceforge.net/projects/pdsh/ for parallel commands
> on our cluster. The pdsh command has a very nice flag
> that will execute a command on the nodes belonging to
> a certain jobid, but only if you use the Slurm resource manager.
> From the pdsh(1) man-page:
>
> slurm module options
> The slurm module allows pdsh to target nodes based on currently
> running SLURM jobs. The slurm module is typically called after all
> other node selection options have been processed, and if no nodes
> have been selected, the module will attempt to read a running jobid
> from the SLURM_JOBID environment variable (which is set when running
> under a SLURM allocation). If SLURM_JOBID references an invalid job,
> it will be silently ignored.
>
> -j jobid[,jobid,...]
> Target list of nodes allocated to the SLURM job jobid. This
> option may be used multiple times to target multiple SLURM
> jobs. The special argument "all" can be used to target all
> nodes running SLURM jobs, e.g. -j all.
>
> Question: Did anyone already write a Torque module for pdsh? IMHO
> this would be a very useful thing to have.
>
> Thanks,
> Ole Holm Nielsen
> Technical University of Denmark
Si Hammond
Performance Modelling, Analysis and Optimisation Team
High Performance Systems Group
Department of Computer Science
University of Warwick, CV4 7AL, UK