[torqueusers] Torque module for pdsh ?
kamil at zymeworks.com
Thu May 14 15:30:23 MDT 2009
On 14/05/09 12:41 , "Ole Holm Nielsen" <Ole.H.Nielsen at fysik.dtu.dk> wrote:
> I was recommended to use Parallel Distributed Shell
> http://sourceforge.net/projects/pdsh/ for parallel commands
> on our cluster. The pdsh command has a very nice flag
> that will execute a command on the nodes belonging to
> a certain jobid, but only if you use the Slurm resource manager.
> From the pdsh(1) man-page:
> slurm module options
> The slurm module allows pdsh to target nodes based on currently
> running SLURM jobs. The slurm module is typically called after all other
> node selection options have been processed, and if no nodes have been
> selected, the module will attempt to read a running jobid from the
> SLURM_JOBID environment variable (which is set when running under a
> SLURM allocation). If SLURM_JOBID references an invalid job, it will
> be silently ignored.
> -j jobid[,jobid,...]
> Target list of nodes allocated to the SLURM job jobid. This
> option may be used multiple times to target multiple SLURM
> jobs. The special argument "all" can be used to target all
> nodes running SLURM jobs, e.g. -j all.
> Question: Has anyone already written a Torque module for pdsh? IMHO this
> would be a very useful thing to have.
> Ole Holm Nielsen
> Technical University of Denmark
I've been interested in having one for a while as well, but I haven't been
able to set aside any time to look at an implementation.
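
Short of a real pdsh module, a rough workaround might be to read the node
list for a job from "qstat -f" and pass it to "pdsh -w". The sketch below
is only that, a sketch, and rests on some assumptions: that the job's
exec_host attribute looks like "node01/0+node01/1+node02/0", and that
qstat wraps long values onto continuation lines starting with a tab. The
function names (hosts_from_exec_host, exec_host_from_qstat, pdsh_on_job)
are made up for illustration and are not part of Torque or pdsh.

```python
import re
import subprocess


def hosts_from_exec_host(exec_host):
    """Return unique host names, in order, from a Torque exec_host string.

    Assumes entries are separated by '+' and look like 'host/cpu'.
    """
    hosts = []
    for entry in exec_host.split("+"):
        host = entry.split("/", 1)[0].strip()
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def exec_host_from_qstat(qstat_output):
    """Extract the exec_host value from `qstat -f` text.

    Assumption: long values are wrapped onto lines that begin with a tab,
    so those continuations are joined back before searching.
    """
    unwrapped = re.sub(r"\n\t", "", qstat_output)
    match = re.search(r"^\s*exec_host = (\S+)", unwrapped, re.MULTILINE)
    return match.group(1) if match else ""


def pdsh_on_job(jobid, command):
    """Run `command` with pdsh on the nodes of one Torque job (hypothetical helper)."""
    result = subprocess.run(
        ["qstat", "-f", jobid], capture_output=True, text=True, check=True
    )
    hosts = hosts_from_exec_host(exec_host_from_qstat(result.stdout))
    if not hosts:
        raise SystemExit("no exec_host found for job %s" % jobid)
    # pdsh -w takes a comma-separated list of target hosts.
    return subprocess.call(["pdsh", "-w", ",".join(hosts), command])
```

With that in place, something like pdsh_on_job("1234", "uptime") would
approximate "pdsh -j 1234 uptime" under Slurm, though unlike the slurm
module it has no equivalent of "-j all" and does not consult any job
environment variable.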