[torqueusers] Torque module for pdsh ?
jbernstein at penguincomputing.com
Thu May 14 13:51:43 MDT 2009
TORQUE comes with a similar tool called pbsdsh. You might want to have a look
at it, as it does the same sort of thing as pdsh but is designed to work
with TORQUE.
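If you would rather keep using pdsh, here is a rough sketch of one possible
workaround, not an existing pdsh module. It assumes it runs inside a TORQUE
job, where the PBS_NODEFILE environment variable names a file listing the
allocated nodes (one line per processor slot), and it hands the deduplicated
hosts to pdsh's -w option. The helper names unique_hosts and pdsh_command are
made up for illustration, and the script only prints the command instead of
running it.

```python
import os
import shlex


def unique_hosts(nodefile_text):
    """Return the hostnames found in PBS_NODEFILE-style text.

    The nodefile repeats a host once per allocated slot, so duplicates
    are dropped while the original order is kept.
    """
    hosts = []
    for line in nodefile_text.splitlines():
        host = line.strip()
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def pdsh_command(hosts, command):
    """Build an argv list for pdsh; -w takes a comma-separated host list."""
    return ["pdsh", "-w", ",".join(hosts)] + shlex.split(command)


if __name__ == "__main__":
    # PBS_NODEFILE is only set inside a running TORQUE job.
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile:
        with open(nodefile) as fh:
            hosts = unique_hosts(fh.read())
        # Print the command rather than executing it.
        print(" ".join(pdsh_command(hosts, "uptime")))
```

Unlike the slurm module's -j option, this only covers the job it is run from,
not an arbitrary jobid.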
Ole Holm Nielsen wrote:
> I was advised to use the Parallel Distributed Shell (pdsh),
> http://sourceforge.net/projects/pdsh/, for running parallel commands
> on our cluster. The pdsh command has a very nice flag
> that executes a command on the nodes belonging to
> a given jobid, but only if you use the Slurm resource manager.
> From the pdsh(1) man page:
> slurm module options
>     The slurm module allows pdsh to target nodes based on currently
>     running SLURM jobs. The slurm module is typically called after all
>     other node selection options have been processed, and if no nodes
>     have been selected, the module will attempt to read a running jobid
>     from the SLURM_JOBID environment variable (which is set when running
>     under a SLURM allocation). If SLURM_JOBID references an invalid job,
>     it will be silently ignored.
>
>     -j jobid[,jobid,...]
>         Target list of nodes allocated to the SLURM job jobid. This
>         option may be used multiple times to target multiple SLURM
>         jobs. The special argument "all" can be used to target all
>         nodes running SLURM jobs, e.g. -j all.
> Question: Has anyone already written a Torque module for pdsh? IMHO this
> would be a very useful thing to have.
> Ole Holm Nielsen
> Technical University of Denmark