[torqueusers] ssh keys on compute nodes

Gareth.Williams at csiro.au Gareth.Williams at csiro.au
Wed Jun 16 17:28:13 MDT 2010


> -----Original Message-----
> From: Aaron Sims [mailto:aaron_sims at ncsu.edu]
> Sent: Thursday, 17 June 2010 2:26 AM
> To: torqueusers at supercluster.org
> Subject: [torqueusers] ssh keys on compute nodes
> 
> I am trying to submit a job to the queue using a qsub batch script.  If
> I include the head node as one of the machines in the script, everything
> works fine.   BUT, I do not necessarily want to use the head node to run
> compute tasks.  The head node can log into the compute nodes using ssh
> keys.  And each compute node can log into the head node using ssh keys.
> But, I do not have it set up for each compute node to be able to log
> into every other compute node.
> 
> Is it *really* required to have all compute nodes be able to log in to
> all other compute nodes in order to run jobs on nodes that do not use
> the head node as part of their job?
> Interactively, I can run a mpi job using mpirun that does not use the
> head node.  But if I specify the nodes I want to pick using PBS, it
> fails due to login problems.
> What am I missing here?

Hi Aaron,

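If you use an MPI implementation that supports Torque's TM interface (tm_spawn), e.g. Open MPI compiled with Torque support, mpirun starts the remote processes through Torque itself, so you will not need ssh between the compute nodes to launch them. You also will not need the -np 8 -machinefile $PBS_NODEFILE options, because mpirun gets the node and slot list directly from Torque.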
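
As a rough sketch of what that could look like, here is the job script from the original message with only the mpirun line changed. It assumes the mpirun found on the compute nodes comes from an Open MPI build with Torque support (configured with --with-tm). That is an assumption about the installation, not something the original message confirms.

```csh
#!/bin/csh -f
#PBS -N wrftest
#PBS -q research
#PBS -l nodes=compute1:ppn=4+compute2:ppn=4

# Assumption: this mpirun belongs to an Open MPI build with Torque (TM) support.
# It then takes the 8 slots (2 nodes x 4 ppn) from the Torque allocation and
# starts the remote ranks through Torque rather than ssh, so -np and
# -machinefile are left out.
mpirun /WRF/run/wrf.exe
```

To check whether an Open MPI installation was built this way, running ompi_info | grep tm should list tm components if the support is present.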

-- Gareth


> 
> Thanks,
> Aaron
> 
>  i.e. error:
> (gnome-ssh-askpass:16045): Gtk-WARNING **: cannot open display:
> Permission denied, please try again.
> MPIRUN.compute1: Some node programs ended prematurely without connecting
> to mpirun.
> MPIRUN.compute1: No connection received from 4 node processes on node
> compute2
> 
> Here is my script:
> #!/bin/csh -f
> #PBS -N wrftest
> #PBS -q research
> #PBS -l nodes=compute1:ppn=4+compute2:ppn=4
> mpirun -np 8 -machinefile $PBS_NODEFILE /WRF/run/wrf.exe
> 


