[torqueusers] ssh keys on compute nodes

Aaron Sims aaron_sims at ncsu.edu
Wed Jun 16 10:26:28 MDT 2010

I am trying to submit a job to the queue using a qsub batch script.  If 
I include the head node as one of the machines in the script, everything 
works fine.  However, I do not necessarily want to use the head node to 
run compute tasks.  The head node can log into the compute nodes using 
ssh keys, and each compute node can log into the head node using ssh 
keys.  But I have not set things up so that each compute node can log 
into every other compute node. 
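
The login situation described above could be checked from the head node with a short script along these lines. This is only a sketch: the function names and the SSH variable are made up for illustration, and it assumes the head node can already reach every compute node with keys. It uses the standard OpenSSH option BatchMode=yes so that ssh fails outright instead of asking for a password.

```shell
#!/bin/sh
# Sketch: report which node-to-node logins work without a password prompt.
# Function names and the SSH variable are hypothetical, for illustration only.

SSH=${SSH:-ssh}   # overridable so the loop can be dry-run with a stub

# Print each distinct host listed in a PBS-style node file
# (one host per line, repeated once per processor slot).
unique_hosts() {
    sort -u "$1"
}

# From the current machine, log in to $1 and from there try to log in to $2.
# BatchMode=yes makes ssh fail immediately instead of prompting for a password.
check_pair() {
    $SSH -o BatchMode=yes -o ConnectTimeout=5 "$1" \
        ssh -o BatchMode=yes -o ConnectTimeout=5 "$2" true </dev/null
}

# Try every ordered pair of distinct hosts and print one line per pair.
check_all() {
    hosts=$(unique_hosts "$1")
    for src in $hosts; do
        for dst in $hosts; do
            [ "$src" = "$dst" ] && continue
            if check_pair "$src" "$dst" >/dev/null 2>&1; then
                echo "ok   $src -> $dst"
            else
                echo "FAIL $src -> $dst"
            fi
        done
    done
}

# Run only when a node file is given as the first argument.
if [ -n "$1" ]; then
    check_all "$1"
fi
```

Saved under a name such as check_ssh.sh (hypothetical) and given a file listing compute1 and compute2, it would print one ok or FAIL line for each direction between the two nodes.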

Is it *really* required that every compute node be able to log in to 
every other compute node in order to run jobs that do not include the 
head node? 
Interactively, I can run an MPI job using mpirun that does not use the 
head node.  But if I specify the nodes I want through PBS, the job 
fails due to login problems.
What am I missing here?


The error:
(gnome-ssh-askpass:16045): Gtk-WARNING **: cannot open display:
Permission denied, please try again.
MPIRUN.compute1: Some node programs ended prematurely without connecting 
to mpirun.
MPIRUN.compute1: No connection received from 4 node processes on node 

Here is my script:
#!/bin/csh -f
#PBS -N wrftest
#PBS -q research
#PBS -l nodes=compute1:ppn=4+compute2:ppn=4
mpirun -np 8 -machinefile $PBS_NODEFILE /WRF/run/wrf.exe

