[torqueusers] undelivered output of jobs

Martin Siegert siegert at sfu.ca
Thu Jun 5 14:56:44 MDT 2008

On Thu, Jun 05, 2008 at 04:24:10PM -0400, Steve Young wrote:
> Each user running a job needs to have all the host keys for all the  
> machines they will run on saved in their known_hosts file.  This  
> includes the name of the torque server.

No, you don't need to store the keys in the users' known_hosts files;
store them in the system wide /etc/ssh/ssh_known_hosts file instead.
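
One way to build that file is with ssh-keyscan. This is only a rough
sketch: the function name, the output path and the node names are
made up for illustration, and keys collected this way are unverified,
so compare them against the real host keys before trusting them.

```shell
# scan_cluster_keys OUTFILE HOST...
# Collect the public host keys of the given hosts into OUTFILE.
# (Hypothetical helper, not part of torque or OpenSSH.)
scan_cluster_keys() {
    out=$1
    shift
    # -t rsa requests RSA host keys only; adjust to the key types
    # your sshd actually offers.
    ssh-keyscan -t rsa "$@" > "$out"
}

# Example (as root, with your own host names):
#   scan_cluster_keys /etc/ssh/ssh_known_hosts headnode node001 node002
```

Remember to list every name the nodes use for each other, including
the name of the torque server.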

However, there is one more issue we just ran into:
To enable passwordless ssh we use hostbased authentication. This is
fine as long as the torque server is not on a public network.
However, if the torque server is also your login server/head node
for the cluster, this makes me nervous. Unfortunately, nothing in
the sshd configuration appears to allow restricting hostbased
access to a particular network.

I can think of two solutions to this problem:
1) Run a second sshd on the torque server that listens on a
   different port, e.g., 12345, and configure only that sshd to
   allow hostbased access. Block port 12345 on the public interface
   and configure torque with RCP_ARGS="-P 12345 -rpB".
2) Use rcp instead of scp and access restrictions in /etc/xinetd.d/rsh
   and/or /etc/hosts.allow. (afaik this solution does not work on
   large clusters because rsh can run out of ports).
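
For solution 1, the second sshd could be set up roughly as follows.
This is a minimal sketch, not a tested configuration: the file name,
the pid file path and the address 10.0.0.1 are assumptions, and the
regular sshd on port 22 is assumed to keep hostbased authentication
turned off (the default).

```shell
# Write a config for the second, internal-only sshd.
# CONF defaults to a scratch path; the real location is your choice.
CONF="${CONF:-./sshd_config_hostbased}"

cat > "$CONF" <<'EOF'
# Second sshd instance: hostbased authentication for the cluster only.
Port 12345
# Hypothetical private-network address of the torque server.
ListenAddress 10.0.0.1
HostbasedAuthentication yes
# Separate pid file so the two daemons do not collide.
PidFile /var/run/sshd_hostbased.pid
EOF

# Start it alongside the regular daemon (as root; sshd must be
# invoked with its full path):
#   /usr/sbin/sshd -f /path/to/sshd_config_hostbased
```

Binding to the private address with ListenAddress is an extra
safeguard on top of blocking port 12345 on the public interface.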

Anything else?


Martin Siegert
Head, Research Computing
WestGrid Site Lead
Client and Research Services               phone: 778 782-4691
Simon Fraser University                    fax:   778 782-4242
Burnaby, British Columbia                  email: siegert at sfu.ca
Canada  V5A 1S6

