[torqueusers] hostbased ssh mini-howto
widyono at seas.upenn.edu
Thu Nov 3 10:38:03 MST 2005
Greetings all,
Here is a summary (i.e. a mini-HOWTO that hasn't been cleaned up) of using
OpenSSH as process transport on Linux clusters under Torque. Hopefully it
will help others. I use Fermi Scientific Linux 4, YMMV. This is much more
than a couple of paragraphs, but it may fill in some gaps for
sysadmin-averse cluster operators.
Regards,
Dan Widyono
Liniac Project
University of Pennsylvania
============================================================================
There are two general methods. The first, mentioned here only in passing, is
to give each user an empty-passphrase key and copy its public half into that
user's authorized_keys file. We used this for several years, and while it
certainly works, it is awfully ugly to manage. Tip: name the key id_rsa_pbs
so it does not interfere with users who have their own ssh keys set up (for
external connections). In /etc/ssh/ssh_config use something like this:
Host node*
    IdentityFile ~/.ssh/id_rsa_pbs
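
For completeness, here is a minimal sketch of how such a per-user key might
be created. The helper name setup_pbs_key is made up for illustration, and
the sketch assumes home directories are shared across the nodes, so that
authorizing the key once is enough.

```shell
#!/bin/sh
# Sketch only: create a passphrase-less, cluster-only key pair for one user
# and authorize it. setup_pbs_key is a hypothetical helper name.
setup_pbs_key() {   # setup_pbs_key [SSH_DIR], run as the user
    ssh_dir=${1:-"$HOME/.ssh"}
    key="$ssh_dir/id_rsa_pbs"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    # Keep an existing key rather than overwriting it.
    if [ ! -f "$key" ]; then
        ssh-keygen -q -t rsa -N "" -f "$key" || return 1
    fi
    # Append the public key only if it is not already authorized.
    touch "$ssh_dir/authorized_keys"
    if ! grep -qxF -e "$(cat "$key.pub")" "$ssh_dir/authorized_keys"; then
        cat "$key.pub" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}
# Example: setup_pbs_key            (uses ~/.ssh)
```

Running it a second time should leave both the key and authorized_keys
unchanged.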
============================================================================
We are moving toward the second method, hostbased authentication, but this
was initially set back by unhelpful debugging output from OpenSSH and poor
existing documentation on the web. I finally bit the bullet, organized my
thoughts, and tested a configuration on a test cluster. I would welcome your
comments and feedback (*ESPECIALLY* regarding compression and cipher choices
and how they affect your throughput and latency with non-MPI but
intercommunicating tasks).
============================================================================
Hostbased ssh setup, with torque access control and minor performance tweaks:
On the sshd server side (which means everywhere, BUT a head node that
accepts external logins should have a more secure sshd_config):
/etc/ssh/shosts.equiv
headnode.internal.domain
node1.internal.domain
node2.internal.domain
...
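
Since the file is just one host name per line, it can be generated rather
than typed. The sketch below assumes the node<N> naming used in the listing
above; gen_shosts_equiv is a hypothetical helper, not part of OpenSSH.

```shell
#!/bin/sh
# Sketch only: print shosts.equiv contents for a head node plus COUNT
# compute nodes. The node<N> naming scheme is an assumption.
gen_shosts_equiv() {   # gen_shosts_equiv HEAD COUNT DOMAIN
    head=$1 count=$2 domain=$3
    echo "$head.$domain"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "node$i.$domain"
        i=$((i + 1))
    done
}
# Example: gen_shosts_equiv headnode 2 internal.domain > /etc/ssh/shosts.equiv
```

Note that for hostbased authentication sshd also has to know each client's
host public key, normally via /etc/ssh/ssh_known_hosts. Something like
"ssh-keyscan -t rsa -f /etc/ssh/shosts.equiv" is one way to collect those
keys, though I would verify the result before trusting it.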
/etc/ssh/sshd_config ((( ON INTERNAL NODES ONLY!!! )))
# Safety valve (root)
PubkeyAuthentication yes
# Main component
HostbasedAuthentication yes
# /etc/pbs_sshauth with pam_listfile.so (see below)
UsePAM yes
# Security measures
IgnoreUserKnownHosts yes
IgnoreRhosts yes
PermitUserEnvironment no
UseLogin no
PermitRootLogin without-password
# Reduce latency for MPI
LogLevel ERROR
Ciphers blowfish-cbc
Compression no
Protocol 2
# You might want to change the following on the head
# node, depending on your external network environment
# and group preferences
ChallengeResponseAuthentication no
PasswordAuthentication no
KerberosAuthentication no
GSSAPIAuthentication no
UseDNS no
PrintMotd no
PrintLastLog no
X11Forwarding no
# on head node this really should be yes
StrictModes no
# REMOVE / COMMENT OUT SFTP SUBSYSTEM ON COMPUTE NODES
# Subsystem sftp /usr/libexec/openssh/sftp-server
/etc/sysconfig/sshd
# Turn off IPV6 addresses
OPTIONS="-4"
/etc/pam.d/sshd (modified to use pam_listfile.so for access control)
#%PAM-1.0
# obviously on compute nodes only
auth required pam_stack.so service=system-auth
auth required pam_nologin.so
account required pam_stack.so service=system-auth
account sufficient pam_access.so
account required pam_listfile.so file=/etc/pbs_sshauth onerr=fail sense=allow item=user
password required pam_stack.so service=system-auth
session required pam_stack.so service=system-auth
#
# original, for sake of comparison
#auth required pam_stack.so service=system-auth
#auth required pam_nologin.so
#account required pam_stack.so service=system-auth
#password required pam_stack.so service=system-auth
#session required pam_stack.so service=system-auth
$PBS_DIR/mom_priv/prologue AND prologue.parallel
#!/bin/sh
# obviously on compute nodes only
# $2 is the job owner's user name (second argument Torque passes to prologue)
/bin/rm -f /etc/pbs_sshauth ; echo "$2" > /etc/pbs_sshauth ; exit 0
$PBS_DIR/mom_priv/epilogue AND epilogue.parallel
#!/bin/sh
# obviously on compute nodes only
/bin/rm -f /etc/pbs_sshauth ; echo "" > /etc/pbs_sshauth ; exit 0
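
To see how the PAM rule and the prologue/epilogue scripts fit together, here
is a sketch of the same logic with the file path as a parameter, so it can
be tried outside /etc. The function names are made up, and is_allowed only
approximates what pam_listfile does with sense=allow item=user (an exact
match of the user name against one line of the file).

```shell
#!/bin/sh
# Sketch only: prologue/epilogue logic with a configurable file path.
grant_ssh() {    # grant_ssh FILE USER  (what the prologue does with $2)
    printf '%s\n' "$2" > "$1"
}
revoke_ssh() {   # revoke_ssh FILE      (what the epilogue does)
    : > "$1"
}
is_allowed() {   # is_allowed FILE USER (rough stand-in for pam_listfile)
    grep -qxF -e "$2" "$1"
}
# Example:
#   grant_ssh /tmp/sshauth.test alice
#   is_allowed /tmp/sshauth.test alice && echo "alice may ssh in"
#   revoke_ssh /tmp/sshauth.test
```

Because the prologue overwrites the file, this scheme appears to assume that
a node runs only one user's job at a time.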
On the SSH Client side (everywhere):
/etc/ssh/ssh_config
FallBackToRsh no
EnableSSHKeysign yes
Host node* headnode.internal.domain headnode
BatchMode yes
ConnectionAttempts 5
ForwardX11 no
HostbasedAuthentication yes
PreferredAuthentications hostbased
CheckHostIP no
UserKnownHostsFile /dev/null
Ciphers blowfish-cbc
Compression no
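
Once both sides are configured, a quick loop over the node list shows which
hosts actually accept a hostbased login. This is a sketch, not a polished
tool: check_hostbased and the SSH_CMD variable are hypothetical names, and
SSH_CMD exists only so the ssh binary can be swapped for a stub in testing.
Run it as an ordinary user whose name is currently listed in
/etc/pbs_sshauth on the targets, otherwise PAM will refuse the login.

```shell
#!/bin/sh
# Sketch only: report which hosts accept a non-interactive hostbased login.
check_hostbased() {   # reads host names on stdin, one per line
    rc=0
    while read -r host; do
        # Skip blank lines and comments.
        case $host in ''|'#'*) continue ;; esac
        # -n keeps ssh from swallowing the rest of the host list on stdin.
        if "${SSH_CMD:-ssh}" -n -o BatchMode=yes \
               -o PreferredAuthentications=hostbased "$host" true; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            rc=1
        fi
    done
    return "$rc"
}
# Example: check_hostbased < /etc/ssh/shosts.equiv
```

For a failing host, rerunning the same ssh command by hand with -v is
usually the next step.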
Maintenance: shosts.equiv needs to be updated when new nodes are added. You
could use netgroups for this, either NIS or a netgroup file (not tested by
myself, but I've read of others doing so on Linux). You probably want to add
something at bootup to clear out /etc/pbs_sshauth. Revisit the cipher and
compression settings as improvements come into existence, for performance
gains.
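
The bootup cleanup could be as small as the following sketch, called from
whatever local startup script your distribution provides (rc.local is one
candidate). The function name is hypothetical; the chmod is my own
precaution rather than something tested on the cluster.

```shell
#!/bin/sh
# Sketch only: empty the access file at boot so a user name left over from
# a crash or power loss does not keep granting ssh access.
reset_sshauth() {   # reset_sshauth [FILE]
    f=${1:-/etc/pbs_sshauth}
    : > "$f" && chmod 600 "$f"
}
# Example (in a boot script): reset_sshauth
```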