[torqueusers] hostbased ssh mini-howto

widyono at seas.upenn.edu
Thu Nov 3 10:38:03 MST 2005

Greetings all,

Here is a summary (i.e. a mini-HOWTO that hasn't been cleaned up) of using
OpenSSH as the process transport on Linux clusters under Torque.  Hopefully
it will help others.  I use Fermi Scientific Linux 4; YMMV.  This is much
more than a couple of paragraphs, but it may fill in some dark areas for
sysadmin-averse cluster operators.

Dan Widyono
Liniac Project
University of Pennsylvania


There are two general methods to use.  The first will not be discussed fully
here, only mentioned in passing: each user gets an empty-passphrase key, and
the public half is copied into their authorized_keys file.  We used this for
several years, and while it certainly works, it is awfully ugly to manage.
Tip: use id_rsa_pbs as the key name so as not to interfere with users who
have their own ssh keys set up (for external connections).

In /etc/ssh/ssh_config use something like this:

Host node*
	IdentityFile ~/.ssh/id_rsa_pbs
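
A minimal sketch of creating such a per-user key, assuming the stock
ssh-keygen from OpenSSH.  The function name make_pbs_key and the SSH_DIR
override are illustrative only; SSH_DIR exists so the function can be tried
against a scratch directory instead of a real ~/.ssh.

```shell
# Create an empty-passphrase key named id_rsa_pbs and authorize it.
# SSH_DIR is a hypothetical override; it defaults to the user's ~/.ssh.
make_pbs_key() {
    ssh_dir=${SSH_DIR:-$HOME/.ssh}
    key=$ssh_dir/id_rsa_pbs
    auth=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1

    # Generate the key pair once; -N "" sets an empty passphrase.
    if [ ! -f "$key" ]; then
        ssh-keygen -q -t rsa -N "" -f "$key" || return 1
    fi

    # Authorize the public key, skipping it if the line is already present.
    pub=$(cat "$key.pub") || return 1
    touch "$auth" && chmod 600 "$auth" || return 1
    grep -qxF -- "$pub" "$auth" || printf '%s\n' "$pub" >> "$auth"
}
```

With home directories shared across the cluster, running make_pbs_key once
per user should be enough; otherwise authorized_keys has to reach every node
by some other means.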


We are moving toward the second method, hostbased authentication, but this
was initially set back by awful debugging output from OpenSSH and poor
existing documentation on the web.  I finally bit the bullet, organized my
thoughts, and tested a configuration on a test cluster, and I ask for your
comments and feedback (*ESPECIALLY* regarding compression and cipher and how
they affect your throughput and latency with non-MPI but intercommunicating


Hostbased ssh setup, with torque access control and minor performance tweaks:

On the SSHD Server side (which means everywhere, BUT!!! a head node with
external logins should have a more secure sshd_config):


	/etc/ssh/sshd_config  (((  ON INTERNAL NODES ONLY!!!  )))
		# Safety valve (root)
		PubkeyAuthentication		yes
		# Main component
		HostbasedAuthentication		yes
		# /etc/pbs_sshauth with pam_listfile.so (see below)
		UsePAM				yes
		# Security measures
		IgnoreUserKnownHosts		yes
		IgnoreRhosts			yes
		PermitUserEnvironment		no
		UseLogin			no
		PermitRootLogin			without-password
		# Reduce latency for MPI
		LogLevel			ERROR
		Ciphers				blowfish-cbc
		Compression			no
		Protocol			2
		# You might want to change the following on the head
		# node, depending on your external network environment
		# and group preferences
		ChallengeResponseAuthentication	no
		PasswordAuthentication		no
		KerberosAuthentication		no
		GSSAPIAuthentication		no
		UseDNS				no
		PrintMotd			no
		PrintLastLog			no
		X11Forwarding			no
		# on head node this really should be yes
		StrictModes			no
		# Subsystem       sftp    /usr/libexec/openssh/sftp-server

		# Turn off IPV6 addresses

	/etc/pam.d/sshd  (modified to use pam_listfile.so for access control)
		# obviously on compute nodes only
		auth       required     pam_stack.so service=system-auth
		auth       required     pam_nologin.so
		account    required     pam_stack.so service=system-auth
		account    sufficient   pam_access.so
		account    required     pam_listfile.so file=/etc/pbs_sshauth onerr=fail sense=allow item=user
		password   required     pam_stack.so service=system-auth
		session    required     pam_stack.so service=system-auth
		# original, for sake of comparison
		#auth       required     pam_stack.so service=system-auth
		#auth       required     pam_nologin.so
		#account    required     pam_stack.so service=system-auth
		#password   required     pam_stack.so service=system-auth
		#session    required     pam_stack.so service=system-auth
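
To reason about what the pam_listfile.so line will decide for a given user,
here is a rough shell stand-in for item=user sense=allow onerr=fail.  It is
only an approximation for debugging, not how PAM is implemented, and the
function name listfile_allows is made up for this example.  Note that with
pam_access.so marked sufficient, pam_listfile.so is only reached for users
that pam_access.so does not already admit.

```shell
# Approximates: pam_listfile.so item=user sense=allow onerr=fail
# Returns 0 (allow) only if the list file is readable and contains the
# user name as a whole line; anything else is a denial.
listfile_allows() {
    user=$1
    file=${2:-/etc/pbs_sshauth}

    [ -n "$user" ] || return 1
    # onerr=fail: a missing or unreadable file denies access
    [ -f "$file" ] && [ -r "$file" ] || return 1
    # sense=allow: succeed only on an exact whole-line match
    grep -qxF -- "$user" "$file"
}
```

For example, `listfile_allows alice && echo allowed` on a compute node shows
whether alice is the user currently written to /etc/pbs_sshauth.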

	$PBS_DIR/mom_priv/prologue   AND   prologue.parallel
		# obviously on compute nodes only
		# $2 is the job owner's user name, as passed in by pbs_mom
		/bin/rm -f /etc/pbs_sshauth ; echo "$2" > /etc/pbs_sshauth ; exit 0

	$PBS_DIR/mom_priv/epilogue   AND   epilogue.parallel
		# obviously on compute nodes only
		/bin/rm -f /etc/pbs_sshauth ; echo "" > /etc/pbs_sshauth ; exit 0
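
The two one-liners above could be sketched as small functions, which makes
them easier to try outside /etc.  PBS_SSHAUTH, allow_job_user and
clear_job_user are names invented for this sketch, not Torque settings.  Like
the one-liners, it assumes a single job (and so a single user) per node at a
time.

```shell
# Path of the list file read by pam_listfile.so; override for testing.
PBS_SSHAUTH=${PBS_SSHAUTH:-/etc/pbs_sshauth}

# Prologue side: pbs_mom passes the job owner's user name as argument 2.
allow_job_user() {
    user=$2
    [ -n "$user" ] || return 1
    tmp=$PBS_SSHAUTH.tmp.$$
    # Write to a temporary file first, then rename, so sshd never sees a
    # half-written list.  umask keeps the file from being world-writable.
    ( umask 022 && printf '%s\n' "$user" > "$tmp" ) && mv -f "$tmp" "$PBS_SSHAUTH"
}

# Epilogue side: leave an empty list so no user name matches.
clear_job_user() {
    : > "$PBS_SSHAUTH"
}
```

A prologue would then end with `allow_job_user "$@" ; exit 0` and an epilogue
with `clear_job_user ; exit 0`, keeping the explicit exit 0 from the
originals.  clear_job_user is also a candidate for a boot-time script.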

On the SSH Client side (everywhere):

	/etc/ssh/ssh_config
		FallBackToRsh			no
		EnableSSHKeysign		yes
		# Host patterns are separated by whitespace, not commas
		Host	node* headnode.internal.domain headnode
			BatchMode			yes
			ConnectionAttempts		5
			ForwardX11			no
			HostbasedAuthentication		yes
			PreferredAuthentications	hostbased
			CheckHostIP			no
			UserKnownHostsFile		/dev/null
			Ciphers				blowfish-cbc
			Compression			no
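
Once both sides are configured, a quick loop like the following sketch can
confirm that hostbased logins work without prompting.  The function name
check_hostbased and the SSH_CMD override are hypothetical; SSH_CMD only
exists so the loop can be dry-run with a stand-in command.

```shell
# Try a non-interactive, hostbased-only login to each host given as an
# argument and report the result.  Returns non-zero if any host failed.
check_hostbased() {
    ssh_cmd=${SSH_CMD:-ssh}
    failed=0
    for host in "$@"; do
        # BatchMode prevents password prompts; stdin is detached so ssh
        # cannot swallow input meant for the calling script.
        if $ssh_cmd -o BatchMode=yes -o PreferredAuthentications=hostbased \
                "$host" true </dev/null; then
            echo "$host ok"
        else
            echo "$host FAILED"
            failed=1
        fi
    done
    return $failed
}
```

Run it as an ordinary user, e.g. `check_hostbased node01 node02`.  Run as
that user on a compute node with no job of theirs, it also shows whether the
PAM access control is doing its part.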

Maintenance: shosts.equiv needs to be updated when new nodes are added.  You
could use netgroups for this, either NIS or a netgroup file (I have not
tested this myself, but I've read of others doing so on Linux).  You probably
want to add something at bootup to clear out /etc/pbs_sshauth.  Revisit the
cipher and compression tweaks as improvements come into existence, for
performance gains.
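
One way to keep shosts.equiv current, sketched under the assumption that the
node names can be taken from a Torque-style nodes file (host name as the
first field, e.g. "node01 np=4").  The function name build_shosts_equiv and
both paths are supplied by the caller and are illustrative.

```shell
# Rebuild an shosts.equiv-style file (one host name per line) from a
# nodes file.  Blank lines and lines starting with '#' are skipped.
build_shosts_equiv() {
    nodes_file=$1
    out_file=$2

    # Write to a temporary file and rename, so readers never see a
    # partially written list.
    awk 'NF && $1 !~ /^#/ { print $1 }' "$nodes_file" | sort -u > "$out_file.new" &&
        mv -f "$out_file.new" "$out_file"
}
```

For example, `build_shosts_equiv "$PBS_DIR/server_priv/nodes"
/etc/ssh/shosts.equiv` on the head node, followed by pushing the file out to
the nodes.  The head node itself would need to be added if it is not listed
in the nodes file.  The same host list could be fed to ssh-keyscan to refresh
/etc/ssh/ssh_known_hosts.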
