[torqueusers] configuring machine as both server and compute node - interface name confusion
lev at columbia.edu
Thu Mar 20 11:02:03 MDT 2014
Received from David Beer on Wed, Mar 19, 2014 at 06:38:25PM EDT:
> On Wed, Mar 19, 2014 at 3:15 PM, Lev Givon <lev at columbia.edu> wrote:
> > Any ideas as to why the name associated with the external interface is being
> > used even though it is not specified anywhere in the torque configuration?
> > Resolving the node01.local name via gethostbyname() returns the address of
> > the internal interface because nsswitch.conf is configured to look at mdns
> > when resolving names.
> The mom sends in its hostname that is returned to it through the system
> call gethostname.
> A simple workaround for this issue is to add the -A switch when pbs_mom is
> started:
>
> pbs_mom -A <node name in nodes file>
Doing the above prevents the previously reported error from occurring and does
enable the machine to be listed as free by pbsnodes -a when I run it as root. If
I attempt to run the command as a normal user (or try to submit a job), I
observe the following error:
$ pbsnodes -a
Can not resolve name for server node01.local. (rc = -1 - Unknown error -1)
Cannot resolve specified server host 'node01.local'.
pbsnodes: cannot connect to server node01.local, error=15010 (Access from host
not allowed, or unknown host)
I tried adding <submit_hosts>node01.local</submit_hosts> to
/var/spool/torque/server_priv/serverdb, but that didn't seem to have any effect.
Any thoughts as to why this is happening for non-root users?
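
For anyone trying to reproduce this, a minimal Python sketch along the following
lines could be run once as root and once as a normal user to compare what
gethostname() returns with how the name from the nodes file resolves. The
function name describe_names and its injectable arguments are hypothetical
helpers made up for illustration; only the standard socket module is used, and
the sketch assumes nothing about how torque itself performs the lookup.

```python
import socket


def describe_names(nodes_file_name,
                   get_hostname=socket.gethostname,
                   resolve=socket.gethostbyname):
    """Compare the local hostname with a name taken from the nodes file.

    Returns a tuple (local hostname, whether the two names are equal,
    resolved IPv4 address or None if resolution fails).
    The get_hostname and resolve parameters are hypothetical hooks that
    exist only so the function can be exercised without a real network.
    """
    local = get_hostname()
    try:
        address = resolve(nodes_file_name)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError
        address = None
    return local, local == nodes_file_name, address


if __name__ == "__main__":
    # node01.local is the name used in this thread; substitute your own.
    local, matches, address = describe_names("node01.local")
    print("gethostname() returns:", local)
    print("matches nodes file name:", matches)
    print("node01.local resolves to:", address)
```

If the output differs between root and a normal user, that would suggest the
name resolution configuration is not being applied uniformly, though this
sketch cannot by itself confirm that this is the cause of the error above.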