[torqueusers] configuring machine as both server and compute node - interface name confusion
lev at columbia.edu
Sat Mar 22 20:55:00 MDT 2014
Received from Lev Givon on Thu, Mar 20, 2014 at 01:02:03PM EDT:
> Received from David Beer on Wed, Mar 19, 2014 at 06:38:25PM EDT:
> > On Wed, Mar 19, 2014 at 3:15 PM, Lev Givon <lev at columbia.edu> wrote:
> > > Any ideas as to why the name associated with the external interface is being
> > > used even though it is not specified anywhere in the torque configuration?
> > > Resolving the node01.local name via gethostbyname() returns the address of
> > > the internal interface because nsswitch.conf is configured to look at mdns
> > > when resolving names.
> > Lev,
> > The mom reports to the server the hostname that it gets back from the
> > gethostname system call.
> > A simple workaround for this issue is to add the -A switch when pbs_mom is
> > started:
> > pbs_mom -A <node name in nodes file>
> Thanks, David.
> Doing the above prevents the previously reported error from occurring and does
> enable the machine to be listed as free by pbsnodes -a when I run it as root. If
> I attempt to run the command as a normal user (or try to submit a job), I
> observe the following error:
> $ pbsnodes -a
> Can not resolve name for server node01.local. (rc = -1 - Unknown error -1)
> Cannot resolve specified server host 'node01.local'.
> pbsnodes: cannot connect to server node01.local, error=15010 (Access from host
> not allowed, or unknown host)
> I tried adding <submit_hosts>node01.local</submit_hosts> to
> /var/spool/torque/server_priv/serverdb, but that didn't seem to have any effect.
> Any thoughts as to why this is happening for non-root users?
For the record, the latter problem was due to /etc/nsswitch.conf not being
world-readable.
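
In case it helps anyone who hits the same symptom (name resolution works for
root but fails for normal users), here is a minimal sketch of how one might
check the permission bit in question. The helper name is_world_readable is my
own and not part of torque; the script only inspects the file's mode bits and
does not check whether the parent directories are traversable.

```python
import os
import stat


def is_world_readable(path):
    """Return True if the 'other' read permission bit is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    # Path taken from this thread; adjust if your system keeps it elsewhere.
    path = "/etc/nsswitch.conf"
    if os.path.exists(path):
        print(path, "world-readable:", is_world_readable(path))
    else:
        print(path, "not found")
```

If this reports False, restoring read access for others (for example with
chmod o+r as root) would be the usual remedy, though that is an assumption on
my part rather than something established earlier in this thread.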