[torqueusers] configuring machine as both server and compute node - interface name confusion

Lev Givon lev at columbia.edu
Thu Mar 20 11:02:03 MDT 2014


Received from David Beer on Wed, Mar 19, 2014 at 06:38:25PM EDT:
> On Wed, Mar 19, 2014 at 3:15 PM, Lev Givon <lev at columbia.edu> wrote:

(snip)

> > Any ideas as to why the name associated with the external interface is being
> > used even though it is not specified anywhere in the torque configuration?
> > Resolving the node01.local name via gethostbyname() returns the address of
> > the internal interface because nsswitch.conf is configured to look at mdns
> > when resolving names.
>
> Lev,
> 
> The mom sends in its hostname that is returned to it through the system
> call gethostname.
> 
> A simple workaround for this issue is to add the -A switch when pbs_mom is
> started:
> 
> pbs_mom -A <node name in nodes file>

Thanks, David. 

Doing the above prevents the previously reported error from occurring and does
enable the machine to be listed as free by pbsnodes -a when I run it as root. If
I attempt to run the command as a normal user (or try to submit a job), I
observe the following error:

$ pbsnodes -a
Can not resolve name for server node01.local. (rc = -1 - Unknown error -1)
Cannot resolve specified server host 'node01.local'.
pbsnodes: cannot connect to server node01.local, error=15010 (Access from host
not allowed, or unknown host)

I tried adding <submit_hosts>node01.local</submit_hosts> to
/var/spool/torque/server_priv/serverdb, but that didn't seem to have any effect.

Any thoughts as to why this is happening for non-root users?
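
One way to narrow this down might be to run the same name lookup as root and as
the normal user and compare the results. The following is only a minimal
sketch, assuming Python is available on the node; the function name
resolution_report is made up for illustration, and it uses only the standard
socket module. It reports what gethostname() returns and whether the server
name resolves for the calling user.

```python
import socket


def resolution_report(server_name):
    """Report this process's hostname and how server_name resolves for it.

    resolution_report is a hypothetical helper, not part of Torque.
    """
    report = {
        # Name returned by the gethostname system call
        "gethostname": socket.gethostname(),
        "server": server_name,
    }
    try:
        # Goes through the system resolver (and hence nsswitch.conf on Linux)
        report["address"] = socket.gethostbyname(server_name)
        report["error"] = None
    except OSError as exc:
        # Resolution failed for this user/environment
        report["address"] = None
        report["error"] = str(exc)
    return report


if __name__ == "__main__":
    # node01.local is the server name from the error message above
    print(resolution_report("node01.local"))
```

If the lookup succeeds as root but fails as the normal user, the difference
would presumably lie in the resolver setup seen by that user rather than in
the torque configuration itself.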
-- 
Lev Givon
Bionet Group
http://www.columbia.edu/~lev/
http://lebedov.github.io/


