[torqueusers] configuring machine as both server and compute node - interface name confusion

Lev Givon lev at columbia.edu
Wed Mar 19 15:15:28 MDT 2014

I'm trying to configure a system running Ubuntu 13.10 (x86_64) and torque
4.5.0pre1 (manually compiled and installed) to serve both as a torque server and
a compute node. This machine has both a public and internal network interface;
the latter is connected to a private network ( that communicates
with other Ubuntu 13.10 systems (which each have a single interface attached to
the private network) that will eventually be added to the torque configuration
as compute nodes. I've configured the system to set the hostname associated with
its internal interface (node01.local) using avahi (zeroconf); I've verified that
I can use this hostname to access the system on the internal network. I used
this hostname in the pbs_server and pbs_mom configurations (i.e.,
/var/spool/torque/torque.cfg, /var/spool/torque/mom_priv/config,
/var/spool/torque/server_priv/nodes, and
/var/spool/torque/server_priv/serverdb); when I start all of the torque daemons
(pbs_server, pbs_sched, pbs_mom, and trqauthd), however, it seems that
pbs_server tries to use the name associated with the external interface (master)
despite what is specified in the config files (excerpt from the server logs):

03/19/2014 14:50:31;0006;PBS_Server.1913;Svr;PBS_Server;Using ports Server:15001 Scheduler:15004  MOM:15002 (server: 'master.ee.columbia.edu')
03/19/2014 14:51:01;0001;PBS_Server.1920;Svr;PBS_Server;LOG_ERROR::get_node_from_str, Node node01.local is reporting on node master, which pbs_server doesn't know about

Any ideas as to why the name associated with the external interface is being
used even though it is not specified anywhere in the torque configuration?
Resolving the node01.local name via gethostbyname() returns the address of the
internal interface because nsswitch.conf is configured to look at mdns when
resolving names.
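
One way to narrow this down is to compare what the machine's own hostname
resolves to with what the configured name resolves to. The sketch below is
only a diagnostic aid, not a description of what pbs_server does internally:
it assumes the server name in the log comes from the system hostname rather
than from the configuration files, which the log output suggests but does not
confirm. The helper names resolve() and compare_names() are made up for this
example.

```python
import socket


def resolve(name, resolver=socket.gethostbyname_ex):
    """Return (canonical_name, addresses) for name, or (None, []) on failure."""
    try:
        canonical, _aliases, addresses = resolver(name)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None, []
    return canonical, addresses


def compare_names(local_name, configured_name,
                  resolver=socket.gethostbyname_ex):
    """Resolve both names and report whether they share any address."""
    local = resolve(local_name, resolver)
    configured = resolve(configured_name, resolver)
    shared = set(local[1]) & set(configured[1])
    return {
        "local": local,            # what the system hostname maps to
        "configured": configured,  # what the name in the torque config maps to
        "same_address": bool(shared),
    }
```

Calling compare_names(socket.gethostname(), "node01.local") on the server
shows whether the two names map to the same address. If "same_address" is
False and the canonical local name is the external one, that would be
consistent with the mismatch reported in the log above.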

Lev Givon
Bionet Group
