[torquedev] torque server: setting the server name

Martin Siegert siegert at sfu.ca
Fri Jun 1 17:20:04 MDT 2012


moving this to the dev list ...

On Tue, May 29, 2012 at 01:36:15PM -0700, Martin Siegert wrote:
> Hi David,
> I will definitely add --with-tcp-retry-limit=5 to my configure options,
> since we did run into exactly that situation. However, the current
> situation is due to an ip mismatch between private and public ip address
> of the torque server: svr_connect.c, line 172
>   if ((hostaddr == pbs_server_addr) && (port == pbs_server_port_dis))
>     {
>     return(PBS_LOCAL_CONNECTION); /* special value for local */
>     }
> In our case: hostaddr = and pbs_server_addr =
> The former ip address is the (correct) ip address on the internal
> cluster network, the latter ip address is the public ip address and
> should not be used by torque anywhere.
> We have in /etc/hosts
> b0
> and then set the server name in 4 (!!) different places:
> 1) in qmgr we have
> set server server_name = b0
> 2) /var/spool/torque/server_name contains b0
> 3) /var/spool/torque/torque.cfg contains
> 4) we configure with
> --with-default-server=b0
> I always thought that it should be sufficient to set this once.
> Obviously I am wrong ... I am missing at least a fifth spot where
> I need to set this: how do I get torque server to set pbs_server_addr
> in svr_connect to
> For now we used the following workaround:
> 1) in /etc/hosts set
> hostname.domain.ca hostname b0
> 2) restart torque server and wait a few seconds until qstat, etc.
> responds.
> 3) change /etc/hosts back to
> b0
> This does "solve" the problem for now.
> I am still looking for a more permanent solution.

I did miss a fifth (and actually a sixth) way of setting the server name:

5) start the server with the -H b0 commandline option.

As it turns out this is the only way. Methods 1-4 have no effect.
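
To make the failure mode concrete, here is a minimal Python sketch of the
comparison quoted from svr_connect.c above. It is an illustration only, not
TORQUE code: the function name, the REMOTE_CONNECTION label, the port number,
and both addresses (placeholders from private/documentation ranges, since the
real ones are not shown here) are assumptions.

```python
import ipaddress

PBS_LOCAL_CONNECTION = "local"    # stand-in for the C constant
REMOTE_CONNECTION = "remote"      # hypothetical label for every other case


def classify_connection(hostaddr, port, pbs_server_addr, pbs_server_port):
    """Mirror the quoted check: local only if address AND port both match."""
    same_addr = ipaddress.ip_address(hostaddr) == ipaddress.ip_address(pbs_server_addr)
    if same_addr and port == pbs_server_port:
        return PBS_LOCAL_CONNECTION
    return REMOTE_CONNECTION


# Hypothetical values: 10.0.0.1 plays the internal cluster address,
# 192.0.2.10 plays the public one, 15001 is an assumed server port.
INTERNAL, PUBLIC, PORT = "10.0.0.1", "192.0.2.10", 15001

# Server recorded its public address, request targets the internal one.
print(classify_connection(INTERNAL, PORT, PUBLIC, PORT))
# Server recorded the internal address (for example when started with -H b0).
print(classify_connection(INTERNAL, PORT, INTERNAL, PORT))
```

With mismatched addresses the check never takes the local branch, which is
consistent with what we observed.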

At this point I am wondering why we need 5 ways of setting the server
name. As a first step, can somebody tell me what each of the 5 settings
is meant to do?

This is my take:

1) in qmgr:

set server server_name = b0

As far as I can tell this has no effect. Can this be eliminated?

2) /var/spool/torque/server_name

This is essential: used by the clients (qsub, qstat, etc.) and also by
the mom (if no $pbsserver is specified in mom_priv/config). Not used
by the torque server.

3) torque.cfg

Read by qsub only. The man page says:
SERVERHOST specifies the value for the PBS_SERVER environment variable

I find this confusing: why would you want to set that environment variable
to something different than what is read from the server_name file?
In other words: what is the use case for having SERVERHOST set to something
different than what is in the server_name file?

Is it safe to say that this is not needed when the server_name file is in
place?

4) configure option --with-default-server=b0
Does this have any effect?

5) pbs_server -H b0 commandline option
Essential. Determines the IP address to be used for the server.
If it is not given, gethostname is used to determine the IP address.
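
The behaviour described in (5) can be sketched as follows, in Python rather
than TORQUE's C. This is a guess at the logic based on the observations above,
not the actual server code; pick_server_addr, its resolve parameter, and the
name table are hypothetical.

```python
import socket


def pick_server_addr(host_override=None, resolve=socket.gethostbyname):
    """Return the IPv4 address the server would treat as its own.

    host_override models the -H option; without it the machine's own
    hostname is resolved, which may map to the public address.
    """
    name = host_override if host_override else socket.gethostname()
    return resolve(name)


# Hypothetical name table standing in for /etc/hosts and DNS.
fake_hosts = {"b0": "10.0.0.1", "hostname.domain.ca": "192.0.2.10"}

# With an explicit override the internal name decides the address.
print(pick_server_addr("b0", resolve=fake_hosts.__getitem__))
```

If the machine's hostname resolves to the public address, only the override
yields the internal one, which would explain why -H made the difference.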

6) $pbsserver setting in mom_priv/config
Used by the mom for connecting to server; not needed when
server_name file is in place.
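
The lookup order described in (2) and (6) for the mom might look like this.
A minimal Python sketch, assuming that the relevant lines in mom_priv/config
have the form "$pbsserver <name>" and that the first such line wins; the
function name, the file handling, and the host name b0-internal are
illustrative, not taken from the TORQUE sources.

```python
import os
import tempfile


def mom_server_name(mom_config_path, server_name_path):
    """Prefer $pbsserver from the mom config, else fall back to server_name."""
    if os.path.exists(mom_config_path):
        with open(mom_config_path) as cfg:
            for line in cfg:
                fields = line.split()
                if len(fields) >= 2 and fields[0] == "$pbsserver":
                    return fields[1]
    with open(server_name_path) as f:
        return f.read().strip()


# Demo with throwaway files instead of the real /var/spool/torque paths.
with tempfile.TemporaryDirectory() as d:
    cfg_path = os.path.join(d, "config")
    name_path = os.path.join(d, "server_name")
    with open(name_path, "w") as f:
        f.write("b0\n")
    # No mom config yet: the server_name file is used.
    print(mom_server_name(cfg_path, name_path))
    with open(cfg_path, "w") as f:
        f.write("$pbsserver b0-internal\n")  # hypothetical host name
    # A $pbsserver line takes precedence over the server_name file.
    print(mom_server_name(cfg_path, name_path))
```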

Is my assessment correct that only (2) and (5) are really needed,
and that (1), (4), and possibly (3) do not serve any purpose?


