[torqueusers] I get 'Unauthorized Request' to every qmgr command
siegert at sfu.ca
Sun Mar 26 20:26:39 MST 2006
On Sun, Mar 26, 2006 at 09:23:34PM -0500, Stewart.Samuels at sanofi-aventis.com wrote:
> The $PBS_HOME/server_name needs to be the same name as that associated with your primary interface. This is usually the name associated with eth0 on multi-homed systems but not necessarily. If you are running a peer-to-peer cluster, that is, all nodes have a single connection to a common network, this would be the case. If you are running a beowulf cluster with a public and private network, then the name in the server_name file should be the name that DNS or /etc/hosts associates with the "public" interface.
> I use 3 interfaces. One for the private network (eth0), one for the public network (eth1), and one for HA (eth2). The name I have to compile torque with is that associated with eth1. This places the correct name in the "server_name" file.
> Hope this clarifies things.
This is not necessary. I have 3 interfaces as well:
eth0: public interface, associated with the hostname, DNS name of the box
eth1: private interface, name r1
eth2: NFS network, name r1-nfs
The compute nodes only know the torque server by the name r1 associated
with eth1. Torque is compiled using that name and the server_name file
contains that name as well. Furthermore, the $PBS_HOME/torque.cfg file
contains a line
Works like a charm.
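
For anyone who wants to check which interface a given server_name entry ends up on, here is a minimal sketch in Python. It only parses hosts-file text and reports (a) the address the server_name maps to and (b) the first non-loopback name in the file, the two things discussed in this thread. The addresses, the name head.example.org, and the helper function names are made up for illustration; only r1 and r1-nfs come from the setup described above. It does not claim to reproduce how torque itself resolves names.

```python
def parse_hosts(text):
    """Return a list of (address, [names]) entries from hosts-file text."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address without any name is not a usable entry
        entries.append((fields[0], fields[1:]))
    return entries


def is_loopback(address):
    """True for IPv4 127.x.x.x addresses and the IPv6 loopback ::1."""
    return address.startswith("127.") or address == "::1"


def first_non_loopback_name(entries):
    """Return the first name of the first non-loopback entry, or None."""
    for address, names in entries:
        if not is_loopback(address):
            return names[0]
    return None


def lookup(entries, name):
    """Return the address of the first entry that lists `name`, or None."""
    for address, names in entries:
        if name in names:
            return address
    return None


# Hypothetical example data (addresses are assumptions, not from the thread).
HOSTS = """\
127.0.0.1   localhost
192.0.2.10  head.example.org head   # eth0, public
10.0.0.1    r1                      # eth1, private
10.0.1.1    r1-nfs                  # eth2, NFS
"""
SERVER_NAME = "r1\n"  # contents of the server_name file

entries = parse_hosts(HOSTS)
server = SERVER_NAME.strip()
print("server_name:", server)
print("maps to:", lookup(entries, server))
print("first non-loopback name:", first_non_loopback_name(entries))
```

To run this against a real machine, read the text from /etc/hosts and from the server_name file under your own $PBS_HOME instead of using the embedded strings. In the example data the server_name does not match the first non-loopback name, which mirrors the kind of setup described above.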
Head, HPC at SFU
WestGrid Site Manager
Academic Computing Services phone: (604) 291-4691
Simon Fraser University fax: (604) 291-4242
Burnaby, British Columbia email: siegert at sfu.ca
Canada V5A 1S6
> -----Original Message-----
> From: Mikko Huhtala [mailto:mhuhtala at abo.fi]
> Sent: Sunday, March 26, 2006 12:38 PM
> To: Samuels, Stewart PH/US; torqueusers at supercluster.org
> Subject: RE: [torqueusers] I get 'Unauthorized Request' to every qmgr
> Stewart.Samuels at sanofi-aventis.com writes:
> > Mikko,
> > You might want to check that the name in the $PBS_HOME/server_name file is correct and matches what torque was compiled with. That is, what the moms think is the server. A mismatch here can trigger this type of error. It has happened to me.
> > Stewart
> server_name is correct, but the culprit is /etc/hosts. This machine
> has multiple network interfaces and the wrong one was listed
> first. So, for the record, server_name must match the first
> non-localhost name in /etc/hosts, I guess, or disaster will ensue. Is
> that right? On this machine, there are 3 network interfaces, and the
> order of eth0 eth1 eth2 does not seem to matter, but only the order in