[torquedev] torque-2.1.x cannot read its own configuration?

Garrick Staples garrick at usc.edu
Fri Aug 31 15:51:37 MDT 2007


On Fri, Aug 31, 2007 at 02:39:43PM -0700, Martin Siegert alleged:
> On Fri, Aug 31, 2007 at 02:14:36PM -0700, Garrick Staples wrote:
> > On Fri, Aug 31, 2007 at 02:02:19PM -0700, Martin Siegert alleged:
> > > Hi,
> > > 
> > > I am running into the following problem ever since we switched to
> > > torque-2.1.x (actually I tried only 2.1.2 and 2.1.6):
> > > 
> > > All commands are run by root:
> > > 
> > > # qmgr -c 'p s' > /etc/sysconfig/torque_server.conf
> > > # qmgr < /etc/sysconfig/torque_server.conf
> > > Max open servers: 4
> > > qmgr obj= svr=default: Unauthorized Request 
> > 
> > It's not that qmgr doesn't understand the request, it is that the request is
> > not authorized.  It is a permission problem.
> > 
> > When you run 'qmgr', it connects to the server hostname listed in
> > $PBS_SERVER_HOME/server_name.  The pbs_server daemon running on that host isn't
> > allowing your connection.  Your "server_name" file probably has something other
> > than the actual hostname.
> > 
> > Try 'qmgr localhost' or 'qmgr `hostname`' when reading in the config.
> 
> Thanks! 'qmgr localhost' has the same problem, but 'qmgr `hostname`'
> actually works!
> 
> Your assumption that the server_name file has something other than
> the actual hostname is correct - this is a multihomed server and the
> server_name file contains the hostname associated with the private
> cluster network.
> 
> Can somebody actually explain to me what the correct configuration is
> under these circumstances? As far as I know the server name has to
> be entered at (at least?) three places:

Just adjust the server's managers to include what you want.
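
A rough sketch of what that could look like. The helper function and the
second host name are made up for illustration (b001 is the private name
mentioned below); 'set server managers += user@host' is ordinary qmgr syntax,
but check it against your version before relying on it.

```shell
# Hypothetical helper: print one qmgr directive per host name so that the
# given user is accepted as a manager under each name the server is known by.
manager_directives() {
    user=$1
    shift
    for host in "$@"; do
        printf 'set server managers += %s@%s\n' "$user" "$host"
    done
}

# Placeholder host names; substitute the private and public names of your server.
manager_directives root b001 headnode.example.org

# To apply the directives instead of printing them, pipe them into qmgr, e.g.:
#   manager_directives root b001 headnode.example.org | qmgr "$(hostname)"
```

Printing the directives first makes it easy to review them before they are
fed to the server.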

 
> 1) /var/spool/torque/server_name (qmgr is reading this)
> 2) /var/spool/torque/torque.cfg (SERVERHOST specification; qsub is reading
>    this)
> 3) in the server's database: qmgr -c 's s server_name = b001'
> 
> Until now I assumed that the same name would have to be entered in all
> three places - obviously incorrect. Thus, is the following correct:
> 1) contains `hostname` (assuming that qmgr is run on the server),
> 2) and 3) have the hostname associated with the private cluster
> interface. Correct?

The first is used by PBS clients to find the server.

The second is recorded into the job.  I think this is only used for interactive
jobs so that pbs_mom can find the running qsub process.

The third is the name that pbs_server uses for itself.

I agree that this doesn't seem entirely coherent, but these are potentially all
different values.
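
To compare the first two values side by side, something like the following
might help. It is only a sketch: the function name is made up, the default
path is the one quoted above, and it assumes torque.cfg holds a
whitespace-separated "SERVERHOST <name>" line.

```shell
# Hypothetical helper: show the server names recorded in the files under a
# TORQUE home directory (defaults to the path quoted in this thread).
show_server_names() {
    home=${1:-/var/spool/torque}
    if [ -r "$home/server_name" ]; then
        # Name that PBS clients such as qmgr use to find the server.
        printf 'server_name: %s\n' "$(cat "$home/server_name")"
    fi
    if [ -r "$home/torque.cfg" ]; then
        # Assumed format: "SERVERHOST <name>", separated by whitespace.
        awk '$1 == "SERVERHOST" { print "SERVERHOST: " $2 }' "$home/torque.cfg"
    fi
}

show_server_names "${PBS_SERVER_HOME:-/var/spool/torque}"

# The third value lives in pbs_server's own settings; the command used
# earlier in this thread, qmgr -c 'p s', prints those.
```

If the two files disagree, that is not necessarily an error, since each name
is consulted by a different component.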
