[torqueusers] jobids and server names?
Roy Dragseth
roy.dragseth at cc.uit.no
Wed Dec 5 13:15:14 MST 2007
On Saturday 22 September 2007, Roy Dragseth wrote:
> On Friday 21 September 2007, Garrick Staples wrote:
> > On Fri, Sep 21, 2007 at 09:44:19AM +0200, Roy Dragseth alleged:
> > > Hi.
> > >
> > > Is it possible to change the name string that gets attached to the
> > > jobid number to anything else than the name of the server running
> > > pbs_server?
> > >
> > > I want to set up a cluster with login nodes and hide the real frontend
> > > from the users in the following way:
> > >
> > > Public name: my_cluster.domain.org
> > > Login node 1: my_login1.domain.org
> > > Login node 2: my_login2.domain.org
> > > Core node: my_cluster_core.domain.org
> > >
> > > my_login1 and my_login2 shall have some ip takeover mechanism for the
> > > address associated with my_cluster.domain.org.
> > >
> > > pbs_server runs on my_cluster_core, and through the standard config all
> > > jobs would have jobids like 12345.my_cluster_core.domain.org.
> > > Is it possible to make the jobids look like 12345.my_cluster.domain.org
> > > instead?
> >
> > That's the server_name server attribute that you can set with qmgr.
>
> Thanks, that did the trick.
>
> The server_name seems to be restricted to valid hostnames, any reason for
> that? Does it have any meaning or is it just a string attached to the
> jobnumber?
>
Setting server_name seems to create problems when one wants to use the
jobid for something. For instance, try querying a job with qstat -f. Using the
example above:
I have set server_name = my_cluster.domain.org (which is an alias for the
login nodes that use IP takeover), and pbs_server is running on
my_cluster_core.domain.org.
[root@my_cluster_core named]# qstat -a

my_cluster.domain.org:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
31.my_cluster.domain.org royd default  STDIN          --     1  --     -- 02:46 R    --
[root@my_cluster_core named]# qstat -f 31
qstat: Unknown Job Id 31.my_cluster_core.domain.org
[root@my_cluster_core named]# qstat -f 31@my_cluster_core.domain.org
qstat: Unknown Job Id 31.my_cluster_core.domain.org
[root@my_cluster_core named]# qstat -f 31.my_cluster.domain.org
Connection refused
qstat: cannot connect to server my_cluster.domain.org (errno=111)
However, using jobid@real-pbs-server-name works:
qstat -f 31.my_cluster.domain.org@my_cluster_core.domain.org
gives the desired result, but I do not want to expose my users to this.
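One stopgap might be a small wrapper that fills in the long form on the users' behalf. A rough sketch in shell follows; full_jobid, PUBLIC_NAME and REAL_SERVER are names made up for illustration, and the two host names are simply the ones from the example above:

```shell
#!/bin/sh
# Hypothetical helper: expand a job number into jobid@real-pbs-server-name.
# Both values below are assumptions taken from the example setup.
PUBLIC_NAME="my_cluster.domain.org"        # what server_name is set to
REAL_SERVER="my_cluster_core.domain.org"   # host actually running pbs_server

full_jobid() {
    case "$1" in
        *@*) printf '%s\n' "$1" ;;                    # already has @server, leave as-is
        *.*) printf '%s@%s\n' "$1" "$REAL_SERVER" ;;  # has a name suffix, add @server
        *)   printf '%s.%s@%s\n' "$1" "$PUBLIC_NAME" "$REAL_SERVER" ;;  # bare number
    esac
}
```

Users would then run something like qstat -f "$(full_jobid 31)". This only hides the long form behind one command; it does not fix the lookup itself.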
Any thoughts on how to fix this?
r.
--
The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
phone:+47 77 64 41 07, fax:+47 77 64 41 00
Roy Dragseth, Team Leader, High Performance Computing
Direct call: +47 77 64 62 56. email: royd at cc.domain.org