[torqueusers] Problems with qmgr
Saurabh Barve
sbarve at nps.edu
Tue Aug 28 12:17:51 MDT 2007
>> Due to a dual-NIC setup, there is a conflict between the hostname on the
>> internal network (head: 172.16.100.1) and the one on the external network
>> (wilma: 172.20.*.*). As a result the 'qmgr' command returns error messages
>> for my commands.
>>
>
> Is 'wilma' (172.20.*.*) associated with the first network-device (eth0) and
> 'head' (172.16.100.1) with the second (e.g. eth1)?
No. It is the other way round. The external IP address ('wilma':172.20.*) is
associated with eth1 and the internal IP address ('head': 172.16.*) is
associated with eth0.
> Then you should try to start the pbs_server with this extension:
>
> #> pbs_server -S wilma:15004
The 'pbsnodes -a' command gives me reasonable output:
==========
[root@wilma ~]# pbsnodes -a
head
state = free
np = 8
ntype = cluster
status = opsys=linux,uname=Linux wilma 2.6.9-55.ELlargesmp #1 SMP Wed
...
...
==========
> Please, don't forget to restart 'maui' after you started 'pbs_server',
> sometimes this solves a strange behaviour ...
I tried to change the SERVERHOST variable in maui.cfg to 'head', but then
the maui service wouldn't start:
==========
[root@wilma ~]# service maui start
Starting maui: ERROR: server must be started on host 'head' (currently on 'wilma.<snipped>')
[FAILED]
==========
I still get errors when I try to use 'head' in the qmgr commands:
----------------------
Qmgr: set server tcp_timeout=5
qmgr obj= svr=default: Unauthorized Request
Qmgr: set head tcp_timeout=5
qmgr: Illegal object type: head.
Qmgr: set server head tcp_timeout=5
qmgr obj=head svr=head: Unauthorized Request
---------------------
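A quick check that may help narrow this down: TORQUE client commands such as qmgr read the name of the default server from the server_name file under the TORQUE home directory. The sketch below prints that name next to the local hostname. The /var/spool/torque default and the helper name show_default_server are assumptions, not taken from this setup.

```shell
#!/bin/sh
# Sketch only: PBS_HOME defaulting to /var/spool/torque is an assumption
# about the install; show_default_server is a hypothetical helper name.
show_default_server() {
    file="${PBS_HOME:-/var/spool/torque}/server_name"
    if [ -r "$file" ]; then
        # The first line of server_name is the server clients contact by default.
        head -n 1 "$file"
    else
        echo "cannot read $file" >&2
        return 1
    fi
}

# Print both names only when the file is readable, so a mismatch is easy to spot.
if name=$(show_default_server 2>/dev/null); then
    echo "default server: $name"
    echo "local hostname: $(uname -n)"
fi
```

If the two names differ, that would be consistent with 'set server ...' going to one name while 'set server wilma ...' goes to another.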
But using 'wilma' seems to work:
---------------------
Qmgr: set server wilma tcp_timeout=5
Qmgr: list server
<snipped>
Server wilma
server_state = Active
scheduling = True
<snipped>
tcp_timeout = 5
pbs_version = 2.1.2
...
...
Qmgr: create queue mtm@wilma
Qmgr: list queue
Queue mtm
total_jobs = 0
state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0
mtime = Tue Aug 28 11:04:45 2007
---------------------
So I reset SERVERHOST to 'wilma' and restarted maui. It started without
errors.
But once I quit 'qmgr', the queue information is not saved. I set the
default 'batch' queue active, but my qsub-based job wouldn't run. When I went
back into qmgr, no active queues were displayed. There does not seem to be a
'save' option in 'qmgr'.
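To compare what the server holds before and after quitting qmgr, one option is to dump the configuration with 'print server', which emits the server and queue settings as qmgr commands. A minimal sketch, assuming qmgr is on the PATH and the caller is allowed to query the server; the helper name dump_qmgr_config and the output file are hypothetical.

```shell
#!/bin/sh
# Sketch only: dump_qmgr_config is a hypothetical helper, not part of TORQUE.
dump_qmgr_config() {
    out="$1"
    if ! command -v qmgr >/dev/null 2>&1; then
        echo "qmgr not found on PATH" >&2
        return 1
    fi
    # 'print server' writes the server and queue settings as qmgr commands.
    qmgr -c 'print server' > "$out"
}
```

Calling it once before and once after a qmgr session, then running diff on the two files, should show whether the queue changes reached the server at all. A dump produced this way can also be fed back with 'qmgr < file'.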
-Saurabh
--
Saurabh Barve
sbarve at nps.edu
831-656-3396