[torqueusers] Problems with qmgr

Saurabh Barve sbarve at nps.edu
Tue Aug 28 12:17:51 MDT 2007


>> Due to a dual-NIC setup, there is a conflict between the hostname on the
>> internal network (head: 172.16.100.1) and the one on the external network
>> (wilma: 172.20.*.*). As a result the 'qmgr' command returns error messages
>> for my commands.
>> 
> 
> Is 'wilma' (172.20.*.*) associated with the first network-device (eth0) and
> 'head' (172.16.100.1) with the second (e.g. eth1)?


No. It is the other way round. The external IP address ('wilma':172.20.*) is
associated with eth1 and the internal IP address ('head': 172.16.*) is
associated with eth0.
 
> Then you should try to start the pbs_server with this extension:
> 
> #> pbs_server -S wilma:15004

The 'pbsnodes -a' command gives me reasonable output:
==========
[root@wilma ~]# pbsnodes -a
head
     state = free
     np = 8
     ntype = cluster
     status = opsys=linux,uname=Linux wilma 2.6.9-55.ELlargesmp #1 SMP Wed
...
...
========== 

> Please, don't forget to restart 'maui' after you started 'pbs_server',
> sometimes this solves a strange behaviour ...

I tried to change the SERVERHOST variable in maui.cfg to 'head', but then
the maui service wouldn't start:

==========
[root@wilma ~]# service maui start
Starting maui: ERROR:    server must be started on host 'head' (currently on 'wilma.<snipped>')
                                                           [FAILED]
==========


I still get errors when I try to use 'head' in the qmgr commands:
----------------------
Qmgr: set server tcp_timeout=5
qmgr obj= svr=default: Unauthorized Request
Qmgr: set head tcp_timeout=5
qmgr: Illegal object type: head.
Qmgr: set server head tcp_timeout=5
qmgr obj=head svr=head: Unauthorized Request
---------------------
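
For reference, qmgr directives follow the form "command object-type [names]
attribute = value", where the object type is one of server, queue, or node.
That would account for the "Illegal object type: head" message, since 'head'
is a host name and not an object type. A minimal sketch of the forms
involved, reusing the attribute from this session (it does not address the
"Unauthorized Request" errors, whose cause is not established here):

```
# General form: command object-type [name] attribute = value
# The word after the command must be an object type: server, queue, or node.

# Default server:
set server tcp_timeout = 5

# Named server; the host name comes after the object type:
set server wilma tcp_timeout = 5

# Invalid: 'head' appears where the object type is expected
# set head tcp_timeout = 5
```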


But using 'wilma' seems to work:
---------------------
Qmgr: set server wilma tcp_timeout=5
Qmgr: list server
<snipped>
Server wilma
        server_state = Active
        scheduling = True
        <snipped>
        tcp_timeout = 5
        pbs_version = 2.1.2
...
...
Qmgr: create queue mtm@wilma
Qmgr: list queue
Queue mtm
        total_jobs = 0
        state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0
        mtime = Tue Aug 28 11:04:45 2007
---------------------

So I set SERVERHOST back to 'wilma' and restarted maui. It started without
errors.

But once I quit 'qmgr', the queue information is not saved. I set the
default 'batch' queue active, but my qsub-based job wouldn't run. When I went
back into qmgr, no active queues were displayed. There does not seem to be a
'save' option for 'qmgr'.
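
For comparison, a sketch of the qmgr directives usually involved in making a
queue usable: creating it as an execution queue, enabling and starting it,
and naming it as the server default. The queue name 'batch' is the one
mentioned above; the attribute values are the commonly documented ones and
are an assumption here, not a record of the commands typed in this session.

```
# Create the queue as an execution queue (assumed queue type).
create queue batch queue_type = execution
# 'enabled' lets the queue accept jobs; 'started' lets queued jobs be run.
set queue batch enabled = true
set queue batch started = true
# Send jobs submitted without an explicit queue to 'batch'.
set server default_queue = batch
# Have the server call the scheduler for queued jobs.
set server scheduling = true
```

Directives like these can be fed to qmgr on standard input or passed one at
a time with 'qmgr -c'. Running 'print server' inside qmgr lists the current
server and queue settings in the same directive form, which is one way to
check what was kept after quitting.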

-Saurabh
-- 
Saurabh Barve
sbarve at nps.edu
831-656-3396


