[torqueusers] Problems with qmgr

Thomas Dargel td at chemie.hu-berlin.de
Wed Aug 29 07:06:50 MDT 2007


What is the output of

   hostname

and

   hostname -f

Do you have entries for the IP addresses and hostnames of 'head' and 'wilma' in
/etc/hosts?
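
To make these checks repeatable, here is a rough shell sketch. It assumes a Linux system where getent is available; the function names in_hosts_file and report_names are made up for illustration and are not part of TORQUE.

```shell
# in_hosts_file NAME [FILE]
# Succeeds if NAME appears as a hostname or alias in a hosts(5)-style file.
# FILE defaults to /etc/hosts. (Hypothetical helper, not a TORQUE tool.)
in_hosts_file() {
    awk -v n="$1" '
        { sub(/#.*/, "") }    # drop comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/hosts}"
}

# report_names NAME...
# Prints the local short name and FQDN, then checks each NAME.
report_names() {
    echo "short name: $(hostname)"
    echo "FQDN:       $(hostname -f 2>/dev/null || echo '(not resolvable)')"
    for host in "$@"; do
        if in_hosts_file "$host"; then
            echo "$host: listed in /etc/hosts"
        else
            echo "$host: NOT listed in /etc/hosts"
        fi
        # getent follows the system resolver order (files, DNS, ...)
        getent hosts "$host" || echo "$host: does not resolve"
    done
}
```

Calling `report_names head wilma` on the server should show whether both names are known locally and which address each one resolves to.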

Greets,

Thomas.

Saurabh Barve wrote:
>>> Due to a dual-NIC setup, there is a conflict between the hostname on the
>>> internal network (head: 172.16.100.1) and the one on the external network
>>> (wilma: 172.20.*.*). As a result the 'qmgr' command returns error messages
>>> for my commands.
>>>
>> Is 'wilma' (172.20.*.*) associated with the first network-device (eth0) and
>> 'head' (172.16.100.1) with the second (e.g. eth1)?
> 
> 
> No. It is the other way round. The external IP address ('wilma':172.20.*) is
> associated with eth1 and the internal IP address ('head': 172.16.*) is
> associated with eth0.
>  
>> Then you should try to start the pbs_server with this option:
>>
>> #> pbs_server -S wilma:15004
> The 'pbsnodes -a' command gives me reasonable output:
> ==========
> [root@wilma ~]# pbsnodes -a
> head
>      state = free
>      np = 8
>      ntype = cluster
>      status = opsys=linux,uname=Linux wilma 2.6.9-55.ELlargesmp #1 SMP Wed
> ...
> ...
> ========== 
> 
>> Please don't forget to restart 'maui' after you have started 'pbs_server';
>> sometimes this resolves strange behaviour ...
> 
> I tried to change the SERVERHOST variable in maui.cfg to 'head', but then
> the maui service wouldn't start:
> 
> ==========
> [root@wilma ~]# service maui start
> Starting maui: ERROR:    server must be started on host 'head' (currently on 'wilma.<snipped>')
>                                                            [FAILED]
> ==========
> 
> 
> I still get errors when I try to use 'head' in the qmgr commands:
> ----------------------
> Qmgr: set server tcp_timeout=5
> qmgr obj= svr=default: Unauthorized Request
> Qmgr: set head tcp_timeout=5
> qmgr: Illegal object type: head.
> Qmgr: set server head tcp_timeout=5
> qmgr obj=head svr=head: Unauthorized Request
> ---------------------
> 
> 
> But using 'wilma' seems to work:
> ---------------------
> Qmgr: set server wilma tcp_timeout=5
> Qmgr: list server
> <snipped>
> Server wilma
>         server_state = Active
>         scheduling = True
>         <snipped>
>         tcp_timeout = 5
>         pbs_version = 2.1.2
> ...
> ...
> Qmgr: create queue mtm@wilma
> Qmgr: list queue
> Queue mtm
>         total_jobs = 0
>         state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0
>         mtime = Tue Aug 28 11:04:45 2007
> ---------------------
> 
> So I reset SERVERHOST to 'wilma' and restarted maui. It started without
> errors.
> 
> But once I quit 'qmgr', the queue information is not saved. I activated the
> default 'batch' queue, but my qsub-based job wouldn't run. When I went back
> into qmgr, no active queues were displayed. There does not seem to be a
> 'save' option for 'qmgr'.
> 
> -Saurabh
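
On the queue problem: as far as I know, qmgr has no separate 'save' step, because pbs_server stores each accepted change itself. A queue does, however, normally need a queue_type and must be enabled and started before jobs will run in it, and the listing above shows none of these for 'mtm'. A minimal sketch, assuming the server answers as 'wilma' and the queue is called 'batch'; the helper name queue_directives is hypothetical:

```shell
# Print the qmgr directives that set up a runnable execution queue.
# $1 is the queue name; nothing is sent to the server here.
queue_directives() {
    cat <<EOF
create queue $1 queue_type=execution
set queue $1 enabled=true
set queue $1 started=true
set server default_queue=$1
EOF
}

# Example (commented out so the sketch has no side effects):
#   queue_directives batch | qmgr wilma
#   qmgr -c "print server" wilma    # show what the server actually stored
```

If the queue still disappears after that, comparing the 'print server' output before and after restarting pbs_server should show whether the settings ever reached the server.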

