[torqueusers] momctl error - A correction

michael young mhyoung at valdosta.edu
Mon Mar 5 14:15:04 MST 2007


Jerry Smith wrote:

>Michael,
>
>Do you know much about the layout of the cluster, i.e. how many compute
>nodes there are, what their naming scheme is, where pbs_server runs, etc.?
>Are you running Maui or Moab, or just pbs_server with pbs_sched?
>
>On your master node, can you run:
>
>qmgr -c "p s"
>
[root@cluster torque]# qmgr -c "p s"
#
# Create queues and set their attributes.
#
#
# Create and define queue default
#
create queue default
set queue default queue_type = Execution
set queue default enabled = True
set queue default started = True
#
# Set server attributes.
#
set server scheduling = True
set server managers = root@cluster.chemistry.valdosta.edu
set server operators = root@cluster.chemistry.valdosta.edu
set server default_queue = default
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6

>And
>qmgr -c "l s"
>
[root@cluster torque]# qmgr -c "l s"
Server cluster.chemistry.valdosta.edu
        server_state = Active
        scheduling = True
        total_jobs = 0
        state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0
        managers = root@cluster.chemistry.valdosta.edu
        operators = root@cluster.chemistry.valdosta.edu
        default_queue = default
        log_events = 511
        mail_from = adm
        scheduler_iteration = 600
        node_check_rate = 150
        tcp_timeout = 6
        pbs_version = 2.0.0p4
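
The qmgr output above covers only the server and queue settings, not the
compute nodes Jerry asked about. A minimal sketch for listing them, assuming
pbsnodes is on the PATH of the master node and prints each node name at the
start of a line with its attributes indented below it (the helper name
list_nodes is made up for illustration):

```shell
#!/bin/sh
# list_nodes (hypothetical helper): read `pbsnodes -a` style output on
# stdin and print one node name per line. Assumes node names start in
# column one and attribute lines are indented.
list_nodes() {
    awk '/^[^[:space:]]/ { print $1 }'
}

# Only query the server if pbsnodes is actually installed on this host.
if command -v pbsnodes >/dev/null 2>&1; then
    pbsnodes -a | list_nodes
else
    echo "pbsnodes not found in PATH" >&2
fi
```

An empty list would suggest that the server has no compute nodes defined.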

>And copy it here.  Hopefully I can help ya out.
>
>>From: michael young <mhyoung at valdosta.edu>
>>Date: Mon, 05 Mar 2007 11:26:37 -0500
>>To: Jerry Smith <jdsmit at sandia.gov>, <torqueusers at supercluster.org>
>>Subject: Re: [torqueusers] momctl error - A correction
>>
>>Jerry Smith wrote:
>>
>>>>Hi,
>>>>I messed up my last email.
>>>>I meant to write this.
>>>>
>>>>I've got torque running on a linux cluster.
>>>>On the master node I run this command 'momctl -d 3'.
>>>>I get this message.
>>>>ERROR:    query[0] 'diag3' failed on localhost (errno: 0:5)
>>>>
>>>Is your master node running a pbs_mom, or just pbs_server?
>>
>>How do I tell?
>>Someone else set up the cluster, then left.
>>I got tasked with finding out why it's not working.
>>
>>>>Any ideas on what this means?
>>>>Apologies for the mistake in my last email.
>>>>
>>>Not a problem at all.
>>>
>>>
>>>Jerry
>
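
On the quoted "How do I tell?" question: momctl talks to a pbs_mom daemon,
so one way to check is to see which TORQUE-related processes are running on
the master node. A minimal sketch, assuming a POSIX ps and that the daemons
run under their usual names (pbs_server, pbs_mom, pbs_sched, maui, moab);
the helper name report_daemons is made up for illustration:

```shell
#!/bin/sh
# report_daemons (hypothetical helper): read process names from stdin,
# one per line, and report which of the expected daemons are present.
report_daemons() {
    names=$(cat)
    for d in pbs_server pbs_mom pbs_sched maui moab; do
        # -x matches the whole line, so "pbs_mom" does not match "pbs_mom2".
        if printf '%s\n' "$names" | grep -qx "$d"; then
            echo "$d: running"
        else
            echo "$d: not running"
        fi
    done
}

# List the command name of every process on this host and check it.
ps -e -o comm= | report_daemons
```

If pbs_mom shows as not running on the master, that would likely explain why
'momctl -d 3' fails against localhost there.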

