[torqueusers] momctl error - A correction
michael young
mhyoung at valdosta.edu
Mon Mar 5 14:15:04 MST 2007
Jerry Smith wrote:
>Michael,
>
>Do you know much about the layout of the cluster, i.e. how many compute nodes
>there are, their naming scheme, where pbs_server runs, etc.?
>Are you running Maui or Moab, or just pbs_server and pbs_sched?
>
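For the layout question, one option is to ask the server itself: `pbsnodes -a` lists every node pbs_server knows about, one block per node. A minimal sketch that pulls out the node names and counts them, assuming the usual output layout (node name at the start of a line, attributes indented beneath it). The function name list_nodes and the sample node names are hypothetical, not taken from this cluster:

```shell
#!/bin/sh
# List node names from `pbsnodes -a` style output read on stdin.
# Assumes each node block starts with the node name in column 0 and
# that attribute lines are indented; blank lines are skipped.
list_nodes() {
    awk '/^[^ \t]/ { print $1 }'
}

# Hypothetical sample; on a live cluster use: pbsnodes -a | list_nodes
sample='node01
     state = free
     np = 2

node02
     state = down
     np = 2'

# Print the node names, then how many there are.
printf '%s\n' "$sample" | list_nodes
printf '%s\n' "$sample" | list_nodes | wc -l
```

The names show the naming scheme, and the count gives the number of compute nodes the server has been told about.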
>On your master node, can you run:
>
>qmgr -c "p s"
>
>
[root@cluster torque]# qmgr -c "p s"
#
# Create queues and set their attributes.
#
#
# Create and define queue default
#
create queue default
set queue default queue_type = Execution
set queue default enabled = True
set queue default started = True
#
# Set server attributes.
#
set server scheduling = True
set server managers = root@cluster.chemistry.valdosta.edu
set server operators = root@cluster.chemistry.valdosta.edu
set server default_queue = default
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
>And
>qmgr -c "l s"
>
>
[root@cluster torque]# qmgr -c "l s"
Server cluster.chemistry.valdosta.edu
server_state = Active
scheduling = True
total_jobs = 0
state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0
managers = root@cluster.chemistry.valdosta.edu
operators = root@cluster.chemistry.valdosta.edu
default_queue = default
log_events = 511
mail_from = adm
scheduler_iteration = 600
node_check_rate = 150
tcp_timeout = 6
pbs_version = 2.0.0p4
>And copy it here. Hopefully I can help ya out.
>
>>From: michael young <mhyoung at valdosta.edu>
>>Date: Mon, 05 Mar 2007 11:26:37 -0500
>>To: Jerry Smith <jdsmit at sandia.gov>, <torqueusers at supercluster.org>
>>Subject: Re: [torqueusers] momctl error - A correction
>>
>>Jerry Smith wrote:
>>
>>>>Hi,
>>>>I messed up my last email.
>>>>I meant to write this.
>>>>
>>>>I've got TORQUE running on a Linux cluster.
>>>>On the master node I run 'momctl -d 3' and get this message:
>>>>ERROR: query[0] 'diag3' failed on localhost (errno: 0:5)
>>>>
>>>Is your master node running a pbs_mom, or just pbs_server?
>>>
>>How do I tell?
>>Someone else set up the cluster, then left.
>>I got tasked with finding out why it's not working.
>>
>>>>Any ideas on what this means?
>>>>Apologies for the mistake in my last email.
>>>>
>>>Not a problem at all.
>>>
>>>
>>>Jerry
>
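On the question above of how to tell whether the master node runs a pbs_mom: one way is to look for the daemon processes directly. A minimal sketch, assuming `pgrep` is available and the daemons run under their stock names (pbs_server, pbs_sched, pbs_mom). The function name daemon_status is only for illustration:

```shell
#!/bin/sh
# Print whether each named daemon has a running process on this host.
# With no arguments, checks the three stock TORQUE daemon names.
daemon_status() {
    if [ "$#" -eq 0 ]; then
        set -- pbs_server pbs_sched pbs_mom
    fi
    for daemon in "$@"; do
        # -x matches the process name exactly, so pbs_mom does not
        # match some longer, unrelated process name.
        if pgrep -x "$daemon" >/dev/null 2>&1; then
            echo "$daemon: running"
        else
            echo "$daemon: not running"
        fi
    done
}

daemon_status
```

Other names can be passed as arguments, for example `daemon_status maui moab` if one of those schedulers is installed under that process name (an assumption to verify locally). momctl talks to a pbs_mom, so if none is running on the master node it can be pointed at a compute node instead with `momctl -d 3 -h <hostname>`.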