[torqueusers] qsub: cannot connect to server one.server.com (errno=15007)

Prakash Velayutham velayups at email.uc.edu
Wed Jul 27 13:30:57 MDT 2005


Engineering Dept. wrote:

>On 7/27/05, Prakash Velayutham <velayups at email.uc.edu> wrote:
>  
>
>>Engineering Dept. wrote:
>>
>>    
>>
>>>Thanks for the response.
>>>
>>>I have been trying to submit jobs via the server itself. However, if I
>>>try to submit jobs from one of the nodes, I get the same error.
>>>
>>>thanks
>>>
>>>On 7/27/05, Prakash Velayutham <velayups at email.uc.edu> wrote:
>>>
>>>
>>>      
>>>
>>>>Engineering Dept. wrote:
>>>>
>>>>
>>>>
>>>>        
>>>>
>>>>>Hi all,
>>>>>
>>>>>I'm new to this (obviously), and am having trouble submitting jobs to
>>>>>the torque scheduler.
>>>>>First off, I've read through the Admin Guide, User Guide, Quickstart
>>>>>guide, Troubleshooting guide, this mailing list archive, and about
>>>>>1000 search results from Google. So, I've done some research on this
>>>>>problem and still I can't seem to get it resolved.
>>>>>
>>>>>Here is the error I get:
>>>>>--------
>>>>>user@one [~]$ time strace -f -o /tmp/1.strace qsub -l nodes=1 simple-test
>>>>>pbs_iff: cannot connect to one.server.com:15001 - fatal error,
>>>>>errno=13 (Permission denied)
>>>>>No Permission.
>>>>>qsub: cannot connect to server one.server.com (errno=15007)
>>>>>
>>>>>real    0m0.039s
>>>>>user    0m0.007s
>>>>>sys     0m0.026s
>>>>>--------
>>>>>
>>>>>As far as I can tell I'm trying to keep things as simple as possible
>>>>>in order to just evaluate torque at this point.
>>>>>
>>>>>I have one server and a cluster of only two nodes (for now). The
>>>>>Server is running the pbs_server and scheduler, and the nodes are
>>>>>running pbs_mom. Everything appears to be created and configured
>>>>>properly.
>>>>>
>>>>>Anyone have a clue as to what is causing this or where I should look?
>>>>>
>>>>>Everything appears to be configured correctly as far as I can
>>>>>tell...any help would be appreciated.
>>>>>
>>>>>thanks
>>>>>
>>>>>
>>>>>
>>>>>          
>>>>>
>>>>Which system are you trying to submit the job from? If it is a cluster
>>>>node rather than the server, you will need to add ACLs for the nodes
>>>>using qmgr on the server (as root). If you are already submitting the
>>>>job from the server itself, we will have to dig deeper.
>>>>
>>>>Prakash
>>>>
>>>>        
>>>>
>>Hi,
>>
>>Please do not top-post. Also, keep the discussion on the mailing list
>>as far as possible, so that it may help others when they search the list.
>>Could you post the output of "qmgr -c 'p s'" here, along with the
>>names of your server and cluster nodes?
>>
>>Prakash
>>_______________________________________________
>>torqueusers mailing list
>>torqueusers at supercluster.org
>>http://www.supercluster.org/mailman/listinfo/torqueusers
>>
>>    
>>
>
>Hi, sorry about the top post...hopefully this message will post at the
>bottom here...
>
>Here is the output of: qmgr -c 'p s'
>#
># Create queues and set their attributes.
>#
>#
># Create and define queue batch
>#
>create queue batch
>set queue batch queue_type = Execution
>set queue batch resources_default.nodes = 1
>set queue batch resources_default.walltime = 01:00:00
>set queue batch enabled = True
>set queue batch started = True
>#
># Set server attributes.
>#
>set server scheduling = True
>set server acl_host_enable = True
>set server acl_hosts = *.server.com
>set server acl_users = *@*.server.com
>set server managers = root@one.server.com
>set server operators = root@one.server.com
>set server default_queue = batch
>set server log_events = 511
>set server mail_from = adm
>set server scheduler_iteration = 600
>set server node_ping_rate = 300
>set server node_check_rate = 600
>set server tcp_timeout = 6
>set server job_stat_rate = 30
>
>Also here is the server name: one.server.com
>Node names: node1.server.com, node2.server.com
>
>thanks
>
1) See whether qmgr can be run as that user on the server without errors.
2) Could you try making the following changes in qmgr and see whether qstat
works afterwards? (Please note the - signs in the directives carefully.)
set server acl_user_enable=true
set server acl_hosts -= *.server.com
set server acl_hosts=one.server.com
set server acl_users -= *@*.server.com
set server acl_users = <specific_username>@one.server.com
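
As a rough sketch, the directives above could also be applied non-interactively with qmgr's -c option, run as root on the server. The helper names run_qmgr and apply_acl_changes and the DRY_RUN variable are made up for illustration and are not part of TORQUE. The script assumes the wildcard patterns to remove are built from the server's domain, as in the configuration posted above. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: print (or run) the qmgr directives listed above.
# DRY_RUN is a hypothetical switch; it defaults to 1, so nothing is
# changed unless you set DRY_RUN=0 and run this as root on the server.

# run_qmgr DIRECTIVE  (hypothetical helper)
run_qmgr() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Show the command instead of executing it.
        printf "qmgr -c '%s'\n" "$1"
    else
        qmgr -c "$1"
    fi
}

# apply_acl_changes USER HOST  (hypothetical helper)
apply_acl_changes() {
    user="$1"
    host="$2"
    # Assumption: the old wildcard entries use the host's domain,
    # e.g. one.server.com -> server.com
    domain="${host#*.}"
    run_qmgr "set server acl_user_enable = true"
    run_qmgr "set server acl_hosts -= *.${domain}"
    run_qmgr "set server acl_hosts = ${host}"
    run_qmgr "set server acl_users -= *@*.${domain}"
    run_qmgr "set server acl_users = ${user}@${host}"
}
```

For example, "apply_acl_changes alice one.server.com" (alice being a placeholder username) prints the five qmgr commands for review; setting DRY_RUN=0 would execute them instead.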

Prakash

