[torqueusers] qsub: cannot connect to server one.server.com (errno=15007)

Engineering Dept. engcore at gmail.com
Wed Jul 27 11:17:15 MDT 2005


On 7/27/05, Prakash Velayutham <velayups at email.uc.edu> wrote:
> Engineering Dept. wrote:
> 
> >Thanks for the response.
> >
> >I have been trying to submit jobs via the server itself. However, if I
> >try to submit jobs from one of the nodes, I get the same error.
> >
> >thanks
> >
> >On 7/27/05, Prakash Velayutham <velayups at email.uc.edu> wrote:
> >
> >
> >>Engineering Dept. wrote:
> >>
> >>
> >>
> >>>Hi all,
> >>>
> >>>I'm new to this (obviously), and am having trouble submitting jobs to
> >>>the torque scheduler.
> >>>First off, I've read through the Admin Guide, User Guide, Quickstart
> >>>guide, Troubleshooting guide, this mailing list archive, and about
> >>>1000 search results from Google. So, I've done some research on this
> >>>problem and still I can't seem to get it resolved.
> >>>
> >>>Here is the error I get:
> >>>--------
> >>>user at one [~]$time strace -f -o /tmp/1.strace qsub -l nodes=1 simple-test
> >>>pbs_iff: cannot connect to one.server.com:15001 - fatal error,
> >>>errno=13 (Permission denied)
> >>>No Permission.
> >>>qsub: cannot connect to server one.server.com (errno=15007)
> >>>
> >>>real    0m0.039s
> >>>user    0m0.007s
> >>>sys     0m0.026s
> >>>--------
> >>>
> >>>As far as I can tell I'm trying to keep things as simple as possible
> >>>in order to just evaluate torque at this point.
> >>>
> >>>I have one server and a cluster of only two nodes (for now). The
> >>>Server is running the pbs_server and scheduler, and the nodes are
> >>>running pbs_mom. Everything appears to be created and configured
> >>>properly.
> >>>
> >>>Anyone have a clue as to what is causing this or where I should look?
> >>>
> >>>Everything appears to be configured correctly as far as I can
> >>>tell...any help would be appreciated.
> >>>
> >>>thanks
> >>>
> >>>
> >>>
> >>Which system are you trying to submit job from? If it is a cluster node
> >>instead of the server, you will need to add ACLs for the nodes using
> >>qmgr on the server (as root). If you are already trying to submit the
> >>job from the server side, we have to dig deeper.
> >>
> >>Prakash
> >>
> Hi,
> 
> Please do not top post. Also keep the discussion within the mailing list
> AFAP, so it may help others when they search the list.
> Could you post the output of your "qmgr -c 'p s'" here? And also the
> names of your server and cluster nodes.
> 
> Prakash
> 

Hi, sorry about the top post...hopefully this message will post at the
bottom here...

Here is the output of: qmgr -c 'p s'
#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_host_enable = True
set server acl_hosts = *.server.com
set server acl_users = *@*.server.com
set server managers = root@one.server.com
set server operators = root@one.server.com
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_ping_rate = 300
set server node_check_rate = 600
set server tcp_timeout = 6
set server job_stat_rate = 30
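
Since acl_host_enable is True, the submitting host's name has to be covered by the acl_hosts pattern. A rough way to sanity-check that pattern against the host names in this thread is sketched below. It uses plain shell glob matching, which is only an assumption about how the server compares names; host_matches is a hypothetical helper, not part of TORQUE, and "localhost" is included purely as an example of a name that would not be covered.

```shell
# Hypothetical helper (not a TORQUE command): succeed if host $1 matches glob $2.
# Shell glob semantics are assumed here; pbs_server's own matching may differ.
host_matches() {
    case "$1" in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

# Value of acl_hosts from the qmgr output.
pattern='*.server.com'

# Server and node names from this thread, plus one example of an unqualified name.
for host in one.server.com node1.server.com node2.server.com localhost; do
    if host_matches "$host" "$pattern"; then
        echo "$host matches $pattern"
    else
        echo "$host does NOT match $pattern"
    fi
done
```

If a machine identifies itself by a short or unexpected name, it would fall outside this pattern. Individual hosts can be appended on the server with qmgr's `+=` operator, for example `qmgr -c 'set server acl_hosts += node1.server.com'`.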

Also here is the server name: one.server.com
Node names: node1.server.com, node2.server.com
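
One more thing that may be worth checking, offered as a sketch and not as a confirmed diagnosis: the "Permission denied" comes from pbs_iff, which is normally installed setuid root so that it can authenticate to pbs_server. The snippet below only tests whether the setuid bit is present. It looks up pbs_iff on PATH, which is an assumption since the install location varies by build, and has_setuid is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the file exists and has the setuid bit set.
has_setuid() {
    [ -u "$1" ]
}

# Look up pbs_iff on PATH (assumption: it may live elsewhere on some installs).
iff_path=$(command -v pbs_iff || true)

if [ -z "$iff_path" ]; then
    echo "pbs_iff not found on PATH"
elif has_setuid "$iff_path"; then
    echo "$iff_path has the setuid bit set"
else
    echo "$iff_path is NOT setuid"
fi
```

Also note that a setuid program started under strace by a non-root user runs without its setuid privilege, so it may be worth re-running the qsub command without strace to see whether the error changes.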

thanks

