[torqueusers] qsub: Job rejected by all possible destinations

Steve Young chemadm at hamilton.edu
Tue Jan 20 08:27:21 MST 2009


Hi,
	If it were me and it reported a problem with my queue min nodes
requirement, I would probably remove that setting to see if the job
then went through. Often the best way to troubleshoot these things is
to start with the most basic queue settings and test. If that works,
add another setting and test again, until you finally have all the
settings you want. I don't use max and min settings on the queues here,
so I can't really say for sure what the problem is. I'd also submit to
each individual queue in order to test it. If a queue doesn't work the
way you expect on its own, putting the routing queue in front of it
isn't going to make it work any better =). Make sure each one works,
and once you are happy with that, test the routing queue. It can be
very difficult to configure all sorts of settings up front and hope
that everything works. Start simple, then add more configuration after
you've tested. I'm sorry this doesn't answer your question directly,
but I hope it gives you some ideas on how to solve it.
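
A rough sketch of that per-queue test, in case it helps. It is a dry run
by default and only prints the qsub commands. The queue names come from
the qmgr output quoted below; the node counts, the 5-minute walltime, the
"sleep 10" job, and the helper names (run, submit_test, DRY_RUN) are all
assumptions made up for illustration, picked so each request should fall
inside that queue's min/max limits.

```shell
#!/bin/sh
# Dry-run sketch: one trivial test job per execution queue, then one
# through the routing queue. Nothing is submitted unless DRY_RUN=0.
# Helper names and resource values are hypothetical, not from the thread.

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$1"
    else
        sh -c "$1"
    fi
}

# submit_test QUEUE NODES: send a trivial job straight to one queue.
# qsub reads the job script from stdin when no script file is given.
submit_test() {
    run "echo 'sleep 10' | qsub -q $1 -l nodes=$2:ppn=2,walltime=00:05:00"
}

# Step 1: test each execution queue directly.
submit_test tiny 1
submit_test verysmall 2
submit_test small 4
submit_test medium 8
submit_test huge 16

# Step 2: only once every direct submission behaves as expected, submit
# without -q so the job enters the server's default (routing) queue.
run "echo 'sleep 10' | qsub -l nodes=16:ppn=2,walltime=00:05:00"
```

Running it as-is just lists the commands; setting DRY_RUN=0 would
actually submit them, so check the list first. If a direct submission
to one queue is rejected, that queue is the one to strip down and
rebuild one setting at a time.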

-Steve


On Jan 19, 2009, at 8:01 PM, Weiguang Chen wrote:

> Hi,
> Thank you for your fast reply.
> In fact, the line "###PBS -q huge" was not in my submission script
> at first. I expected the job to be routed automatically from the
> route queue (default, which is the server's default queue) to a
> suitable execution queue (such as huge). When the submission failed,
> I added that line to test whether the job could be sent directly to
> the huge queue, but the following message appeared:
>
> qsub: Job exceeds queue resource limits MSG=cannot satisfy queue min nodes requirement
>
> So I commented it out again.
> That message confused me, because I requested 16 nodes, and for the
> huge queue the min nodes is set to 8 (in fact, it should be 9: every
> node has 2 CPUs, and I set the min ncpus to 17) and the max nodes is
> set to 16 (the max ncpus is 32). I thought the huge queue should be
> suitable for my job.
> I should explain that the queues are all mutually exclusive, except
> for some special queues. Below is the full configuration as printed
> by 'qmgr -c "p s"' (I removed some of the acl_users information). I
> hope it helps in understanding my problem.
>
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue huge
> #
> create queue huge
> set queue huge queue_type = Execution
> set queue huge Priority = 40
> set queue huge max_queuable = 2
> set queue huge max_user_queuable = 1
> set queue huge max_running = 1
> set queue huge acl_user_enable = True
> set queue huge acl_users = xxx@node1
> set queue huge resources_max.ncpus = 32
> set queue huge resources_max.nodect = 16
> set queue huge resources_max.nodes = 16
> set queue huge resources_max.walltime = 160:00:00
> set queue huge resources_min.ncpus = 17
> set queue huge resources_min.nodect = 8
> set queue huge resources_min.nodes = 8
> set queue huge resources_min.walltime = 00:00:01
> set queue huge resources_default.walltime = 36:00:00
> set queue huge max_user_run = 1
> set queue huge enabled = True
> set queue huge started = True
> #
> # Create and define queue default
> #
> create queue default
> set queue default queue_type = Route
> set queue default max_running = 15
> set queue default route_destinations = tiny
> set queue default route_destinations += verysmall
> set queue default route_destinations += small
> set queue default route_destinations += medium
> set queue default route_destinations += huge
> set queue default route_destinations += train
> set queue default route_destinations += special
> set queue default enabled = True
> set queue default started = True
> #
> # Create and define queue verysmall
> #
> create queue verysmall
> set queue verysmall queue_type = Execution
> set queue verysmall Priority = 120
> set queue verysmall max_queuable = 9
> set queue verysmall max_user_queuable = 3
> set queue verysmall max_running = 7
> set queue verysmall acl_user_enable = True
> set queue verysmall acl_users = xxx@node1
> set queue verysmall resources_max.ncpus = 4
> set queue verysmall resources_max.nodect = 2
> set queue verysmall resources_max.nodes = 2
> set queue verysmall resources_max.walltime = 36:00:00
> set queue verysmall resources_min.ncpus = 3
> set queue verysmall resources_min.nodect = 2
> set queue verysmall resources_min.nodes = 2
> set queue verysmall resources_min.walltime = 00:00:01
> set queue verysmall resources_default.walltime = 24:00:00
> set queue verysmall max_user_run = 2
> set queue verysmall enabled = True
> set queue verysmall started = True
> #
> # Create and define queue tiny
> #
> create queue tiny
> set queue tiny queue_type = Execution
> set queue tiny Priority = 140
> set queue tiny max_queuable = 13
> set queue tiny max_user_queuable = 3
> set queue tiny max_running = 10
> set queue tiny acl_user_enable = True
> set queue tiny acl_users = xxx@node1
> set queue tiny resources_max.ncpus = 2
> set queue tiny resources_max.nodect = 1
> set queue tiny resources_max.nodes = 1
> set queue tiny resources_max.walltime = 36:00:00
> set queue tiny resources_min.ncpus = 1
> set queue tiny resources_min.nodect = 1
> set queue tiny resources_min.nodes = 1
> set queue tiny resources_min.walltime = 00:00:01
> set queue tiny resources_default.walltime = 24:00:00
> set queue tiny max_user_run = 2
> set queue tiny enabled = True
> set queue tiny started = True
> #
> # Create and define queue medium
> #
> create queue medium
> set queue medium queue_type = Execution
> set queue medium Priority = 80
> set queue medium max_queuable = 5
> set queue medium max_user_queuable = 2
> set queue medium max_running = 3
> set queue medium acl_user_enable = True
> set queue medium acl_users = xxx@node1
> set queue medium resources_max.ncpus = 16
> set queue medium resources_max.nodect = 8
> set queue medium resources_max.nodes = 8
> set queue medium resources_max.walltime = 168:00:00
> set queue medium resources_min.ncpus = 9
> set queue medium resources_min.nodect = 5
> set queue medium resources_min.nodes = 5
> set queue medium resources_min.walltime = 00:00:01
> set queue medium resources_default.walltime = 24:00:00
> set queue medium max_user_run = 1
> set queue medium enabled = True
> set queue medium started = True
> #
> # Create and define queue train
> #
> create queue train
> set queue train queue_type = Execution
> set queue train Priority = 160
> set queue train max_queuable = 3
> set queue train max_user_queuable = 3
> set queue train max_running = 2
> set queue train acl_user_enable = True
> set queue train acl_users = phy01@node1
> set queue train resources_max.ncpus = 2
> set queue train resources_max.nodect = 1
> set queue train resources_max.nodes = 1
> set queue train resources_max.walltime = 36:00:00
> set queue train resources_min.ncpus = 1
> set queue train resources_min.nodect = 1
> set queue train resources_min.nodes = 1
> set queue train resources_min.walltime = 00:00:01
> set queue train resources_default.walltime = 24:00:00
> set queue train max_user_run = 2
> set queue train enabled = True
> set queue train started = True
> #
> # Create and define queue small
> #
> create queue small
> set queue small queue_type = Execution
> set queue small Priority = 100
> set queue small max_queuable = 7
> set queue small max_user_queuable = 3
> set queue small max_running = 5
> set queue small acl_user_enable = True
> set queue small acl_users = xxx@node1
> set queue small resources_max.ncpus = 8
> set queue small resources_max.nodect = 4
> set queue small resources_max.nodes = 4
> set queue small resources_max.walltime = 36:00:00
> set queue small resources_min.ncpus = 5
> set queue small resources_min.nodect = 3
> set queue small resources_min.nodes = 3
> set queue small resources_min.walltime = 00:00:01
> set queue small resources_default.walltime = 24:00:00
> set queue small max_user_run = 2
> set queue small enabled = True
> set queue small started = True
> #
> # Create and define queue special
> #
> create queue special
> set queue special queue_type = Execution
> set queue special Priority = 130
> set queue special max_queuable = 3
> set queue special max_user_queuable = 3
> set queue special max_running = 2
> set queue special acl_user_enable = True
> set queue special acl_users = qxli@node1
> set queue special resources_max.ncpus = 2
> set queue special resources_max.nodect = 1
> set queue special resources_max.nodes = 1
> set queue special resources_max.walltime = 96:00:00
> set queue special resources_min.ncpus = 1
> set queue special resources_min.nodect = 1
> set queue special resources_min.nodes = 1
> set queue special resources_min.walltime = 00:00:01
> set queue special resources_default.walltime = 48:00:00
> set queue special max_user_run = 2
> set queue special enabled = True
> set queue special started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server max_user_run = 10
> set server acl_hosts = node1
> set server default_queue = default
> set server log_events = 511
> set server mail_from = adm
> set server query_other_jobs = True
> set server scheduler_iteration = 600
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server next_job_number = 2423
>
> Thank you
> Sincerely
>
> Chen Weiguang
>
> On Mon, Jan 19, 2009 at 8:32 PM, Steve Young <chemadm at hamilton.edu>  
> wrote:
>> Hi,
>>       OK, I understand better now: you are using a routing queue =).
>> In your first e-mail, did you un-comment the "###PBS -q huge" line
>> and see if that worked? Since it is commented out, you're going to
>> the "default" routing queue, because no queue is specified. For some
>> reason, it thinks there isn't any place to route the job to. So I'd
>> make sure the huge queue works like you expect first, then try the
>> routing queue. Hope this helps,
>>
>> -Steve
>>
>> On Jan 19, 2009, at 12:12 PM, Weiguang Chen wrote:
>>
>>> Hi,
>>> Thank you very much for your reply.
>>> What confuses me is that the settings for huge are basically
>>> similar to those of the other queues, such as the one below:
>>>
>>> set queue default route_destinations += medium
>>> # Create and define queue medium
>>> create queue medium
>>> set queue medium queue_type = Execution
>>> set queue medium Priority = 80
>>> set queue medium max_queuable = 5
>>> set queue medium max_user_queuable = 2
>>> set queue medium max_running = 3
>>> set queue medium acl_user_enable = True
>>> set queue medium acl_users = xxx@node1
>>> set queue medium resources_max.ncpus = 16
>>> set queue medium resources_max.nodect = 8
>>> set queue medium resources_max.nodes = 8
>>> set queue medium resources_max.walltime = 168:00:00
>>> set queue medium resources_min.ncpus = 9
>>> set queue medium resources_min.nodect = 5
>>> set queue medium resources_min.nodes = 5
>>> set queue medium resources_min.walltime = 00:00:01
>>> set queue medium resources_default.walltime = 24:00:00
>>> set queue medium max_user_run = 1
>>> set queue medium enabled = True
>>> set queue medium started = True
>>>
>>> But this queue works well. The other settings are there to route
>>> different kinds of jobs to the appropriate queue.
>>> Judging by the submission script, I thought the job conformed to
>>> the policy of the huge queue.
>>> Now the job can be submitted to the default queue, but it cannot be
>>> routed to the huge queue. Below are the settings for the default
>>> queue (if the user does not specify a queue, jobs go to the default
>>> queue):
>>> create queue default
>>> set queue default queue_type = Route
>>> set queue default max_running = 15
>>> set queue default route_destinations = tiny
>>> set queue default route_destinations += verysmall
>>> set queue default route_destinations += small
>>> set queue default route_destinations += medium
>>> set queue default route_destinations += huge
>>> set queue default route_destinations += train
>>> set queue default route_destinations += special
>>> set queue default enabled = True
>>> set queue default started = True
>>> set server default_queue = default
>>>
>>> Happy Spring Festival (Chinese New Year, Year of the Ox)
>>>
>>> ChenWeiguang
>>>
>>> On Mon, Jan 19, 2009 at 6:14 PM, Steve Young  
>>> <chemadm at hamilton.edu> wrote:
>>>>
>>>> Hi,
>>>>     I'm guessing that this line is messing you up:
>>>>
>>>>> set queue default route_destinations += huge
>>>>
>>>> The queue you have defined, "huge", is not a routing queue; it is an
>>>> execution queue. I'd remove that. I might also remove a bunch of the
>>>> other settings you have, to start out with the basics, then add the
>>>> ones you want back one at a time so you can test that each works.
>>>> Hope this helps,
>>>>
>>>> -Steve
>>>>
>>>>
>>>>
>>>> On Jan 17, 2009, at 10:11 AM, Weiguang Chen wrote:
>>>>
>>>>> Hi,
>>>>> I noticed this question has been asked before, at
>>>>>
>>>>>
>>>>> http://www.clusterresources.com/pipermail/torqueusers/2008-January/006698.html
>>>>> But my problem is different from that one. I want to submit a huge
>>>>> job:
>>>>> #!/bin/bash
>>>>> #PBS -N N-top
>>>>> ###PBS -q huge
>>>>> #PBS -o N-top.out
>>>>> #PBS -e N-top.err
>>>>> #PBS -l nodes=16:ppn=2,walltime=160:00:00
>>>>>
>>>>> and the queue huge is set by following:
>>>>> # Create and define queue huge
>>>>> create queue huge
>>>>> set queue huge queue_type = Execution
>>>>> set queue huge Priority = 40
>>>>> set queue huge max_queuable = 2
>>>>> set queue huge max_user_queuable = 1
>>>>> set queue huge max_running = 1
>>>>> set queue huge acl_user_enable = True
>>>>> set queue huge acl_users = xxx@node1
>>>>> set queue huge resources_max.ncpus = 32
>>>>> set queue huge resources_max.nodect = 16
>>>>> set queue huge resources_max.nodes = 16
>>>>> set queue huge resources_max.walltime = 160:00:00
>>>>> set queue huge resources_min.ncpus = 17
>>>>> set queue huge resources_min.nodect = 8
>>>>> set queue huge resources_min.nodes = 8
>>>>> set queue huge resources_min.walltime = 00:00:01
>>>>> set queue huge resources_default.walltime = 36:00:00
>>>>> set queue huge max_user_run = 1
>>>>> set queue huge enabled = True
>>>>> set queue huge started = True
>>>>> set queue default route_destinations += huge
>>>>>
>>>>> The message in the subject line appeared when I submitted it. I
>>>>> checked the log:
>>>>> 01/17/2009 22:40:39;0100;PBS_Server;Job;2389.node1;enqueuing into default, state 1 hop 1
>>>>> 01/17/2009 22:40:39;0008;PBS_Server;Job;2389.node1;Job rejected by all possible destinations
>>>>> 01/17/2009 22:40:39;0100;PBS_Server;Job;2389.node1;dequeuing from default, state QUEUED
>>>>> 01/17/2009 22:40:39;0080;PBS_Server;Req;req_reject;Reject reply code=15039(Job rejected by all possible destinations), aux=0, type=Commit, from xxx@node1
>>>>> 01/17/2009 22:40:39;0040;PBS_Server;Svr;node1;Scheduler sent command term
>>>>>
>>>>> This confuses me very much.
>>>>> --
>>>>> Best Wishes
>>>>> ChenWeiguang
>>>>>
>>>>> ************************************************
>>>>> #               Chen, Weiguang
>>>>> #
>>>>> #    Postgraduate,  Ph. D
>>>>> #  75 University Road, Physics Building  #  218
>>>>> #  School of Physics & Engineering
>>>>> #  Zhengzhou University
>>>>> #  Zhengzhou, Henan 450052  CHINA
>>>>> #
>>>>> #  Tel: 86-13203730117;
>>>>> #  E-mail:chenweiguang82 at gmail.com;
>>>>> #            chenweiguang82 at qq.com
>>>>> #**********************************************
>>>>> _______________________________________________
>>>>> torqueusers mailing list
>>>>> torqueusers at supercluster.org
>>>>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
>


