[Mauiusers] [torqueusers] Jobs going into incorrect queue

Steve Young chemadm at hamilton.edu
Wed Apr 22 10:49:39 MDT 2009


Hi Philip,
	Ah, I see... yeah, at first glance it looks like it *should* work =). I'm
using routing queues, but they aren't based on walltime, so I'm not sure
I have any good suggestions. The routing queues I have set up work as
expected. What happens when you try submitting a job directly to each of
the execution queues? I'd think you should get rejected on short_2h.
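
For what it's worth, here is a rough Python model of what I'd expect the
route queue to do with the config quoted below. It is only a sketch, not
TORQUE code: the accepts() and route() helpers are made up for
illustration, the ACLs are ignored (the user is assumed to be in nmrc),
and the rule that a job carrying no walltime request is not checked
against resources_max.walltime is an assumption on my part.

```python
# Rough model of first-match routing; NOT TORQUE code.
# Limits are resources_max.walltime in hours, copied from the quoted
# "print server" output. None means the queue sets no walltime limit.
ROUTE_DESTINATIONS = [
    ("short_2h", 2),
    ("med_12h", 12),
    ("long_1w", 168),
    ("unspec", None),
    ("guest", None),
]


def accepts(max_hours, walltime_hours):
    """Return True if a queue with this limit would take the job.

    Assumption: a job with no walltime request at all
    (walltime_hours is None) is not checked against the limit.
    """
    if max_hours is None or walltime_hours is None:
        return True
    return walltime_hours <= max_hours


def route(walltime_hours):
    """Return the first destination, in route order, that accepts the job."""
    for name, max_hours in ROUTE_DESTINATIONS:
        if accepts(max_hours, walltime_hours):
            return name
    return None


if __name__ == "__main__":
    for hours in (1, 12, 100, 336, None):
        print(hours, "->", route(hours))
```

In this model a 336-hour job is turned away by short_2h, med_12h and
long_1w and lands in unspec, which is the behaviour Philip expects. The
only input that ends up in short_2h is a job with no walltime attached
when the route queue looks at it. Whether that is what actually happens
on his server is something I have not confirmed.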

My point before was to understand why you'd want to let them default
to a large amount of time instead of making it smaller, so the job
finishes quickly and they figure out they need to put in a proper
walltime. If I queue up something that takes a month to run but forget
to put in a walltime, I wouldn't know for two weeks. Then, when it was
killed off by the system, I'd have to start again with the proper
walltime, thus taking a month to get back to where I was when it ended
prematurely. Anyhow, hope this helps.
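
To put rough numbers on that trade-off, here is a small Python sketch.
The hours_lost() helper is hypothetical, "a month" is taken as 30 days,
and the job is assumed to have no checkpointing, so everything done
before the kill is thrown away.

```python
def hours_lost(true_runtime_hours, default_walltime_hours):
    """Hours of compute thrown away if the job is killed at the default walltime.

    Assumes the job cannot checkpoint, so all work before the kill is lost.
    """
    if true_runtime_hours <= default_walltime_hours:
        return 0  # the job finishes before the limit, so nothing is lost
    return default_walltime_hours


MONTH_HOURS = 30 * 24  # 720 hours, a stand-in for "a month"

if __name__ == "__main__":
    for default in (24, 336):
        lost = hours_lost(MONTH_HOURS, default)
        print(f"default {default}h: {lost}h lost before resubmitting")
```

With a 24-hour default the forgotten walltime costs one day; with a
336-hour default it costs the full two weeks.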

-Steve


On Apr 22, 2009, at 9:16 AM, Philip Peartree wrote:

> Steve, you seem to have misunderstood. I have a default walltime
> set at 2 weeks (336 hours), and therefore the job should go into the
> unspec queue, but instead it is going to the short_2h queue, where it
> shouldn't be able to run (since the queue's max walltime is 2h).
>
> I have included the full output of print server:
>
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue short_2h
> #
> create queue short_2h
> set queue short_2h queue_type = Execution
> set queue short_2h Priority = 50
> set queue short_2h resources_max.walltime = 02:00:00
> set queue short_2h acl_group_enable = True
> set queue short_2h acl_groups = nmrc
> set queue short_2h enabled = True
> set queue short_2h started = True
> #
> # Create and define queue guest
> #
> create queue guest
> set queue guest queue_type = Execution
> set queue guest Priority = 10
> set queue guest enabled = True
> set queue guest started = True
> #
> # Create and define queue long_1w
> #
> create queue long_1w
> set queue long_1w queue_type = Execution
> set queue long_1w Priority = 30
> set queue long_1w resources_max.walltime = 168:00:00
> set queue long_1w acl_group_enable = True
> set queue long_1w acl_groups = nmrc
> set queue long_1w enabled = True
> set queue long_1w started = True
> #
> # Create and define queue med_12h
> #
> create queue med_12h
> set queue med_12h queue_type = Execution
> set queue med_12h Priority = 40
> set queue med_12h resources_max.walltime = 12:00:00
> set queue med_12h acl_group_enable = True
> set queue med_12h acl_groups = nmrc
> set queue med_12h enabled = True
> set queue med_12h started = True
> #
> # Create and define queue route
> #
> create queue route
> set queue route queue_type = Route
> set queue route route_destinations = short_2h
> set queue route route_destinations += med_12h
> set queue route route_destinations += long_1w
> set queue route route_destinations += unspec
> set queue route route_destinations += guest
> set queue route enabled = True
> set queue route started = True
> #
> # Create and define queue unspec
> #
> create queue unspec
> set queue unspec queue_type = Execution
> set queue unspec Priority = 20
> set queue unspec acl_group_enable = True
> set queue unspec acl_groups = nmrc
> set queue unspec enabled = True
> set queue unspec started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server acl_hosts = steel
> set server managers = root@steel.mib.man.ac.uk
> set server operators = root@steel.mib.man.ac.uk
> set server default_queue = route
> set server log_events = 511
> set server mail_from = adm
> set server query_other_jobs = True
> set server resources_default.walltime = 336:00:00
> set server scheduler_iteration = 600
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server queue_centric_limits = True
> set server mom_job_sync = True
> set server keep_completed = 300
> set server next_job_number = 9066
>
>
> Thanks
>
> Phil
>
>
> Quoting Steve Young <chemadm at hamilton.edu>:
>
>> Hi,
>> 	I use a server default for Torque:
>>
>> set server resources_default.walltime = 24:00:00
>>
>> This way, if they don't specify anything, they will default to 24
>> hours. I took the approach that if the user doesn't specify anything,
>> they should get a minimal amount of queue time. With this I don't
>> have to have a queue to handle unspecified. I'd rather have their job
>> finish fairly quickly and realize they didn't specify a time than
>> have them go for days/weeks before they realize they didn't specify
>> it. I'd hate to have a job run for two weeks and then end up getting
>> killed off because I didn't specify my time, especially for a job that
>> can't pick up where it left off and has to start from the beginning
>> again. Seems like a waste of resources to me. Not sure if this helps
>> you any. Could you send the rest of the qmgr output?
>> It's hard to tell why it's getting to the unspec queue if we can't see
>> the config for it.
>>
>> -Steve
>>
>>
>>
>> On Apr 21, 2009, at 1:06 PM, Philip Peartree wrote:
>>
>>> The default queue is the routing queue, which should place the job
>>> based on allowed time. That is why it's so puzzling that the jobs end
>>> up in the short_2h queue, as they should be rejected by that and the
>>> others until they reach the unspec queue.
>>>
>>>
>>> Quoting "Greenseid, Joseph M (IS)" <Joseph.Greenseid at ngc.com>:
>>>
>>>> have you tried to set the default queue (set server default_queue =
>>>> unspec) in qmgr?  this is how i route jobs that don't specify
>>>> resources to a default location...
>>>>
>>>> --Joe
>>>>
>>>> ________________________________
>>>>
>>>> From: mauiusers-bounces at supercluster.org on behalf of Philip Peartree
>>>> Sent: Tue 4/21/2009 12:32 PM
>>>> To: torqueusers at supercluster.org; mauiusers at supercluster.org
>>>> Subject: [Mauiusers] Jobs going into incorrect queue
>>>>
>>>>
>>>>
>>>> Hi Guys
>>>>
>>>> I have a problem: jobs appear not to be routing to the correct
>>>> queue. My setup is as follows:
>>>>
>>>> routing queue
>>>> 2h queue
>>>> 12h queue
>>>> 1w queue
>>>> unspecified time queue (max time 2w)
>>>> guest queue (low priority)
>>>>
>>>> If a time is unspecified at job submission, a default time of 2w
>>>> (336h) is set.
>>>>
>>>> The routing queue is set up as follows (as taken from qmgr -c 'print
>>>> server'):
>>>>
>>>> create queue route
>>>> set queue route queue_type = Route
>>>> set queue route route_destinations = short_2h
>>>> set queue route route_destinations += med_12h
>>>> set queue route route_destinations += long_1w
>>>> set queue route route_destinations += unspec
>>>> set queue route route_destinations += guest
>>>> set queue route enabled = True
>>>> set queue route started = True
>>>>
>>>> My problem is that some jobs with unspecified time (which have
>>>> correctly been given a time of 336h) are ending up in the short_2h
>>>> queue, which has a higher priority than the other queues. Does anyone
>>>> know of any possible explanation for this?
>>>>
>>>> Phil Peartree
>>>> University of Manchester
>>>>
>>>> _______________________________________________
>>>> mauiusers mailing list
>>>> mauiusers at supercluster.org
>>>> http://www.supercluster.org/mailman/listinfo/mauiusers
>>>>
>>>>
>>>>
>>>
>>
>
>
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers


