[torqueusers] [Mauiusers] Jobs going into incorrect queue

Steve Young chemadm at hamilton.edu
Thu Apr 23 07:46:07 MDT 2009


Hi,
	I'm wondering about this from the manual:

>> In either mode, queue defaults override the server defaults.

I have default walltimes set on each of my queues:

set queue altix resources_default.walltime = 24:00:00

I'm wondering what would happen if you set a default walltime on each
of your queues. Of course, as Michel pointed out, you might have to put
unspec first in route_destinations to catch the case where the user
doesn't specify a walltime; those jobs would then end up in unspec with
a default walltime of 336:00:00. Overall, it sounds like
queue_centric_limits isn't working the way the Admin Manual describes.
Not sure if this will help, but I hope it does.
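
To make the ordering argument concrete, here is a toy Python model of
first-match routing. It is only a sketch and not TORQUE's actual code:
the queue names and maximum walltimes come from this thread, the minimum
walltimes are made-up values, the guest queue is left out, and the rule
"a job that requests no walltime passes every min/max check" is an
assumption based on the behaviour Philip and Michel describe.

```python
from typing import Optional, Sequence

HOUR = 3600

# queue name -> (min walltime, max walltime) in seconds; None = no limit.
# Max values follow the thread; min values are illustrative assumptions.
QUEUES = {
    "short_2h": (None, 2 * HOUR),
    "med_12h": (2 * HOUR + 1, 12 * HOUR),
    "long_1w": (12 * HOUR + 1, 168 * HOUR),
    "unspec": (168 * HOUR + 1, 336 * HOUR),
}


def accepts(queue: str, walltime: Optional[int]) -> bool:
    """Return True if the queue's walltime limits admit the job."""
    if walltime is None:
        # Assumption: limits on a resource the job never requested
        # are not checked, so the job is admitted.
        return True
    low, high = QUEUES[queue]
    if low is not None and walltime < low:
        return False
    if high is not None and walltime > high:
        return False
    return True


def route(walltime: Optional[int], destinations: Sequence[str],
          default: Optional[int] = None,
          apply_default_first: bool = False) -> Optional[str]:
    """Return the first destination that accepts the job, or None.

    apply_default_first models a default walltime being applied
    before the routing queue is evaluated.
    """
    if apply_default_first and walltime is None:
        walltime = default
    for queue in destinations:
        if accepts(queue, walltime):
            return queue
    return None


ORIGINAL_ORDER = ["short_2h", "med_12h", "long_1w", "unspec"]
REVERSED_ORDER = ["unspec", "long_1w", "med_12h", "short_2h"]
```

In this model, route(None, ORIGINAL_ORDER) lands in short_2h, which
matches what Philip is seeing. Either applying the 336h default before
routing or using REVERSED_ORDER sends the same job to unspec, while a
job that explicitly asks for one hour still reaches short_2h.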

-Steve



On Apr 22, 2009, at 6:47 PM, Philip Peartree wrote:

> I have tested my system, and it appears that a job with no walltime
> specified gets routed to the first queue (short_2h), despite
> queue_centric_limits being enabled and a default walltime being set.
> If I try to submit a job with a walltime of 336:00:00 directly to the
> short_2h queue, I get:
>
> qsub: Job exceeds queue resource limits MSG=cannot satisfy queue max walltime requirement
>
> It appears that TORQUE does not apply the walltime default until the
> job has reached the execution queue, whereas my understanding is that
> with queue_centric_limits it should be applied before the job enters
> the routing queue.
>
>
> Any ideas?
>
>
> Quoting Michel Béland <michel.beland at rqchp.qc.ca>:
>
>> Hello,
>>
>>> I have a problem that jobs appear to be not routing to the correct
>>> queue. My set up is as follows:
>>>
>>> routing queue
>>> 2h queue
>>> 12h queue
>>> 1w queue
>>> unspecified time queue (max time 2w)
>>> guest queue (low priority)
>>>
>>> If a time is unspecified at job submission a default time of 2w
>>> (336h) is set
>>>
>>> The routing queue is set up as follows (as taken from
>>> qmgr -c 'print server'):
>>>
>>> create queue route
>>> set queue route queue_type = Route
>>> set queue route route_destinations = short_2h
>>> set queue route route_destinations += med_12h
>>> set queue route route_destinations += long_1w
>>> set queue route route_destinations += unspec
>>> set queue route route_destinations += guest
>>> set queue route enabled = True
>>> set queue route started = True
>>>
>>> My problem is that some jobs with unspecified time (which have
>>> correctly been given a time of 336h) are ending up in the short_2h
>>> queue, which has a higher priority than the other queues. Does
>>> anyone know of any possible explanation for this?
>>
>> Here is what you can read in Torque Admin Manual:
>>
>> "The time of enforcement of server and queue defaults is important in
>> this example. TORQUE applies server and queue defaults differently in
>> job centric and queue centric modes. For job centric mode, TORQUE
>> waits to apply the server and queue defaults until the job is
>> assigned to its final execution queue. For queue centric mode, it
>> enforces server defaults before it is placed in the routing queue.
>> In either mode, queue defaults override the server defaults. TORQUE
>> defaults to job centric mode. To set queue centric mode, set
>> queue_centric_limits, as in what follows:
>>
>> qmgr
>>
>> set server queue_centric_limits = true"
>>
>> I think that it should work. Another way would be to define the
>> route_destinations the other way around, making sure to have
>> resources_min and resources_max set for all execution queues. If
>> unspec is first, jobs with unspecified resource limits will go there
>> first, regardless of the queue_centric_limits setting.
>>
>> Yet another way to make this work is to make sure that every job has
>> a walltime limit. At our site, the default walltime limit is 0, so
>> people have to specify it explicitly. You can, however, make sure
>> that the limit is present by using a submit filter that adds a
>> walltime limit to the script if it is not present.
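
A submit filter along those lines could look roughly like the Python
sketch below. It assumes the filter receives the job script on stdin
and that whatever it writes to stdout is what qsub submits. The
function names add_walltime and main are invented for this example,
and the 336:00:00 default is simply the 2w value from this thread.
The sketch only looks at #PBS lines in the script, so a walltime given
with -l on the qsub command line is not detected.

```python
#!/usr/bin/env python3
import re
import sys

# Assumed site default: the 2w (336h) value mentioned in this thread.
DEFAULT_WALLTIME = "336:00:00"

# Matches any "#PBS ..." directive line that already sets walltime,
# e.g. "#PBS -l walltime=01:00:00" or "#PBS -l nodes=1,walltime=2:00:00".
WALLTIME_RE = re.compile(r"^#PBS\s+.*\bwalltime=", re.MULTILINE)


def add_walltime(script: str, default: str = DEFAULT_WALLTIME) -> str:
    """Return the script with a walltime directive added if none exists."""
    if WALLTIME_RE.search(script):
        return script  # the user already asked for a walltime
    directive = f"#PBS -l walltime={default}\n"
    lines = script.splitlines(keepends=True)
    if lines and lines[0].startswith("#!"):
        # Keep the interpreter line first and put the directive after it.
        first = lines[0] if lines[0].endswith("\n") else lines[0] + "\n"
        return first + directive + "".join(lines[1:])
    return directive + script


def main() -> None:
    """Filter entry point: read the job script, write the patched one."""
    sys.stdout.write(add_walltime(sys.stdin.read()))

# When installed as a filter, the file would end with a call to main().
```

The directive is inserted directly after the interpreter line so that
it stays ahead of the first executable command in the script. How the
filter is registered (for example a SUBMITFILTER entry in torque.cfg)
depends on the TORQUE version, so check the documentation for yours.
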
>>
>> Hope this helps,
>>
>> --
>> Michel Béland, analyste en calcul scientifique
>> michel.beland at rqchp.qc.ca
>> bureau S-250, pavillon Roger-Gaudry (principal), Université de Montréal
>> téléphone   : 514 343-6111 poste 3892     télécopieur : 514 343-2155
>> RQCHP (Réseau québécois de calcul de haute performance)  www.rqchp.qc.ca
>> _______________________________________________
>> mauiusers mailing list
>> mauiusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/mauiusers
>>
>
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers


