[torqueusers] [Mauiusers] Jobs going into incorrect queue

Philip Peartree P.Peartree at postgrad.manchester.ac.uk
Wed Apr 22 16:47:19 MDT 2009


I have tested my system, and it appears that a job submitted without a
walltime gets routed to the first queue (short_2h), despite
queue_centric_limits being enabled and a default walltime being set. If I
try to submit a job with a walltime of 336:00:00 directly to the short_2h
queue I get:

qsub: Job exceeds queue resource limits MSG=cannot satisfy queue max walltime requirement

It appears that TORQUE is not applying the walltime default until the job
has reached the execution queue, whereas my understanding of
queue_centric_limits is that the default should be applied before the job
enters the routing queue.
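
To make the difference concrete, here is a rough Python model of the two behaviours. It is only a sketch of my understanding, not TORQUE's actual code: the per-queue maximum walltimes are assumptions inferred from the queue names, the guest queue is left out, and route() and its parameter are hypothetical names.

```python
HOUR = 3600

# (queue name, assumed resources_max.walltime in seconds), listed in
# route_destinations order. These limits are guesses from the queue names.
DESTINATIONS = [
    ("short_2h", 2 * HOUR),
    ("med_12h", 12 * HOUR),
    ("long_1w", 168 * HOUR),
    ("unspec", 336 * HOUR),
]

SERVER_DEFAULT_WALLTIME = 336 * HOUR  # the 2w default described in this thread


def route(walltime, apply_default_before_routing):
    """Return (queue, final walltime) for a job; walltime=None means unspecified."""
    if walltime is None and apply_default_before_routing:
        # Expected queue centric behaviour: default applied before routing.
        walltime = SERVER_DEFAULT_WALLTIME
    for name, max_walltime in DESTINATIONS:
        # A job with no walltime cannot violate a limit, so the first
        # destination in the list accepts it.
        if walltime is None or walltime <= max_walltime:
            if walltime is None:
                # Observed behaviour: default applied only after the job
                # has already landed in the execution queue.
                walltime = SERVER_DEFAULT_WALLTIME
            return name, walltime
    raise ValueError("no destination queue accepts this job")


print(route(None, apply_default_before_routing=False))
print(route(None, apply_default_before_routing=True))
```

In this model the first call lands in short_2h with a walltime of 336h, which matches what I am seeing, and the second lands in unspec, which is what I expected queue_centric_limits to give me.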


Any ideas?


Quoting Michel Béland <michel.beland at rqchp.qc.ca>:

> Hello,
>
>> I have a problem: jobs do not appear to be routed to the correct
>> queue. My setup is as follows:
>>
>> routing queue
>> 2h queue
>> 12h queue
>> 1w queue
>> unspecified time queue (max time 2w)
>> guest queue (low priority)
>>
>> If no time is specified at job submission, a default time of 2w
>> (336h) is set.
>>
>> The routing queue is set up as follows (taken from qmgr -c 'print server'):
>>
>> create queue route
>> set queue route queue_type = Route
>> set queue route route_destinations = short_2h
>> set queue route route_destinations += med_12h
>> set queue route route_destinations += long_1w
>> set queue route route_destinations += unspec
>> set queue route route_destinations += guest
>> set queue route enabled = True
>> set queue route started = True
>>
>> My problem is that some jobs with unspecified time (which have
>> correctly been given a time of 336h) are ending up in the short_2h
>> queue, which has a higher priority than the other queues. Does anyone
>> know of a possible explanation for this?
>
> Here is what you can read in Torque Admin Manual:
>
> "The time of enforcement of server and queue defaults is important in
> this example. TORQUE applies server and queue defaults differently in
> job centric and queue centric modes. For job centric mode, TORQUE waits
> to apply the server and queue defaults until the job is assigned to its
> final execution queue. For queue centric mode, it enforces server
> defaults before it is placed in the routing queue. In either mode, queue
> defaults override the server defaults. TORQUE defaults to job centric
> mode. To set queue centric mode, set queue_centric_limits, as in what
> follows:
>
> qmgr
>
> set server queue_centric_limits = true"
>
> I think that it should work. Another way would be to define the
> route_destinations in the reverse order, making sure to set
> resources_min and resources_max for all execution queues. If unspec is
> first, jobs with unspecified resource limits will go there first,
> regardless of the queue_centric_limits setting.
>
> Yet another way to make this work is to make sure that every job has a
> walltime limit. At our site, the default walltime limit is 0, so people
> have to specify it explicitly. You can however make sure that the limit
> is present by using a submit filter that adds a walltime limit to the
> script if it is not present.
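
A minimal sketch of such a submit filter, in Python, might look like the following. It assumes the usual filter contract as I understand it (qsub passes the job script on stdin and its own command-line arguments as the filter's arguments, and the filter writes the possibly modified script to stdout). The 336:00:00 value is just the 2w default from this thread, and add_default_walltime() and main() are hypothetical names.

```python
#!/usr/bin/env python3
import re
import sys

DEFAULT_WALLTIME = "336:00:00"  # assumption: the 2w default from this thread

# A "#PBS -l ..." directive whose resource list contains a walltime request.
WALLTIME_DIRECTIVE = re.compile(r"^#PBS\s+-l\s*.*\bwalltime=")


def add_default_walltime(script, argv=()):
    """Return the script, with a walltime directive added if none was given."""
    # Walltime given on the qsub command line: leave the script alone.
    if any("walltime=" in arg for arg in argv):
        return script
    lines = script.splitlines(keepends=True)
    # Walltime already requested inside the script: leave it alone.
    if any(WALLTIME_DIRECTIVE.match(line) for line in lines):
        return script
    # Keep a shebang as the first line and put the directive right after it.
    insert_at = 1 if lines and lines[0].startswith("#!") else 0
    if insert_at == 1 and not lines[0].endswith("\n"):
        lines[0] += "\n"
    lines.insert(insert_at, f"#PBS -l walltime={DEFAULT_WALLTIME}\n")
    return "".join(lines)


def main():
    # Filter entry point: job script in on stdin, modified script out on stdout.
    sys.stdout.write(add_default_walltime(sys.stdin.read(), sys.argv[1:]))
```

To use it as a real filter, main() would need to be called at the bottom of the file, and the script made executable and installed wherever qsub looks for its filter (the SUBMITFILTER entry in torque.cfg, if I remember correctly). Because every job then carries an explicit walltime before it reaches the routing queue, the routing decision no longer depends on when the defaults are applied.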
>
> Hope this helps,
>
> --
> Michel Béland, analyste en calcul scientifique
> michel.beland at rqchp.qc.ca
> bureau S-250, pavillon Roger-Gaudry (principal), Université de Montréal
> téléphone   : 514 343-6111 poste 3892     télécopieur : 514 343-2155
> RQCHP (Réseau québécois de calcul de haute performance)  www.rqchp.qc.ca
