[Mauiusers] [torqueusers] Jobs going into incorrect queue

Michel Béland michel.beland at rqchp.qc.ca
Wed Apr 22 12:31:30 MDT 2009


Hello,

> I have a problem that jobs appear to be not routing to the correct  
> queue. My set up is as follows:
> 
> routing queue
> 2h queue
> 12h queue
> 1w queue
> unspecified time queue (max time 2w)
> guest queue (low priority)
> 
> If a time is unspecified at job submission a default time of 2w (336h) is set
> 
> The routing queue is setup as follows (as taken from qmgr -c 'print server')
> 
> create queue route
> set queue route queue_type = Route
> set queue route route_destinations = short_2h
> set queue route route_destinations += med_12h
> set queue route route_destinations += long_1w
> set queue route route_destinations += unspec
> set queue route route_destinations += guest
> set queue route enabled = True
> set queue route started = True
> 
> my problem is that some jobs with unspecified time (which have  
> correctly been given a time of 336h) are ending up in the short_2h  
> queue, which has a higher priority than other queues. Does anyone know  
> of any possible explanation for this?

Here is what the Torque Admin Manual says:

"The time of enforcement of server and queue defaults is important in
this example. TORQUE applies server and queue defaults differently in
job centric and queue centric modes. For job centric mode, TORQUE waits
to apply the server and queue defaults until the job is assigned to its
final execution queue. For queue centric mode, it enforces server
defaults before it is placed in the routing queue. In either mode, queue
defaults override the server defaults. TORQUE defaults to job centric
mode. To set queue centric mode, set queue_centric_limits, as in what
follows:

qmgr

set server queue_centric_limits = true"
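
To make the difference between the two modes concrete, here is a toy
model in Python. It is only a sketch of the behaviour described in the
passage above, not TORQUE code. The per-queue walltime limits are
assumptions inferred from the queue names (the original post does not
list them), the guest queue is left out, and the rule that an unset
resource is not checked against a queue limit is also an assumption.

```python
HOUR = 3600

# Assumed resources_max.walltime per execution queue, in routing order.
QUEUES = [
    ("short_2h", 2 * HOUR),
    ("med_12h", 12 * HOUR),
    ("long_1w", 168 * HOUR),
    ("unspec", 336 * HOUR),
]
SERVER_DEFAULT = 336 * HOUR  # the 2w default mentioned in the question


def fits(walltime, max_walltime):
    # Assumption: a resource the job did not request is not checked.
    return walltime is None or walltime <= max_walltime


def route(walltime, queue_centric):
    """Return (queue name, final walltime) for a job entering the route queue."""
    if queue_centric and walltime is None:
        # Queue centric: the default is applied before routing.
        walltime = SERVER_DEFAULT
    for name, max_walltime in QUEUES:
        if fits(walltime, max_walltime):
            if walltime is None:
                # Job centric: the default is applied only after the
                # job has reached its execution queue.
                walltime = SERVER_DEFAULT
            return name, walltime
    return None, walltime
```

In this model, route(None, queue_centric=False) returns short_2h with a
walltime of 336h, which is the symptom reported, while
route(None, queue_centric=True) returns unspec.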

I think that setting queue_centric_limits should solve your problem.
Another way would be to list the route_destinations in the opposite
order, making sure that resources_min and resources_max are set for all
execution queues. If unspec comes first, jobs with unspecified resource
limits will go there first, regardless of the queue_centric_limits
setting.
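
As a rough sketch, the reordered setup could look like the qmgr
commands below. The walltime values are assumptions based on the queue
names, and leaving guest last is also an assumption, so adjust both to
your site. The idea is that a job with an explicit walltime fails the
resources_min check of every queue that is too long for it and falls
through to the right one, while a job with no walltime stops at unspec.

```
# Hypothetical limits; replace with the site's real values.
set queue route route_destinations = unspec
set queue route route_destinations += long_1w
set queue route route_destinations += med_12h
set queue route route_destinations += short_2h
set queue route route_destinations += guest

set queue unspec resources_min.walltime = 168:00:01
set queue unspec resources_max.walltime = 336:00:00
set queue long_1w resources_min.walltime = 12:00:01
set queue long_1w resources_max.walltime = 168:00:00
set queue med_12h resources_min.walltime = 02:00:01
set queue med_12h resources_max.walltime = 12:00:00
set queue short_2h resources_max.walltime = 02:00:00
```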

Yet another way to make this work is to make sure that every job has a
walltime limit. At our site, the default walltime limit is 0, so people
have to specify it explicitly. You can however make sure that the limit
is present by using a submit filter that adds a walltime limit to the
script if it is not present.
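
Here is a minimal sketch of such a submit filter in Python. It assumes
the usual filter interface: qsub passes its command-line arguments to
the filter, feeds the job script on stdin, and submits whatever the
filter writes to stdout. The 336:00:00 default and the function names
are hypothetical, chosen only for this example.

```python
import re
import sys

DEFAULT_WALLTIME = "336:00:00"  # assumption: the 2w default from this thread

# Matches a "#PBS ..." directive line that carries a walltime request.
WALLTIME_DIRECTIVE = re.compile(r"^\s*#PBS\s.*\bwalltime=")


def has_walltime(script_lines, qsub_args):
    """True if walltime is requested on the command line or in the script."""
    if any("walltime=" in arg for arg in qsub_args):
        return True
    return any(WALLTIME_DIRECTIVE.match(line) for line in script_lines)


def add_walltime(script_lines, qsub_args=(), default=DEFAULT_WALLTIME):
    """Return the script lines, with a walltime directive added if missing."""
    lines = list(script_lines)
    if has_walltime(lines, qsub_args):
        return lines
    # Keep a "#!" line first; the directive must precede any command.
    insert_at = 1 if lines and lines[0].startswith("#!") else 0
    lines.insert(insert_at, "#PBS -l walltime=%s\n" % default)
    return lines


def main(argv=None, stdin=sys.stdin, stdout=sys.stdout):
    """Filter entry point: read the script, write the adjusted script."""
    args = sys.argv[1:] if argv is None else argv
    stdout.writelines(add_walltime(stdin.readlines(), args))
    return 0  # a nonzero exit status would make qsub reject the job

# To install as a real filter, end the file with:
#     sys.exit(main())
```

The filter is normally enabled with a SUBMITFILTER line in torque.cfg;
check the documentation of your TORQUE version for the exact location.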

Hope this helps,

-- 
Michel Béland, analyste en calcul scientifique
michel.beland at rqchp.qc.ca
bureau S-250, pavillon Roger-Gaudry (principal), Université de Montréal
téléphone   : 514 343-6111 poste 3892     télécopieur : 514 343-2155
RQCHP (Réseau québécois de calcul de haute performance)  www.rqchp.qc.ca

