[torqueusers] RE: scheduling bug?
Alexander Saydakov
saydakov at yahoo-inc.com
Thu Jan 26 12:17:22 MST 2006
I have just seen this again. It looks like there is some randomness in
job selection despite 'sort_by: high_priority_first'. Most of the time
it works as expected, but sometimes it picks the wrong job. I don't specify
any resource requirements such as memory or disk space.
Note the 2 jobs running in the p2 queue and the one job in the p3 queue. They
have just started despite the pile of jobs waiting in the p1 queue.
> qstat -Q
Queue            Max Tot Ena Str Que Run Hld Wat Trn Ext Type
---------------- --- --- --- --- --- --- --- --- --- --- ----------
test               0   0 yes yes   0   0   0   0   0   0 Execution
p1                 0 102 yes yes 902 121   0   0   0   0 Execution
p2                 0 607 yes yes 605   2   0   0   0   0 Execution
p3                 0 610 yes yes 609   1   0   0   0   0 Execution
-----Original Message-----
From: Alexander Saydakov [mailto:saydakov at yahoo-inc.com]
Sent: Tuesday, January 24, 2006 2:24 PM
To: 'torqueusers at supercluster.org'
Subject: scheduling bug?
I am running torque-2.0.0p5 with the default scheduler. The only thing I
changed in sched_config is:
#sort_by: shortest_job_first ALL
sort_by: high_priority_first ALL
I have just seen that a few jobs from the test queue (priority 0) started
despite a bunch of jobs waiting in the p1 queue (priority 100).
All jobs in the test queue were submitted after the jobs in the p1 queue.
> qstat -Q
Queue            Max Tot Ena Str Que Run Hld Wat Trn Ext Type
---------------- --- --- --- --- --- --- --- --- --- --- ----------
test               0 605 yes yes 603   2   0   0   0   0 Execution
p1                 0 320 yes yes 214 106   0   0   0   0 Execution
I wonder how it is possible.
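For anyone who wants to watch for this condition mechanically, here is a minimal Python sketch that parses 'qstat -Q' output like the listing above and flags any queue that has running jobs while a higher-priority queue still has jobs queued. The function names (parse_qstat_q, suspicious_queues) are made up for illustration, and the PRIORITY table is entered by hand from the queue priorities mentioned in this message; it is not read from the server. This is only a heuristic: it cannot tell when a running job was started, so a flagged queue is a hint to look closer, not proof of a scheduler bug.

```python
# Sample 'qstat -Q' output, copied from the listing in this message.
QSTAT_Q = """\
Queue Max Tot Ena Str Que Run Hld Wat Trn Ext Type
---------------- --- --- --- --- --- --- --- --- --- --- ----------
test 0 605 yes yes 603 2 0 0 0 0 Execution
p1 0 320 yes yes 214 106 0 0 0 0 Execution
"""

# Assumption: priorities typed in by hand (test = 0, p1 = 100, as stated above).
PRIORITY = {"test": 0, "p1": 100}


def parse_qstat_q(text):
    """Return {queue_name: {"queued": int, "running": int}} from qstat -Q text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = {}
    for ln in lines[1:]:
        # Skip the dashed separator line under the header.
        if set(ln.strip()) <= {"-", " "}:
            continue
        row = dict(zip(header, ln.split()))
        rows[row["Queue"]] = {
            "queued": int(row["Que"]),
            "running": int(row["Run"]),
        }
    return rows


def suspicious_queues(rows, priority):
    """List (queue, blockers): queues running jobs while a higher-priority
    queue (the blockers) still has jobs waiting."""
    flagged = []
    for name, counts in rows.items():
        if counts["running"] == 0:
            continue
        blockers = [
            other
            for other, c in rows.items()
            if priority[other] > priority[name] and c["queued"] > 0
        ]
        if blockers:
            flagged.append((name, sorted(blockers)))
    return flagged


rows = parse_qstat_q(QSTAT_Q)
print(suspicious_queues(rows, PRIORITY))
```

With the sample listing, the test queue is flagged because it has running jobs while p1 still has jobs queued. The Tot column is deliberately ignored, since only the Que and Run counts matter for this check.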