Bugzilla – Bug 16
Enhancement of the main loop
Last modified: 2010-11-18 08:21:20 MST
I'm currently working on a scheduler based upon the FIFO scheduler. Its final purpose is to run on an M:N architecture (many servers connected to many schedulers). While implementing it I ran into a problem with the pbs_sched.c main loop. Currently the scheduler processes commands one by one. The problem is that the server is capable of generating a lot of commands in a very short time period, while the scheduler takes considerable time to run one scheduling cycle. The result is that although the scheduler has already processed all changes on the server within the first two loops, there can be 10 or even 100 more commands waiting. The change I implemented is that instead of running the scheduling cycle for each command, all connections are accepted and all commands are fetched beforehand. After this is done, the commands are processed (either the old way, by running the scheduling process for each distinct command [duplicates are ignored], or by passing the commands as a set). This improves response times significantly. The original implementation had a very bad habit of starving servers (when more than one server was connected to one scheduler). The main changes are in pbs_sched.c. Support for the new scheduler invocation in the FIFO scheduler is included in the patch.
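To illustrate the idea (this is not the code from the attached patch, which modifies the C main loop in pbs_sched.c), here is a minimal Python sketch of the batching approach described above. All names in it (`drain_pending`, `distinct_commands`, `run_cycle`, the server names and command labels) are hypothetical, and it assumes that "duplicate" means the same command arriving from the same server.

```python
from collections import deque


def drain_pending(connections):
    """Fetch every command already waiting on every server connection.

    `connections` maps a (hypothetical) server name to a deque of
    pending command labels. Each deque is emptied.
    """
    pending = []
    for server, queue in connections.items():
        while queue:
            pending.append((server, queue.popleft()))
    return pending


def distinct_commands(pending):
    """Drop duplicate (server, command) pairs, keeping first-seen order."""
    return list(dict.fromkeys(pending))


def run_cycle(connections, schedule):
    """One pass of the modified loop: fetch everything first, then schedule.

    `schedule` stands in for the expensive scheduling cycle and is called
    once per distinct (server, command) pair instead of once per command.
    """
    batch = distinct_commands(drain_pending(connections))
    for server, command in batch:
        schedule(server, command)
    return batch


if __name__ == "__main__":
    # Hypothetical backlog: serverA sent the same command twice.
    connections = {
        "serverA": deque(["new_job", "new_job", "job_end"]),
        "serverB": deque(["new_job"]),
    }
    calls = []
    run_cycle(connections, lambda server, cmd: calls.append((server, cmd)))
    print(f"{len(calls)} scheduling cycles run for 4 received commands")
```

Because every connection is drained before any scheduling starts, a busy server cannot keep the loop occupied while commands from another server sit unread, which is the starvation case mentioned above.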
Created an attachment (id=10) [details] patch
thanks for the patch. Can you bring this topic up on the torque-dev mailing list? I have a bunch of other patches to get in, and don't really have a lot of time to check this out. It sounds good, but we should make sure it won't have any unintended consequences for other users. If other developers like the idea, we can get it committed to subversion. I'd also like to thank you for all your recent contributions to TORQUE. We need more community involvement. Maybe in the future, if you keep involved, you can get svn commit access.
(In reply to comment #2)
> thanks for the patch. Can you bring this topic up on the torque-dev mailing
> list? I have a bunch of other patches to get in, and don't really have a lot
> of time to check this out. It sounds good, but we should make sure it
> won't have any unintended consequences for other users.
>
> If other developers like the idea, we can get it committed to subversion.

It's already there :-)
no one commented on torquedev? want to bump the thread? I'd like to see what the other developers think.
(In reply to comment #4)
> no one commented on torquedev? want to bump the thread? I'd like to see what
> the other developers think.

I will create a second part of the patch soon (push the actual scheduling into a separate thread). That should draw some attention. There are still some minor problems I need to solve, or to be more precise, make sure that they work as I think they should.
I no longer maintain this patch. The actual speed increase wasn't as great as expected, because server speed isn't a concern (for the most part).