Bug 16 - Enhancement of the main loop

Status: RESOLVED WONTFIX
Product: TORQUE
Component: pbs_sched
Version: 2.4.x
Platform: PC Linux
Importance: P5 enhancement
Assigned To: Glen
Reported: 2009-08-04 09:29 MDT by Simon Toth
Modified: 2010-11-18 08:21 MST

Attachments
patch (17.47 KB, patch)
2009-08-04 09:29 MDT, Simon Toth



Description Simon Toth 2009-08-04 09:29:33 MDT
I'm currently working on a scheduler based on the FIFO scheduler. Its final
purpose is to run in an M:N architecture (many servers connected to many
schedulers).

While implementing it, I ran into a problem with the pbs_sched.c main loop.
Currently the scheduler processes commands one by one. The problem is that
the server can generate a lot of commands in a very short time, while the
scheduler takes considerable time to run one scheduling cycle.

The result is that by the time the scheduler has finished processing all the
changes on the server (in the first two loops), 10 or even 100 more commands
can be waiting.

The change I implemented is that instead of running the scheduling cycle once
for each command, all connections are accepted and all commands are fetched
beforehand.

After this is done, the commands are processed: either the old way, by running
the scheduling process once for each distinct command (duplicates are ignored),
or by passing the commands on as a set. This improves response times
significantly.

The original implementation also had a very bad habit of starving servers
(when more than one server was connected to one scheduler).
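
In rough terms, the new loop looks like the sketch below (this is not the
actual patch; accept_conn(), read_command() and schedule_cycle() are
hypothetical stand-ins for the pbs_sched internals, and command codes are
assumed to be small non-negative integers):

/*
 * Minimal sketch of the batched main loop, not the actual patch.
 * accept_conn(), read_command() and schedule_cycle() are hypothetical
 * stand-ins for the pbs_sched internals.
 */
#include <poll.h>
#include <stdbool.h>
#include <string.h>

#define SCH_CMD_MAX 64

extern int  listen_fd;                /* scheduler's listening socket */
extern int  accept_conn(int lfd);     /* next pending connection, -1 if none */
extern int  read_command(int fd);     /* command code sent by the server */
extern void schedule_cycle(int cmd);  /* one full scheduling cycle */

void main_loop(void)
{
    bool pending[SCH_CMD_MAX];

    for (;;) {
        struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
        int fd, cmd;

        memset(pending, 0, sizeof(pending));

        /* Sleep until at least one server has something to say. */
        if (poll(&pfd, 1, -1) < 1)
            continue;

        /* Drain ALL waiting connections and fetch ALL commands first,
         * instead of running one scheduling cycle per command. */
        while ((fd = accept_conn(listen_fd)) >= 0) {
            cmd = read_command(fd);
            if (cmd >= 0 && cmd < SCH_CMD_MAX)
                pending[cmd] = true;  /* duplicate commands collapse here */
        }

        /* Run the cycle once per distinct command that was seen. */
        for (cmd = 0; cmd < SCH_CMD_MAX; cmd++)
            if (pending[cmd])
                schedule_cycle(cmd);
    }
}

The point is that poll() blocks only once per batch: everything that queued
up during the previous cycle is drained and deduplicated before the next
cycle starts, and no single server can monopolize the loop.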

The main changes are in pbs_sched.c.

Support for the new scheduler invocation for the FIFO scheduler is included
in the patch.
Comment 1 Simon Toth 2009-08-04 09:29:59 MDT
Created an attachment (id=10)
patch
Comment 2 Glen 2009-08-05 18:43:01 MDT
Thanks for the patch.  Can you bring this topic up on the torque-dev mailing
list?  I have a bunch of other patches to get in, and don't really have a lot
of time to check this one out.  It sounds good, but we should make sure it
won't have any unintended consequences for other users.

If other developers like the idea, we can get it committed to Subversion.


I'd also like to thank you for all your recent contributions to TORQUE.  We
need more community involvement.  Maybe in the future, if you stay involved,
you can get svn commit access.
Comment 3 Simon Toth 2009-08-06 02:12:40 MDT
(In reply to comment #2)
> Thanks for the patch.  Can you bring this topic up on the torque-dev mailing
> list?  I have a bunch of other patches to get in, and don't really have a lot
> of time to check this one out.  It sounds good, but we should make sure it
> won't have any unintended consequences for other users.
> 
> If other developers like the idea, we can get it committed to Subversion.

It's already there :-)
Comment 4 Glen 2009-08-19 20:08:43 MDT
No one commented on torque-dev?  Want to bump the thread?  I'd like to see
what the other developers think.
Comment 5 Simon Toth 2009-08-20 04:10:24 MDT
(In reply to comment #4)
> No one commented on torque-dev?  Want to bump the thread?  I'd like to see
> what the other developers think.

I will create a second part of the patch soon (pushing the actual scheduling
into a separate thread).

That should draw some attention.

There are still some minor problems I need to solve, or, to be more precise,
I need to make sure they work as I think they should.
Comment 6 Simon Toth 2010-11-18 08:21:20 MST
I am no longer maintaining this patch, since the actual speed increase wasn't
as great as expected: server speed isn't a concern (for the most part).