[torqueusers] Scheduler efficiency
franc.carter at gmail.com
Thu Jun 8 20:22:34 MDT 2006
We are using torque-1.2 with a site-specific TCL scheduling algorithm. The
number of jobs in the queue has grown significantly since we implemented it
(several ...), and the scheduler now takes a long time to make a decision and
uses lots of CPU.
Part of the problem appears to be that on every cycle the scheduler needs to
completely reread the entire state, instead of being able to find out just the
change that caused the scheduler to be invoked, e.g. job 1234 exited.
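To illustrate the difference I mean, here is a minimal Python sketch. It is not TORQUE code: the QueueCache class, the event tuple and the job attributes are all made up for illustration, and as far as I can tell the real protocol does not hand the scheduler an event like this.

```python
from dataclasses import dataclass, field


@dataclass
class QueueCache:
    """Scheduler-side copy of the queue, keyed by job id (hypothetical)."""
    jobs: dict = field(default_factory=dict)

    def full_reload(self, snapshot):
        """Current behaviour: replace everything; cost grows with queue size."""
        self.jobs = dict(snapshot)
        return len(snapshot)  # number of job records touched

    def apply_event(self, event):
        """Desired behaviour: touch only the job named in the event."""
        kind, job_id, attrs = event
        if kind == "exited":
            self.jobs.pop(job_id, None)
        else:  # assumed to be "queued" or "modified"
            self.jobs[job_id] = attrs
        return 1  # exactly one job record touched


cache = QueueCache()
touched_full = cache.full_reload({
    "1233": {"state": "Q"},
    "1234": {"state": "R"},
    "1235": {"state": "Q"},
})
touched_delta = cache.apply_event(("exited", "1234", None))
```

With thousands of queued jobs, the full reload touches every record on each cycle, while the event path touches one.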
I had a look through the source code and it looks like this information is
not available in the protocol - but my C is rather rusty.
Can someone confirm that this information is not available to the scheduler,
and is it available in the 2.0 version? More importantly, is anyone running a
scheduler that works 'efficiently' in the 1000's of jobs range?