[torquedev] Trunk And Multithreading
glen.beane at gmail.com
Sat Dec 11 12:18:49 MST 2010
2010/12/11 "Mgr. Šimon Tóth" <SimonT at mail.muni.cz>:
>> Thank you for your comments. We understand the words of caution from you
>> and Simon. We are also conscious of the fact that we are moving forward
>> with a lot of changes and not always getting input from the community.
>> There is a bit of urgency to get TORQUE to the point it can scale. SLURM
>> is moving forward with their scaling ability in response to users who
>> are currently creating clusters with 10,000 plus nodes and multiple cores.
> Hmm, the problem I see is that making Torque suitable for 10k+ clusters
> doesn't make much sense when you don't have a scheduler capable of working
> with such large systems.
> Correct me if I'm wrong, but the only schedulers I know of that are
> capable of managing clusters of that size are grid middleware schedulers,
> which don't need the clusters to be managed by a single batch server.
> Wouldn't it make more sense to make pbs_sched usable first?
I think you are the only person using pbs_sched for a non-trivial
setup. Everyone else uses Maui (which is free) or Moab. Moab is
capable of managing clusters of that size. I don't know how well Maui
scales, but I'm sure people are using it for 1k+ nodes.
>> We are hearing of plans to build systems with over 100,000 nodes and
>> right now TORQUE cannot manage such a system. I have published on this
>> list and at SC'10 what we plan for TORQUE 4.0 (3.1 in my SC'10
>> presentation). We are 1) making TORQUE multi-threaded, 2) adding a
>> hierarchical job launch, and 3) changing the way Server-to-MOM
>> and MOM-to-MOM communication works. Any and all ideas about how to
>> improve these are welcomed and encouraged.
> Is the presentation available somewhere?
> We are solving this problem from the opposite direction: we are
> extending the distributed system support in Torque. Basically the idea
> is to split the grid into smaller sites that still externally behave as
> a single server. Our current system scales easily over tens of sites
> (with each site handling whatever one server can handle). We are
> currently researching and designing a system that will scale over
> thousands of sites.
> As for the threading support, well the problem is that this is like the
> last change I would consider implementing. The code base is a mess and
> introducing threads into the code will make further modifications
> extremely hard.
I agree the code is a mess, and adding more #ifdef'd out features
makes it even worse.
> It's already very hard to implement new features since there are many
> implied protocols in the server and mom. I bumped into many problems in
> both server and mom simply by increasing the amount of handling code
> (which caused a delay and a cascade failure).
> I still have to look through the threading code in more depth, but I
> certainly don't like what I have seen so far. You are wrapping
> threads around old, non-thread-safe code. What should have been done
> instead is the creation of thread-safe versions of the old functions.
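To make the distinction above concrete, here is a minimal sketch of the two
approaches. It is not TORQUE code: it is written in Python for brevity, and
every name in it (next_field_legacy, split_fields_wrapped, next_field_r,
split_fields_r) is hypothetical. The legacy function keeps its position in
hidden shared state, much like a C function with a static buffer. Approach 1
wraps it in a lock, so every caller serializes. Approach 2 is a reentrant
version where the caller owns the state, so no lock is needed.

```python
import threading

# Legacy style: position kept in hidden module-level state,
# analogous to a C function that uses a static buffer.
_cursor = {"text": "", "pos": 0}

def next_field_legacy(text=None, sep=":"):
    """Pass text to start a scan, None to continue it. Not thread safe."""
    if text is not None:
        _cursor["text"], _cursor["pos"] = text, 0
    s, i = _cursor["text"], _cursor["pos"]
    if i > len(s):
        return None
    j = s.find(sep, i)
    if j < 0:
        j = len(s)
    _cursor["pos"] = j + 1
    return s[i:j]

# Approach 1: wrap the legacy function in a single global lock.
_legacy_lock = threading.Lock()

def split_fields_wrapped(text, sep=":"):
    # The lock has to cover the whole start/continue sequence,
    # so all threads take turns here.
    with _legacy_lock:
        out = []
        field = next_field_legacy(text, sep)
        while field is not None:
            out.append(field)
            field = next_field_legacy(None, sep)
        return out

# Approach 2: a reentrant version. The caller passes the position in
# and gets the new position back, so nothing is shared between threads.
def next_field_r(text, pos, sep=":"):
    """Return (field, new_pos), or (None, pos) when the text is exhausted."""
    if pos > len(text):
        return None, pos
    j = text.find(sep, pos)
    if j < 0:
        j = len(text)
    return text[pos:j], j + 1

def split_fields_r(text, sep=":"):
    out, pos = [], 0
    field, pos = next_field_r(text, pos, sep)
    while field is not None:
        out.append(field)
        field, pos = next_field_r(text, pos, sep)
    return out
```

Both produce the same result for a single caller. The difference is that the
wrapped version is only safe if every call site goes through the lock, while
the reentrant one has no shared state to protect.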
>> We have chosen to put this work into trunk with the knowledge that it
>> will create instability. But we also have confidence that we will be
>> able to address the stability problems as the new version is deployed.
>> In the meantime we have the 2.4, 2.5 and 3.0 branches, which are
>> available for use. 2.5 and 3.0 can also be improved with minor feature
>> changes.
> It's more of a problem with following the changes. If there were a
> feature branch I could just check the diff; now it's mixed with all the
> other stuff :-/
This is a valid point: a branch that isolates the threading changes
makes it easier to see which changes are specifically to address
> Mgr. Šimon Tóth