Case Study 6

A.21  Case Study: Collaboration & Economic Development Grids

Overview

A consortium of government, commercial and academic organizations partners to form a shared collaboration and economic development grid. The resulting organization, known as the Center for Development of Advanced Computing of India, states the following vision: "To emerge as the premier R&D Institution for the design, development and deployment of world class IT solutions for economic and human advancement." The organization states that the grid will be India's largest in terms of computational power and availability. (See also the Cluster Ohio Project, with 23 participating organizations across Ohio.)


Resources

The organization has eight initial clusters that will be made available in the grid, and this number will grow almost tenfold in the next few years. Resources will initially be accessed from 17 cities and approximately 40 to 60 organizations. Operating systems range from AIX and Solaris to various Linux distributions. Resource managers range from commercial products such as LoadLeveler to open source tools such as TORQUE and OpenPBS. Hardware characteristics are similarly heterogeneous from cluster to cluster.

Workload

Because the collaboration and economic development grid has constantly evolving relationships with new consuming and hosting organizations, the workload is highly unpredictable in size, duration, topic, purpose and priority. Handling workload dependencies and optimizations will require fine-grained intelligence, self-learning and tunability.

Solution

Moab's Grid Suite gives the organization a unified global view of the separate resources for planning and management purposes. Further, Moab's broad support for heterogeneous environments provides an important foundation: participating partners can innovate within their own systems to meet their own needs, without having to agree upon and unify resource managers, networks, architectures or other such aspects. The Web-based Moab Access Portal for Grids can be used as a unified submission method, while experienced local users can continue to use the resource manager commands they have invested time in learning on their own clusters. Some shared rules can be established for the entire system, while organizations keep additional sovereign rules that guide the use of the resources they purchased. Moab is able to dynamically adjust to the changing workload and apply optimization intelligence effectively in this highly complex environment.

Connecting to the collaboration and economic development grid using Moab does not limit a participating organization's ability to form collaborative relationships with other organizations that do not participate in the original grid. Using Moab, an individual site can create an unlimited number of associated grid relationships with other individual sites or with other grids. Moab supports a nearly boundless set of relationships and rule sets, so organizations can make their own political and partnership decisions and the technology can match the relationship they want.

With Moab, grids are not open doors to all resources. Rather, Moab allows organizations to put specific limitations on what is used, by whom, at what time and under which conditions. An individual department in one organization can have a relationship with a department of another organization even when the parent organizations do not. Different rules can apply to each grid relationship, allowing a custom association in which security, resource availability, local prioritization, network considerations, timing and other concerns are fully met as well as optimized. Moab allows for cluster-to-grid relationships, grid-to-grid relationships, grid-within-grid relationships and many other relationship combinations.

Establish a peer-to-peer grid across all internal clusters, allowing automatic load balancing across active clusters. Enable per-lab submission points that can migrate workload to local, partner, or commercial resources. By default, allow only priority or urgent workload to flow to external resources. Enable automated workload rollover and resubmission in the event of internal network or cluster failures. Provide admin notification prior to rollover to allow manual override of the rollover. Allow manual reconfiguration of external resource access rights to permit production use of external resources in the event of extended internal failures or excessive workload.
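The default routing rule described above (local resources first, external resources only for urgent workload unless an administrator opens them up) can be illustrated with a minimal Python sketch. This is not Moab configuration or a Moab API. The `Cluster`, `Job`, and `place` names, the tier labels, and the `allow_external_production` switch are all hypothetical, introduced only to show the decision logic.

```python
from dataclasses import dataclass

# Hypothetical resource tiers; illustrative labels, not Moab settings.
LOCAL, PARTNER, COMMERCIAL = "local", "partner", "commercial"
TIER_ORDER = {LOCAL: 0, PARTNER: 1, COMMERCIAL: 2}


@dataclass
class Cluster:
    name: str
    tier: str            # LOCAL, PARTNER, or COMMERCIAL
    online: bool = True
    free_nodes: int = 0


@dataclass
class Job:
    name: str
    nodes: int
    urgent: bool = False


def eligible_clusters(job, clusters, allow_external_production=False):
    """Return the clusters a job may use, preferring local over external."""
    result = []
    for cluster in clusters:
        # Skip clusters that are down or too small for the job.
        if not cluster.online or cluster.free_nodes < job.nodes:
            continue
        external = cluster.tier != LOCAL
        # Default rule: only urgent workload may leave local resources,
        # unless an administrator has opened external resources manually.
        if external and not (job.urgent or allow_external_production):
            continue
        result.append(cluster)
    return sorted(result, key=lambda c: TIER_ORDER[c.tier])


def place(job, clusters, allow_external_production=False):
    """Pick the most preferred eligible cluster name, or None if none fits."""
    candidates = eligible_clusters(job, clusters, allow_external_production)
    return candidates[0].name if candidates else None
```

In this sketch, marking a local cluster as offline models an internal failure: urgent jobs roll over to partner resources automatically, while ordinary jobs wait until an administrator sets `allow_external_production`, which mirrors the manual reconfiguration step.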

Enable service level agreements within local, partner, and commercial resources to support next-to-run scheduling and automated preemption based on workload priority. Allow workload to be redirected automatically as local workload levels drop or local systems are brought back online. This solution allows users to see and use all potential compute resources as if they were local, through local portals and graphical interfaces, even in the event of major local and remote failures.
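As a rough illustration of next-to-run ordering and priority-based preemption, the following Python sketch keeps a priority queue of waiting jobs and preempts the lowest-priority running job when a strictly higher-priority job arrives and no slot is free. It is an assumption-laden toy model, not how Moab implements service level agreements. The `PriorityScheduler` class and its methods are hypothetical.

```python
import heapq
from itertools import count


class PriorityScheduler:
    """Toy scheduler: highest priority runs next and may preempt lower priority."""

    def __init__(self, slots):
        self.slots = slots
        self.running = {}        # job name -> priority
        self._queue = []         # heap of (-priority, sequence, job name)
        self._seq = count()      # tie-breaker that keeps submission order

    def submit(self, name, priority):
        heapq.heappush(self._queue, (-priority, next(self._seq), name))
        return self._dispatch()

    def finish(self, name):
        self.running.pop(name)
        return self._dispatch()

    def _dispatch(self):
        events = []
        while self._queue:
            neg_priority, _, name = self._queue[0]
            priority = -neg_priority
            if len(self.running) < self.slots:
                # A slot is free: the head of the queue is next to run.
                heapq.heappop(self._queue)
                self.running[name] = priority
                events.append(("start", name))
                continue
            victim = min(self.running, key=self.running.get)
            if self.running[victim] >= priority:
                break            # nothing running has lower priority
            # Preempt the lowest-priority running job and requeue it.
            heapq.heappop(self._queue)
            victim_priority = self.running.pop(victim)
            heapq.heappush(self._queue, (-victim_priority, next(self._seq), victim))
            self.running[name] = priority
            events.append(("preempt", victim))
            events.append(("start", name))
        return events
```

Each call returns the list of start and preempt events it triggered, so a caller can see that a preempted job goes back into the queue and resumes once a slot frees up.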

