
De-mystifying Grid Technologies

CIOL Bureau

23 Apr 2007 00:00 IST

Updated On 23 Apr 2007 00:25 IST


Computation has changed drastically since the days of the first computer. In the 60s and 70s, mainframes handled all processing and computation for government, scientific, and organizational needs. Then came desktops, or 'microcomputers.' Almost in parallel, the concepts of networking started to develop, and it didn't take long before clusters and grids were implemented. In this article, we look at the computational extremes achieved by taking clusters a step further. Yes, we are talking about the still-in-its-infancy yet very promising Computational Grid. Read on to find out what it is, how it works, and most importantly, which way it is heading.


What is a Grid?
Both the name and the concept are derived from the electric power grid. Put briefly, a grid is a way to share computational power and data storage over the Internet. Just as with the electric grid, you don't have to worry about where the power is coming from. Basically, the computational grid brings all the resources under it together into one entity. This collection of resources can then be used for high-end computation, and the combined storage of the participating systems provides a near-infinite but cheap storage option. While some might define a grid as a 'collection of clusters' or in other ways, we would like to stick to the definition above without tying it to any specific structure.

Now let us get down to a more elaborate definition. Grid computing can best be defined as a form of distributed computing that works by sharing computing, application, data, storage, or network resources across dynamic and geographically dispersed organizations or computers. This is the reason we say that a collection of clusters is not an appropriate definition. Clusters don't work by bringing together systems or computers located geographically apart. We will get down to differences between grids and clusters in detail a little later.

Grid technologies promise to change the way organizations tackle complex computational problems. However, the vision of large-scale resource sharing is not yet a reality in many areas. Grid computing is still evolving, and the standards and technology needed to enable it are still being developed.


Need for a grid
Science has advanced by leaps and bounds and has grown more dependent on computational power for research and analysis. A decade ago, one powerful machine was enough to analyze whatever data, say, a pharma researcher had. Things have changed a lot since then, specifically in areas such as medical research, nuclear physics, and molecular studies. For example, the data that scientists download from satellites monitoring activity in the outer layers of the atmosphere adds up to approximately 200 GB daily. Data recorded over just a week therefore runs to about 1.4 TB, and you can imagine the kind of processing power needed to perform computations on it. It has to be huge and powerful. This is one of the reasons scientists demanded a system with enough power and near-infinite storage to easily perform computation on the kind of data they accumulate. Scenarios like this led to the need for the Computational Grid. The rest, as they say, is history.

Grid architecture
Much like the electric grid from which the idea of the Computational Grid came, the architecture is a layered one. Grid applications form the topmost layer; these might be scientific, engineering, or commercial applications, or even Web portals. The next layer is that of the grid programming environments and tools. This layer provides the libraries, runtime interfaces, compilers, and most importantly, parallelization tools. Next comes the Grid Middleware, a layer that is largely a vendor-specific implementation. It is in charge of resource management, scheduling services, job submission, storage access, and info services across the entire grid. The middleware can be treated as one layer with two sub-layers, though some conceptualize these as two separate layers. The first, User-level Middleware, takes care of the first two of the tasks we just listed (resource management and scheduling). The second, Core Grid Middleware, handles the remaining ones. Now, since the grid uses the Internet as its communication, computation, and in fact storage infrastructure, and connects to clusters and grids across geographies, a Security Layer becomes indispensable. Also referred to as the Security Infrastructure, this layer provides authentication and secure communication. The bottommost layer is the 'Grid Fabric,' which is nothing but the existing 'network of networks' and its components: clusters running on various OSes, storage devices, databases, and even specific devices such as sensors.


Grid Architecture (top layer to bottom layer)

Grid applications: Science, engineering, commercial applications, Web portals

Grid programming environments and tools: Languages, interfaces, libraries, compilers, parallelization tools

User-level middleware (resource aggregators): Resource management and scheduling services

Core grid middleware: Job submission, storage access, info services, trading, accounting

Security infrastructure: Single sign-on, authentication, secure communication

Grid fabric: PCs, workstations, clusters, networks, software, databases, devices

At the heart of the Grid is what we call the broker. At a rather abstract level, the Grid works as follows. Once a job is submitted to a Grid, the broker discovers the resources that the user can access through 'Grid Information Servers.' It then negotiates with grid-enabled resources or their 'Agents' using middleware services and maps the job to those resources (known as scheduling in the Grid context). Next, it stages the data to be processed or the application to be run, a step referred to as 'Deployment' in the Grid context. Finally, the broker collects the results. Throughout, it also monitors the application's execution progress and deals with changes in the Grid structure and with resource failures.
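The broker's workflow described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the interface of any real grid middleware: the names `Resource`, `Job`, `InformationServer`, and `Broker` are hypothetical, 'negotiation' is reduced to checking for free CPUs, and a resource failure is simulated with a `ConnectionError`.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_cpus: int
    healthy: bool = True


@dataclass
class Job:
    job_id: str
    cpus_needed: int


class InformationServer:
    """Hypothetical stand-in for a 'Grid Information Server'."""

    def __init__(self, resources):
        self._resources = list(resources)

    def discover(self, user):
        # A real grid would also filter by the user's access rights;
        # this sketch only hides resources known to have failed.
        return [r for r in self._resources if r.healthy]


class Broker:
    def __init__(self, info_server):
        self.info_server = info_server

    def schedule(self, job, user):
        """Map the job to the first discovered resource with enough free CPUs."""
        for resource in self.info_server.discover(user):
            if resource.free_cpus >= job.cpus_needed:
                return resource
        return None

    def run(self, job, user, execute):
        """Schedule, deploy, and collect the result; reschedule on failure."""
        while True:
            resource = self.schedule(job, user)
            if resource is None:
                raise RuntimeError(f"no resource available for {job.job_id}")
            resource.free_cpus -= job.cpus_needed  # deployment: claim capacity
            try:
                return resource.name, execute(job, resource)  # collect result
            except ConnectionError:
                resource.healthy = False  # resource failure: drop it and retry
            finally:
                resource.free_cpus += job.cpus_needed  # always release capacity


def flaky_execute(job, resource):
    # Simulated execution in which one node has gone offline.
    if resource.name == "node-a":
        raise ConnectionError("node-a went offline")
    return f"{job.job_id} done"


broker = Broker(InformationServer([Resource("node-a", 4), Resource("node-b", 8)]))
placement = broker.run(Job("job-1", 2), "alice", flaky_execute)
print(placement)
```

Here the job is first mapped to `node-a`, which fails, so the broker marks it unhealthy and reschedules the job on `node-b`. Real brokers also weigh cost, data locality, and queue lengths, none of which is modeled here.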


Grid Vs Cluster computing


In a grid environment, we have a loosely coupled architecture of systems connected mostly over a Wide Area Network or the Internet. The job is more or less the same as that of a Computational Cluster, which is to harness the resources of multiple idle machines. But a Grid does not have to leverage only the processing power of its machines. You can instead create a Data Grid, which creates and manages distributed data storage and is also called a Grid.

The other key feature that differentiates a Grid from a Cluster is its decentralized model, where you generally don't have a controller in place and every node works independently. The nodes can also be heterogeneous in terms of operating systems and hardware architecture.

One example of grid computing is the well-known SETI@home project, which searches for extraterrestrial intelligence. A central radio telescope captures signals from space, and the captured data is then sent in small packets to several million computers connected to the Internet. The nodes process these packets in their idle time and return the results to a data center. This way, high processing power is obtained by utilizing the idle time of computers spread across the globe.


In this example you can clearly see that the architecture is decentralized and loosely coupled. It is also highly heterogeneous, because over the Internet one can't control which OS or architecture a node will be using.
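The packet-based model in the SETI@home example can be illustrated with a small Python sketch. It is a simplification under loud assumptions: threads on one machine stand in for volunteer computers, the 'analysis' is just finding the strongest sample, and the function names are hypothetical. It does not reflect the project's actual client or server software.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(samples, unit_size):
    """Cut the captured data into small packets (work units)."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


def analyze(unit):
    """Placeholder analysis: report the strongest sample in one work unit."""
    return max(unit)


def run_volunteer_grid(samples, unit_size, volunteers):
    """Hand work units to volunteer nodes and merge what they return."""
    units = split_into_work_units(samples, unit_size)
    # Each worker thread plays the role of one volunteer computer.
    with ThreadPoolExecutor(max_workers=volunteers) as pool:
        results = list(pool.map(analyze, units))
    # The data center combines the per-unit results into one answer.
    return max(results)


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
units = split_into_work_units(signal, 3)
strongest = run_volunteer_grid(signal, unit_size=3, volunteers=4)
print(units, strongest)
```

Because each work unit is analyzed independently, it does not matter which node handles which packet or what OS that node runs, which is what makes this kind of workload a good fit for a loosely coupled grid.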

Clusters, on the other hand, use a single server or controller to manage, distribute, and aggregate processes across one or more client nodes. The nodes are connected via a tightly coupled environment such as a high-speed LAN or a specialized high-speed interconnect such as Myrinet. Unlike grid computing, where each client computer can run its own OS, a cluster is controlled and managed by a single OS running across its computers, making it highly homogeneous in nature. The server provides various files to clients for execution, and applications are run on the clients using parallel processing algorithms.

The clients are just dumb terminals, in most cases with no display or input devices connected to them. The server is the single interface for the entire system, where all input and output takes place. To the user, the entire setup appears as a single system. Clusters formed this way are commonly known as SSI, or Single System Image, clusters.


Beowulf clusters, which are built from commodity 'off the shelf' computer parts running free OSes like Linux, are an example of this kind of cluster. They provide very cost-effective parallel processing.

Grid in the Enterprise


Enterprises have their own complex applications and huge repositories of data, which also require high, if not mammoth (as is the case with scientific data), computational power to analyze. Not surprisingly, vendors like Sun Microsystems, Oracle, Fujitsu, and Informatica have started implementing grid-based solutions to tackle diverse issues. For example, Sun and Informatica provide grid-based solutions for the data-centric needs of organizations, including data integration on a grid. Using a grid for data-centric needs brings major advantages such as high availability, automatic recovery, adaptive load balancing (where the balancing adjusts to the situation at hand), and sessions on the grid. Similarly, Oracle's grid implementations cover a wide range of services for the enterprise.

The most interesting of these is the grid solution for SOA runtime governance and SOA infrastructure monitoring. This is interesting because, as we have said on record before, SOA implementations more often than not bring together a variety of systems, components, and applications under one roof. Implementing grid control for SOA runtime governance would make runtime recording of service requests, monitoring of complex process flows, and similar tasks easier and more manageable, thanks to the high-grade computational power that a grid provides. Beyond this, Oracle's grid solution also supports identity management and the otherwise cumbersome task of application server cluster deployment.

With the grid making steady progress into enterprises, primarily to smooth out the management or deployment of very large implementations, these technologies can surely address many more pain areas if they are carefully matured over time. After all, who would not want their processes, analytics, or even data needs to be free of limits on computational power or storage?

Emerging trends


Let's now consider some of the latest trends in this sphere.

Figure: Between Nov 97 and Feb 06, PrimeNet Grid has handled 11,579,649,914 P90 machine-hours. Its throughput rate can be characterized by a fitted, exponential trend line.

Although there is no full-fledged application available yet that can leverage the concept of a public, peer-to-peer grid, you can try an application called GPU (downloadable from http://gpu.sf.net). This application is still in alpha and can only run some test applications such as image rendering and Net crawling.

But imagine what will happen when this technology matures. Anyone with a machine and an Internet connection will be able to join a public Grid and share processing power the same way we share MP3s and other music files today. In that case, we will truly be able to achieve Internet computing, or rather, Internet supercomputing.

Grid management: You have probably heard of many types of grids and clusters, and read about them in PCQuest: heterogeneous grid platforms such as Condor and Globus, or simpler clustering middleware, whether SSI-based like OpenMosix or MPI-based like Oscar and Flash Mob. If you search the Net, you will find quite a few different kinds of grid products available. Some have a graphical front end to monitor the nodes, and some don't. Let's take a classic example, OpenMosix.

Figure: In a matter of 5 mins, we were able to connect to a P2P grid with 10 GB of RAM and 10 GFlops of processing power using GPU.

OpenMosix has a graphical monitoring application called OpenMosixView, but have you noticed how difficult monitoring becomes once the number of nodes grows to around a hundred? Plus, it only shows you the current RAM and CPU utilization of the nodes. What about disk usage? And what if you want to see what the CPU utilization was over the last hour or day?

These are things which are very difficult to monitor in large grids or clusters. To make things worse, let's say you have multiple grids, one based on Condor and another on Globus. Yet another could be just a cluster using Oscar or ROCKS with MPI support. And you want to monitor all of them from one place. What will you do then?

Consider a cluster or a grid with hundreds or thousands of nodes spread over a wide geographical area. Managing them all from one place can be really difficult, so this is one area that is picking up on the Grid technology front. The most common and popular tool that serves this purpose is Ganglia, which we covered in detail in our June 2006 issue. It is used by most of the big names working with Grid technologies, such as NASA, Cray, Sun, Boeing, the US Air Force, and Microsoft.
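The monitoring questions raised above (disk usage, utilization over the last hour, several grids in one view) come down to keeping a history of samples per node rather than only the current reading. The following Python sketch shows that idea in miniature. It is purely hypothetical and is not Ganglia's API or data model; the `MetricHistory` class and its methods are names made up for illustration, and timestamps are plain seconds.

```python
from collections import defaultdict, deque


class MetricHistory:
    """Keeps recent samples per (grid, node, metric) so past values can be queried."""

    def __init__(self, max_samples=1000):
        # Bounded deques stop the history from growing without limit.
        self._samples = defaultdict(lambda: deque(maxlen=max_samples))

    def record(self, grid, node, metric, timestamp, value):
        self._samples[(grid, node, metric)].append((timestamp, value))

    def average(self, grid, node, metric, since):
        """Average of the samples taken at or after `since`; None if there are none."""
        series = self._samples.get((grid, node, metric), ())
        values = [value for ts, value in series if ts >= since]
        return sum(values) / len(values) if values else None


history = MetricHistory()
# Nodes from two different grids report into the same store.
history.record("condor", "node-1", "cpu", 0, 20.0)
history.record("condor", "node-1", "cpu", 1800, 40.0)
history.record("condor", "node-1", "cpu", 3600, 60.0)
history.record("globus", "node-9", "disk", 3600, 75.0)

last_half_hour_cpu = history.average("condor", "node-1", "cpu", since=1800)
print(last_half_hour_cpu)
```

Keying every sample by grid as well as node is what lets one store answer questions about a Condor grid and a Globus grid from the same place.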

Glossary

Cluster Interconnect: A very high speed connection allowing computers in a cluster to interconnect.

Enterprise Grid Alliance: A vendor-neutral, open and independent organization that works as a consortium for focusing on obstacles enterprises face in grid implementations, and promoting open and interoperable solutions for problems.

Enterprise Grid: A collection of networked components including systems, applications like CRM, ERP etc. usually managed by a distinct business entity providing a set of services and assignment of resources to these services for accomplishing business goals.

N1: Sun's architecture for the next-generation data center, which makes the entire data center work as one single, unified system. It reduces management overhead and costs, and increases data-center resource utilization, infrastructure responsiveness, and agility.

Utility Computing: A 'pay-as-you-go' model of computing, analogous to electricity usage. Instead of paying for enough computing resources to handle peak load all the time, you pay only for the computing you use.

Utility Data-Center: An infrastructure solution proposed by HP that allows virtualization of computing resources for the data center. The Utility Data Center includes servers, storage, and networking products that are integrated and deployed by intelligent management software that allows them to be shared and dynamically re-provisioned to accommodate changing workloads.
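The arithmetic behind the 'Utility Computing' entry above can be made concrete with a short Python sketch. The load figures and the rate are invented for illustration, and the comparison assumes, for simplicity, that owned and rented capacity cost the same per CPU-hour.

```python
def peak_provisioned_cost(hourly_loads, rate_per_cpu_hour):
    """Cost of keeping enough capacity for the peak load running every hour."""
    return max(hourly_loads) * len(hourly_loads) * rate_per_cpu_hour


def pay_as_you_go_cost(hourly_loads, rate_per_cpu_hour):
    """Cost of paying only for the CPU-hours actually consumed."""
    return sum(hourly_loads) * rate_per_cpu_hour


# Hypothetical workload: CPUs needed in each of four hours, with one spike.
loads = [10, 10, 100, 10]
rate = 0.5  # assumed price per CPU-hour

fixed = peak_provisioned_cost(loads, rate)   # 100 CPUs for 4 hours
metered = pay_as_you_go_cost(loads, rate)    # 130 CPU-hours in total
print(fixed, metered)
```

With these made-up numbers, provisioning for the peak costs 200.0 against 65.0 for metered use. The gap narrows as the load becomes steadier, and it disappears when the load is constant.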
