BANGALORE, INDIA: Enterprises implementing grid computing have reported dramatic improvements in performance and scalability for certain categories of applications.
However, the extent to which an application's performance will improve once it is migrated to a grid is not apparent until the code has actually been rewritten. A systematic prior analysis of the benefits of deploying a legacy application on a grid can therefore be very useful. This note examines the issue and describes a framework for carrying out such an analysis.
Introduction
Grid computing is evolving into a very useful technology in several domains. Several tools and methods exist for building new applications that run directly on a grid. However, a systematic mechanism for migrating older (legacy) applications, which were developed for single-processor systems, to run on a cluster or grid is not yet available.
Before applying any such mechanism, one needs to analyze whether the performance, and hence the business value, gained by grid-enabling the application is worth the investment. Enterprise applications typically consist of interconnected components or modules of code and data that work in coordination with each other.
When the application is migrated to a grid environment, one must analyze whether the interconnected components can be separated and distributed across the nodes of the grid. In this note, we present a framework for systematically analyzing legacy code modules for gridizability and performance.
The framework takes legacy code as input and outputs suggested modules that could be refactored for a grid environment. We call it the Grid Application Migration Framework (GAMF).
Grid Application Migration Framework
The framework consists of four parts: (i) Grid Code Analyzer (GCA), (ii) DAG Reducer, (iii) Cluster Generator, and (iv) Grid Simulator.
GCA works much like a compiler: it generates intermediate code that can be represented as a Directed Acyclic Graph (DAG) (Note 1). The architectural details of GCA can be seen in Note 2.
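To make the DAG representation concrete, here is a minimal Python sketch of a small task graph. It is only an illustration: the task names and the helper functions are hypothetical and do not reflect GCA's actual intermediate format, which this note does not specify. The sketch shows two things a DAG makes easy to compute: a valid execution order, and pairs of tasks with no dependency path between them, which are natural candidates for placement on different grid nodes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task graph: each key is a task, each value is the set of
# tasks whose results it needs (its direct predecessors).
task_deps = {
    "load": set(),
    "parse": {"load"},
    "validate": {"parse"},
    "price": {"parse"},
    "report": {"validate", "price"},
}


def execution_order(deps):
    """Return one valid execution order (raises CycleError if deps is cyclic)."""
    return list(TopologicalSorter(deps).static_order())


def ancestors(deps, task):
    """Return every task that must finish, directly or indirectly, before `task`."""
    seen = set()
    stack = list(deps[task])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(deps[current])
    return seen


def independent_pairs(deps):
    """Return pairs of tasks with no dependency path in either direction."""
    tasks = sorted(deps)
    anc = {t: ancestors(deps, t) for t in tasks}
    return [
        (a, b)
        for i, a in enumerate(tasks)
        for b in tasks[i + 1:]
        if a not in anc[b] and b not in anc[a]
    ]


order = execution_order(task_deps)
parallel_candidates = independent_pairs(task_deps)
```

In this toy graph only `price` and `validate` are mutually independent, so they are the only two tasks that could run concurrently on separate nodes.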
Because the DAG produced by GCA contains a very large number of nodes, analyzing it further in that form is computationally infeasible. To keep the task graph manageable, we propose a novel DAG reduction algorithm that reduces the total number of nodes in the DAG (Note 3).
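The reduction algorithm itself is described in Note 3 and is not reproduced here. As a rough illustration of what DAG reduction can mean, the sketch below applies a generic and well-known simplification, merging linear chains: a node is folded into its predecessor when that predecessor is its only parent and it is that predecessor's only child. This is an assumed stand-in, not the GAMF algorithm, and the function and task names are hypothetical.

```python
def merge_linear_chains(deps):
    """Collapse linear chains in a DAG given as task -> set of predecessors.

    Returns a new graph whose nodes are tuples of the original tasks that
    were merged together, mapped to their sets of predecessor nodes.
    """
    preds = {t: set(p) for t, p in deps.items()}
    succs = {t: set() for t in preds}
    for task, parents in preds.items():
        for parent in parents:
            succs[parent].add(task)
    members = {t: [t] for t in preds}  # original tasks held by each node

    changed = True
    while changed:
        changed = False
        for v in list(preds):
            if len(preds[v]) != 1:
                continue
            (u,) = preds[v]
            if len(succs[u]) != 1:
                continue
            # u -> v is a pure chain link: absorb v into u.
            members[u].extend(members.pop(v))
            succs[u] = succs.pop(v)
            for w in succs[u]:
                preds[w].discard(v)
                preds[w].add(u)
            del preds[v]
            changed = True
            break  # graph changed, so rescan from the start

    return {
        tuple(members[t]): {tuple(members[p]) for p in preds[t]}
        for t in preds
    }


# Hypothetical six-task graph with a three-task chain at the front.
pipeline = {
    "read": set(),
    "clean": {"read"},
    "transform": {"clean"},
    "sum": {"transform"},
    "avg": {"transform"},
    "write": {"sum", "avg"},
}
reduced = merge_linear_chains(pipeline)
```

Here the chain `read`, `clean`, `transform` collapses into a single node, reducing the graph from six nodes to four. Chain merging never removes available parallelism, because tasks in a chain must run sequentially anyway; `sum` and `avg` remain separate nodes that can still be scheduled on different grid nodes.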