Scaling Applications across Multiple JVMs

CIOL Bureau

28 Mar 2007 00:00 IST

Updated On 28 Mar 2007 22:26 IST

Anandi Misra

Clusters and grids remain the inevitable choice when you have to tackle scalability, high availability, high performance or fail-over. Java applications have not been left out, but clustering and grid support there is mostly API-based. Thus, you can cluster application servers, but not the objects or, more importantly, the state of those objects, when you expand your application over a cluster or grid. In this series, we look at a clustering solution called Terracotta that does clustering at the JVM level.

Terracotta allows clustering without changing the existing code of your application. It uses a declarative approach through which you configure your application for JVM-level clustering. This part looks at Terracotta DSO (Distributed Shared Objects), the core technology for clustering JVMs, and the basic concepts related to clustering.

Direct Hit!

Applies To: Java developers
USP: Using Terracotta for distributing applications across multiple JVMs
Primary Link: www.terracotta-tech.com
Google Keywords: JVM clustering, POJO grid

Terracotta basics
We talked about clustering of application servers a while ago and were not entirely full of praise for it. Let us examine why. We used a commercially available application server and created a cluster of 10 workstations, each running its own JVM, to deploy a Web app and test how efficient this clustering is compared with running the application in a single application server on a server machine. We could configure which modules of the sample application were installed on a part of the cluster or throughout it, or even choose a few nodes for installing the application and assign fail-over nodes. So far so good, but there are a few differences between this kind of clustering and the one Terracotta provides.

For one, you have multiple application servers running as one unit, but is there any built-in mechanism for state management or concurrency? By built-in, we mean whether these things can be managed without layering additional APIs onto your application. Application server clustering doesn't necessarily ensure that an object's state in, say, the heap of one node is known globally. If that is needed, developers have to take care of it. Two, it is certainly not a cluster of the objects in our application. As for overheads, you would certainly know the dreaded 'S' (Serialization) that comes into play in Java EE clustering.

This is where Terracotta is different. Clustering at the JVM level means that all the JVMs act as a single unit. This takes a lot of burden off the developer: once the JVMs behave as one global JVM, any change in the heap of a particular JVM is replicated throughout the network. In case network overhead is the next issue on your mind, that need not be a bottleneck, because a change to any field of a globally shared object requires replication of that field only. We will show a little later in the series that Terracotta does exactly that. So, instead of a cluster of application servers with their own heaps invisible to each other, we have a cluster of JVMs that presents a global virtual heap. A lock becomes a cluster-wide lock; a shared object's update becomes a cluster-wide update. Most importantly, you don't have to explicitly make such things visible throughout the cluster.
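To get a feel for why replicating only the changed field matters, here is a rough single-JVM sketch that compares the bytes needed to ship a whole object through standard Java serialization with the bytes needed to ship one changed field. It does not use or imitate Terracotta's wire format; `Cart`, `ReplicationCostDemo` and the "field name plus value" encoding are assumptions made purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical illustration only: none of these types are Terracotta APIs.
class Cart implements Serializable {
    private static final long serialVersionUID = 1L;
    String owner = "demo-user";
    String[] items = new String[200]; // bulky state that rarely changes
    int checkoutStep = 0;             // small field that changes often

    Cart() {
        for (int i = 0; i < items.length; i++) {
            items[i] = "item-" + i;
        }
    }
}

class ReplicationCostDemo {
    // Bytes needed to ship the whole object with standard Java serialization.
    static int wholeObjectBytes(Serializable obj) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }

    // Bytes needed to ship one changed int field as (field name, new value).
    // This encoding is an assumption for the sketch, not Terracotta's protocol.
    static int fieldDeltaBytes(String fieldName, int newValue) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(fieldName);
            out.writeInt(newValue);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }

    public static void main(String[] args) {
        Cart cart = new Cart();
        cart.checkoutStep = 1; // the only change since the last sync
        System.out.println("whole object: " + wholeObjectBytes(cart) + " bytes");
        System.out.println("field delta:  "
                + fieldDeltaBytes("checkoutStep", cart.checkoutStep) + " bytes");
    }
}
```

The point of the comparison is only the order of magnitude: the cost of a whole-object copy grows with the object, while the cost of a field-level update stays roughly constant.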

Distributed Shared Objects
Distributed Shared Objects (DSO) is the core technology enabling clustering services at the JVM level. The best part is that you don't need to import any DSO libraries to cluster your application. DSO works at the byte-code level. When you declaratively configure your application for clustering, the relevant changes are applied at the JVM; Terracotta calls these changes hooks. The hooks keep track of field updates in the instantiated objects that have been declared as 'Shared Roots'. The changes are then sent to the Terracotta Server, which replicates them across VMs. Configuring the application means declaring which objects are to be shared, or replicated, across VMs, and which classes to instrument, i.e. those classes whose instances are declared as shared roots or that have distributed locks or distributed methods.
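As a minimal sketch of what "no DSO libraries in your code" looks like in practice, the class below is ordinary Java with standard synchronization and nothing else. All names (`ChatRoom`, `ChatApp.room`) are hypothetical; in a DSO setup, a field such as `ChatApp.room` is the kind of thing you would name as a shared root in the configuration, and `ChatRoom` is the kind of class you would mark for instrumentation. Run as is, it simply executes in one JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Plain Java with no clustering imports. All names here are hypothetical.
class ChatRoom {
    private final List<String> messages = new ArrayList<String>();

    // Ordinary monitor lock; a DSO configuration could declare it cluster-wide.
    synchronized void post(String message) {
        messages.add(message);
    }

    synchronized int size() {
        return messages.size();
    }
}

class ChatApp {
    // Candidate shared root: the configuration file, not this code, would name it.
    static ChatRoom room = new ChatRoom();
}

class ChatRoomDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable poster = new Runnable() {
            public void run() {
                for (int i = 0; i < 1000; i++) {
                    ChatApp.room.post("hello " + i);
                }
            }
        };
        Thread a = new Thread(poster);
        Thread b = new Thread(poster);
        a.start();
        b.start();
        a.join();
        b.join();
        // Both threads went through the same lock, so no update is lost.
        System.out.println(ChatApp.room.size()); // prints 2000
    }
}
```

The design point is that the code is already correct for multiple threads in one JVM; the clustering decision (what is a root, which locks are distributed) lives outside the source.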

Throughout all this, you only have to create the appropriate configuration files to enable your application for DSO. So, the critical phase of the entire exercise is working out what has to be declared as shared and distributed, where locks are to be applied, and which kinds of locks to apply.

Clustering with Terracotta
Let us look at the procedure for enabling an application for Terracotta. As we said earlier, you do not need to import any libraries into your code for this. Rather, you need to work on generating the 'Terracotta-Config' file, which describes the clustering for your application. First, you have to determine which objects are to be shared. This largely depends on the characteristics of your application. A general rule that works as a starting point is to look for objects that represent state information and form the core of your application's logic. Be careful not to include objects that appear to be shared but, on closer scrutiny, are actually tied to a particular JVM (for example sockets, connections, etc). You then mark fields in these objects to be replicated across the cluster. Terracotta follows a distinct approach when it comes to initialization of shared objects. A shared root is initialized only once within the cluster: the very first non-null reference assigned to it becomes the cluster's value, and any subsequent assignment is ignored. Terracotta implements this semantic to prevent application code from changing the root once it is established in the cluster.
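The "first non-null assignment wins" rule for shared roots described above can be mimicked in a single JVM. The sketch below is only an analogy built on `java.util.concurrent.atomic.AtomicReference`; `RootSlot` is a hypothetical helper, not a Terracotta class, and real roots are ordinary fields declared in the configuration rather than wrapped objects.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a shared root; not a Terracotta API.
class RootSlot<T> {
    private final AtomicReference<T> value = new AtomicReference<T>();

    // Tries to assign the root and returns whatever the slot holds afterwards.
    // Only the first non-null candidate is kept; later ones are ignored.
    T assign(T candidate) {
        if (candidate != null) {
            value.compareAndSet(null, candidate);
        }
        return value.get();
    }

    T get() {
        return value.get();
    }
}

class RootSlotDemo {
    public static void main(String[] args) {
        RootSlot<String> root = new RootSlot<String>();
        System.out.println(root.assign(null));        // prints null: nothing set yet
        System.out.println(root.assign("from-jvm-1")); // prints from-jvm-1
        System.out.println(root.assign("from-jvm-2")); // prints from-jvm-1: ignored
    }
}
```

A practical consequence of this rule is that every node can run the same initialization code, such as `root = new Something()`, without the later nodes overwriting the state the first node established.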

This is why we said earlier that the exercise is about working out what has to be shared, rather than tearing apart your existing modules or adding code to the existing application. While shared objects are integral to clustering, so are locks, as you need to maintain the integrity of data members throughout the cluster. These locks are termed 'Distributed Locks' in Terracotta lingo. They help keep the values of data members synchronized cluster-wide. Locks can also be 'Named', so that all methods associated with a given lock name use the same lock. These are of help with methods written without any thread-safety considerations.
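The idea of a named lock guarding methods that were written without thread safety can be sketched in plain Java as follows. This is a single-JVM analogy under stated assumptions: `NamedLocks`, `Inventory` and the lock name `inventoryLock` are hypothetical, and in Terracotta the association between methods and a lock name is made in the configuration rather than by calling a helper like `guarded`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical registry: one lock object per name, shared by all callers.
class NamedLocks {
    private static final ConcurrentMap<String, ReentrantLock> LOCKS =
            new ConcurrentHashMap<String, ReentrantLock>();

    static ReentrantLock byName(String name) {
        return LOCKS.computeIfAbsent(name, k -> new ReentrantLock());
    }

    // Runs the action while holding the lock registered under lockName.
    static void guarded(String lockName, Runnable action) {
        ReentrantLock lock = byName(lockName);
        lock.lock();
        try {
            action.run();
        } finally {
            lock.unlock();
        }
    }
}

// Written with no thread-safety considerations at all.
class Inventory {
    private int stock = 10;

    void reserve(int n) { stock -= n; }
    void restock(int n) { stock += n; }
    int stock() { return stock; }
}

class NamedLockDemo {
    public static void main(String[] args) throws InterruptedException {
        final Inventory inv = new Inventory();
        Thread taker = new Thread(() -> {
            for (int i = 0; i < 10000; i++) {
                NamedLocks.guarded("inventoryLock", () -> inv.reserve(1));
            }
        });
        Thread giver = new Thread(() -> {
            for (int i = 0; i < 10000; i++) {
                NamedLocks.guarded("inventoryLock", () -> inv.restock(1));
            }
        });
        taker.start();
        giver.start();
        taker.join();
        giver.join();
        // Both methods ran under the same named lock, so the counts balance.
        System.out.println(inv.stock()); // prints 10
    }
}
```

Because both methods share one lock, this approach is coarse: it trades concurrency for safety, which is why it suits legacy methods better than code that already synchronizes carefully.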

In conclusion
So far, we have outlined the basic concepts of Terracotta DSO and clustering. In the upcoming parts of the series, we will look at basic examples, such as how to configure an application for DSO and, later on, how to enable your Spring objects for Terracotta. In fact, the product documentation lists a few simple examples, such as the Slider Application, which you can refer to for a good idea of how to use the Terracotta Eclipse Plugin and how to configure an application for 'Terracotting' it.

Source: PCQuest
