
Smart object management in Java

CIOL Bureau

Excessive object creation can be a huge problem in Java programs. Despite continuing improvements in JVM technology for memory management, the object creation and garbage collection cycle is still a very expensive operation. If frequently used code creates large numbers of objects, performance can slow to a crawl as the JVM churns away, repeatedly creating and then garbage collecting those objects. By understanding how to reduce the volume of object creation in your code, you can put an end to object churn and maximize the useful work you get out of the JVM.



Objects are a powerful software engineering construct, and Java uses them extensively. In fact, it encourages the use of objects so much that developers sometimes forget the cost behind the construct. The result can be object churn, a program state in which most of your processor time is soaked up by repeatedly creating and then garbage collecting objects.



Java Memory Management



Java's simplified memory management is one of the key features that appeal to developers with backgrounds in languages such as C/C++. In contrast with the explicit de-allocation required by C/C++, Java lets you allocate objects as necessary and trust that they'll be reclaimed and recycled by the JVM when they're no longer needed. The work required to make this happen goes on behind the scenes, in the garbage collection process.



Garbage collection has been used for memory management in programming languages since the earliest Lisp systems of the late 1950s. The basic principle of garbage collection is the same in all cases: identify objects that are no longer in use by the program, and recycle the memory used by these objects to create new ones.



JVMs generally use a reachability analysis to identify the objects that are in use, then recycle all the other objects. This starts with a base set of variables that the program uses directly, including object references in local or argument variables on the method call stack of every active thread, and in static variables of loaded classes. Each object that one of these variables references is added to the set of reachable objects. Next, each object that a member variable references in one of these objects is also added to the reachable set. This process continues until closure is obtained; in the end, every object referenced by any object in the reachable set is also in the reachable set. Any objects that are not in the reachable set are by definition not in use by the program, so they can safely be recycled.
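The closure computation can be pictured as a simple worklist traversal over the object graph. The sketch below is purely illustrative: the ObjectNode type and its references list are inventions for this example (a real collector walks raw heap structures, not Java-level objects), but the loop shows the same idea of growing the reachable set until nothing new is found.

import java.util.*;

// Illustrative sketch of the reachability closure described above. ObjectNode
// is a hypothetical stand-in for a heap object; a real collector works on raw
// heap structures rather than Java-level objects like this.
class ObjectNode {
    List<ObjectNode> references = new ArrayList<ObjectNode>();
}

class ReachabilitySketch {
    // Returns the set of nodes reachable from the given roots (the "base set"
    // of object references in stack frames and static variables).
    static Set<ObjectNode> reachableFrom(Collection<ObjectNode> roots) {
        Set<ObjectNode> reachable = new HashSet<ObjectNode>(roots);
        Deque<ObjectNode> worklist = new ArrayDeque<ObjectNode>(roots);
        while (!worklist.isEmpty()) {
            ObjectNode current = worklist.pop();
            for (ObjectNode referenced : current.references) {
                // Add each referenced object the first time it's seen; when
                // the worklist empties, closure has been reached.
                if (reachable.add(referenced)) {
                    worklist.push(referenced);
                }
            }
        }
        return reachable;   // anything outside this set can safely be recycled
    }
}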



The developer generally doesn't need to be directly involved in this garbage collection process. Objects drop out of the reachable set and become eligible for recycling as they're replaced with other objects, or as methods return and their variables are dropped from the calling thread's stack. The JVM runs garbage collection periodically, either when it can, because the program threads are waiting for some external event, or when it needs to, because it's run out of memory for creating new objects. Despite the automatic nature of the process, it's important to understand that it's going on, because it can be a significant part of the overhead of Java programs.
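As a concrete illustration of when objects become eligible, consider the hypothetical fragment below; the class, method, and variable names are invented for this example.

// Hypothetical fragment showing when objects become collectible.
public class ReachabilityExample {
    private static StringBuffer cached = new StringBuffer("original contents");

    static String buildReport() {
        StringBuffer report = new StringBuffer();   // reachable through a local variable
        report.append("report body");
        return report.toString();   // the returned String stays reachable in the caller;
                                    // the StringBuffer becomes eligible once this method returns
    }

    public static void main(String[] args) {
        String result = buildReport();
        cached = new StringBuffer(result);   // the buffer created in the initializer above
                                             // is now unreferenced and eligible for collection
        System.out.println(cached);
    }
}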



Besides the time overhead of garbage collection, there's also a significant space overhead for objects in Java. The JVM adds internal information to each allocated object to help in the garbage collection process. It also adds other information required by the Java language definition, which is needed in order to implement such features as the ability to synchronize on any object. When the storage used internally by the JVM for each object is included in the size of the object, small objects may be substantially larger than their C/C++ counterparts. Table 1 shows the user-accessible content size and actual object memory size measurements for several simple objects on various JVMs, illustrating the memory overhead added by the JVMs.



                                   Content (bytes)   JRE 1.1.8 (Sun)   JRE 1.1.8 (IBM)   JRE 1.2.2 (Classic)   JRE 1.2.2 (HotSpot 2.0 beta)
java.lang.Object                   0                 26                31                28                    18
java.lang.Integer                  4                 26                31                28                    26
int[0]                             4 (length)        26                31                28                    26
java.lang.String (4 characters)    8 + 4             58                63                60                    58

Table 1. Object memory sizes (bytes)

This space overhead is a per-object value, so the percentage of overhead decreases with larger objects. It can lead to some unpleasant surprises when you're working with large numbers of small objects, though: a program juggling a million Integers will bring most systems to their knees, for example!
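If you want a rough feel for this overhead on your own JVM, you can compare used heap space before and after allocating a large batch of small objects. The sketch below is only an approximation (System.gc() is just a request, and results vary by JVM and heap state), but it should produce figures in the same general range as Table 1.

// Rough per-object size estimate: allocate a batch of small objects and divide
// the growth in used heap space by the number allocated. The count and the use
// of Integer are arbitrary choices for this example.
public class ObjectSizeEstimate {
    public static void main(String[] args) {
        final int COUNT = 1000000;
        Integer[] holder = new Integer[COUNT];   // keeps the objects reachable

        Runtime runtime = Runtime.getRuntime();
        runtime.gc();   // only a request, but it usually tidies the heap enough
        long before = runtime.totalMemory() - runtime.freeMemory();

        for (int i = 0; i < COUNT; i++) {
            holder[i] = new Integer(i);   // force a distinct instance each time
        }

        runtime.gc();
        long after = runtime.totalMemory() - runtime.freeMemory();
        System.out.println("approximate bytes per Integer: "
                + (after - before) / (double) COUNT);
        System.out.println("objects held: " + holder.length);   // keep the array live
    }
}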



Comparison with C/C++



For most operations, Java performance is now within a few percent of C/C++. The just-in-time (JIT) compilers included with most JVMs convert Java bytecodes to native code with amazing efficiency, and the latest generation (represented by IBM's JVM and Sun's HotSpot) is showing the potential to start beating C/C++ performance for computational (CPU-intensive) applications.



However, Java performance can suffer by comparison with C/C++ when many objects are being created and discarded. This is due to several factors, including initialization time for the added overhead information, garbage collection time, and structural differences between the languages. Table 2 shows the impact these factors can have on program performance, comparing C/C++ and Java versions of code repeatedly allocating and freeing arrays of byte values.
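The exact benchmark code isn't reproduced here, but the Java side of such a test might look something like the sketch below; the block count and sizes are arbitrary choices for illustration, not the ones used to generate Table 2.

// Sketch of a short-term allocation test: repeatedly allocate byte arrays and
// immediately let them become garbage. The block count and sizes are arbitrary.
public class AllocationBenchmark {
    public static void main(String[] args) {
        final int BLOCKS = 1000000;
        long start = System.currentTimeMillis();
        long total = 0;
        for (int i = 0; i < BLOCKS; i++) {
            byte[] block = new byte[64 + (i % 64)];   // varying small block sizes
            total += block.length;   // touch the block so the work isn't optimized away
            // block goes out of scope here; the JVM reclaims it during garbage
            // collection, where the C/C++ version calls free() or delete explicitly
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("allocated " + total + " bytes in " + elapsed + " ms");
    }
}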



                                                JRE 1.1.8 (Sun)   JRE 1.1.8 (IBM)   JRE 1.2.2 (Classic)   JRE 1.2.2 (HotSpot 2.0 beta)   C++
Short-term Allocations (7.5 M blocks, 331 MB)   30                22                26                    14                             9
Long-term Allocations (7.6 M blocks, 334 MB)    48                28                39                    33                             13

Table 2. Memory management performance (time in seconds)



For both short- and long-term allocations, the C++ program is considerably faster than the Java program running on any JVM. Short-term allocations have been one focus area for optimization in HotSpot, and the results show it: with the Server 2.0 beta used in this test, HotSpot comes closer to the C++ code than any other JVM, taking about 50 percent longer. For long-term allocations, the IBM JVM gives better performance than the HotSpot JVM, but both trail far behind the C++ code for this type of operation.



Even the relatively good performance of HotSpot on short-term allocations is not necessarily cause for joy. In general, C++ programs tend to allocate short-lived objects on the stack, which carries even lower overhead than the explicit heap allocation and deallocation used in this test. C++ also has a big advantage in the way it allocates composite objects, using a single block of memory for the combined entity; in Java, each object has to be allocated as its own separate block.
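To make that structural difference concrete, consider the hypothetical classes below: constructing one Particle requires three separate heap allocations in Java, where a C++ struct holding its two coordinate members by value would occupy a single contiguous block.

// Hypothetical composite type: creating one Particle takes three separate heap
// allocations in Java (the Particle plus two Coordinate objects), each with its
// own per-object overhead. A C++ struct holding two coordinate structs by value
// would be laid out as a single contiguous block of memory.
class Coordinate {
    double x, y, z;
}

class Particle {
    Coordinate position = new Coordinate();
    Coordinate velocity = new Coordinate();
}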



We'll certainly see more performance improvements for object allocation as vendors continue to work on their JVMs. Given the advantages described above, though, it seems unlikely that Java will ever match C++ in this area.



Does this mean your Java programs are eternally doomed to sluggish performance? Not at all: object creation and recycling are just one aspect of program performance, and provided you're sensible about creating objects in heavily used code, it's easy to avoid the object churn cycle!
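As a small illustration of the kind of change that keeps churn out of heavily used code, the sketch below (with invented names) reuses a single StringBuffer across loop iterations instead of letting a new buffer and intermediate String be created on every pass.

// Invented example: formatting many records in a heavily used loop.
public class ReuseExample {
    // Churn-heavy version: the concatenation creates a new String (and
    // typically a temporary buffer) on every iteration.
    static void printAllChurny(String[] records) {
        for (int i = 0; i < records.length; i++) {
            String line = "record " + i + ": " + records[i];
            System.out.println(line);
        }
    }

    // Reuse version: one StringBuffer serves every iteration (println still
    // builds a String for output, but the per-iteration buffers are gone).
    static void printAllReused(String[] records) {
        StringBuffer line = new StringBuffer();
        for (int i = 0; i < records.length; i++) {
            line.setLength(0);   // clear the buffer without reallocating it
            line.append("record ").append(i).append(": ").append(records[i]);
            System.out.println(line);
        }
    }
}

The reused buffer grows only until it can hold the longest line; after that, the only remaining allocation per iteration is the String that println needs for output.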




