Wednesday, April 27, 2016

Phantom References in Java

The Java SE platform provides the following types of references between objects:
  1. Strong References - the ordinary, well-known references. As long as there is a strong reference to an object, the garbage collector cannot reclaim the object.

  2. Soft References - they are created with the constructor new SoftReference<T>(T obj, ReferenceQueue<? super T> queue) or new SoftReference<T>(T obj). If there is only a soft reference to an object, the garbage collector reclaims the object when the application starts running low on free memory.

  3. Weak References - they are created with the constructor new WeakReference<T>(T obj, ReferenceQueue<? super T> queue) or new WeakReference<T>(T obj). If there is only a weak reference to an object, the garbage collector is free to reclaim the object on its next iteration.

  4. Phantom References - they are created with the constructor new PhantomReference<T>(T obj, ReferenceQueue<? super T> queue). If there is only a phantom reference to an object, the garbage collector will try to reclaim the object on its next iteration, but the object is not removed from memory as long as the phantom reference exists and has not been cleared with the clear() method. (This describes Java 8 and earlier; since Java 9 a phantom reference is cleared automatically when it is enqueued.) Also, the get() method of a phantom reference always returns null.
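
To make the difference between the four kinds concrete, here is a minimal sketch that creates a soft, a weak and a phantom reference to the same object while a strong reference still exists. The class name ReferenceKinds and the method describe() are made up for this illustration; only the java.lang.ref classes are real API.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of the JDK.
class ReferenceKinds {

    // Reports what each reference kind exposes while 'payload' is still strongly reachable.
    static String describe(Object payload) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        SoftReference<Object> soft = new SoftReference<>(payload);
        WeakReference<Object> weak = new WeakReference<>(payload);
        PhantomReference<Object> phantom = new PhantomReference<>(payload, queue);

        // Nothing is enqueued while the referent is strongly reachable.
        boolean enqueued = queue.poll() != null;

        return "soft=" + (soft.get() == payload)        // soft.get() returns the referent
                + " weak=" + (weak.get() == payload)     // weak.get() returns the referent
                + " phantom=" + phantom.get()            // phantom.get() is always null
                + " enqueued=" + enqueued;
    }

    public static void main(String[] args) {
        Object payload = new Object(); // the strong reference
        System.out.println(describe(payload));
        // prints: soft=true weak=true phantom=null enqueued=false
    }
}
```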

The natural question is: why have a kind of reference that keeps an object in memory but can never be used to access that object?

What are the problems with the finalize() method?

The garbage collector reclaims the memory in two phases:

  1. The finalize() method of the object is invoked.

  2. The memory previously allocated for the object is reclaimed.

Both of these actions are driven by the garbage collector (in HotSpot the finalize() call itself runs on a dedicated finalizer thread), and the behavior of the garbage collector depends on the finalize() method. If the method is overridden, the object is only marked on the first garbage collector iteration, then the finalize() method is executed, and the memory allocated for the object is reclaimed only on a later garbage collector iteration.

The following plot demonstrates the garbage collector behavior for a class without the finalize() method:

The behavior for a class that overrides the finalize() method:

It is possible to create a new strong reference to an object while its finalize() method is executing. For example, suppose an object is serialized to a file during finalization. The serializer gets a reference to the object, does its work, but keeps holding the reference. As a result, the garbage collector cannot reclaim the memory on the iterations that follow the finalize() invocation.
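
The effect can be sketched in a few lines. This is an illustration under assumptions, not code from a real serializer: the class LeakyImage and the static list RETAINED are hypothetical stand-ins for an object and for the code that keeps the reference handed to it. Note that finalize() is deprecated since Java 9, and System.gc() is only a request, so the printed count is not guaranteed.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical class that "resurrects" itself during finalization.
class LeakyImage {
    // Stand-in for a serializer or registry that keeps every reference it receives.
    static final List<LeakyImage> RETAINED = new CopyOnWriteArrayList<>();

    final byte[] pixels = new byte[1024];

    @Override
    @SuppressWarnings({"deprecation", "removal"}) // finalize() is deprecated since Java 9
    protected void finalize() {
        // Publishing 'this' makes the object strongly reachable again,
        // so its memory cannot be reclaimed after finalization.
        RETAINED.add(this);
    }

    public static void main(String[] args) throws InterruptedException {
        new LeakyImage(); // unreachable right after construction

        // Ask for a collection a few times; finalization timing is up to the JVM.
        for (int i = 0; i < 50 && RETAINED.isEmpty(); i++) {
            System.gc();
            Thread.sleep(20);
        }
        System.out.println("objects kept alive by finalize(): " + RETAINED.size());
    }
}
```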

This is critical for applications that handle large objects in memory-constrained environments, for example photo-processing applications running on Android-based mobile phones. A new photo can be loaded only after the previous one has been removed from memory. The finalize() method notifies the system about the unloading, but because of the problem described above the object remains in memory. An OutOfMemoryError is looming on the horizon.

How PhantomReference works

If there is only a phantom reference to an object, the garbage collector takes the following two actions:

  1. Invokes the finalize() method.

  2. If there are still no strong references to the object, the phantom reference is put into its ReferenceQueue.

Note that if the finalize() method is overridden, the two actions above are performed on different garbage collector iterations. As mentioned in the introduction, the object remains in memory until the phantom reference is cleared. After the clear() call, the object can be reclaimed on the next garbage collector iteration. So, if the finalize() method isn't overridden for the object, two iterations are needed to collect it, and three iterations otherwise.
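
The enqueueing step can be observed with a small experiment. The sketch below is only an illustration: PhantomEnqueueDemo and enqueuedAfterGc() are invented names, the referent is a plain byte array without a finalize() method, and System.gc() is just a hint to the JVM, so the result depends on the runtime.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical demo class, not part of the JDK.
class PhantomEnqueueDemo {

    // Returns true if the phantom reference appeared in its queue
    // within the given number of garbage collection requests.
    static boolean enqueuedAfterGc(int attempts) {
        ReferenceQueue<byte[]> queue = new ReferenceQueue<>();
        byte[] data = new byte[1024 * 1024];
        PhantomReference<byte[]> phantom = new PhantomReference<>(data, queue);
        data = null; // drop the only strong reference

        try {
            for (int i = 0; i < attempts; i++) {
                System.gc(); // a request only, the JVM may ignore it
                Reference<? extends byte[]> ref = queue.remove(100); // wait up to 100 ms
                if (ref != null) {
                    ref.clear(); // needed on Java 8 and earlier, harmless later
                    return ref == phantom;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("phantom reference enqueued: " + enqueuedAfterGc(50));
    }
}
```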

Using PhantomReference

A good way to be notified that a phantom reference has reached the queue is to create a dedicated thread that polls the queue with the ReferenceQueue#poll() method. The method returns a reference from the queue if one is there, or null otherwise. The best place for the clean-up code is a sub-class of PhantomReference. This sub-class contains a method that cleans up the environment after the memory has been reclaimed, that is, the actions that were performed in the finalize() method before. These actions can be connection closing, object state flushing, session invalidation and so on. One point is very important: some of these actions need access to the internal state of the object. It is definitely wrong to keep a reference to the object as a field in the sub-class, because that reference would be a strong reference, and the object would never become eligible for garbage collection. The sub-class of PhantomReference should keep only the fields that take part in the clean-up action. Let me summarize the strategy:

  1. Create a sub-class of the PhantomReference class. The class encapsulates the environment clean-up method.

  2. Create a thread and pass to its constructor the ReferenceQueue bound to an instance of the class created in the first step. The run() method polls the queue. When a non-null value is obtained from the ReferenceQueue, the corresponding method of the PhantomReference sub-class is executed. That method encapsulates the environment clean-up logic.

  3. The constructor of the newly created sub-class copies the necessary fields of the object passed to it into the fields of the class and then starts the thread created in the previous step.
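
The three steps above can be sketched as follows. Everything here is an assumption made for illustration: Photo, PhotoCleanupReference, CleanupThread and PhantomCleanupDemo are hypothetical names, and recording the file name in a set stands in for real clean-up such as deleting a file or closing a connection. The sketch also waits with the blocking ReferenceQueue#remove() instead of looping over poll(), and it starts a single shared thread outside the constructor rather than from it.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical resource-owning class.
class Photo {
    final String tempFile;
    Photo(String tempFile) { this.tempFile = tempFile; }
}

// Step 1: the sub-class that encapsulates the clean-up method.
class PhotoCleanupReference extends PhantomReference<Photo> {
    // The reference objects themselves must stay strongly reachable,
    // otherwise they are collected and never enqueued.
    static final Set<PhotoCleanupReference> PENDING = ConcurrentHashMap.newKeySet();
    // Stand-in for real clean-up: records which temp files were released.
    static final Set<String> RELEASED = ConcurrentHashMap.newKeySet();

    private final String tempFile; // copied state only, never the Photo itself

    // Step 3: copy the fields needed for clean-up.
    PhotoCleanupReference(Photo photo, ReferenceQueue<Photo> queue) {
        super(photo, queue);
        this.tempFile = photo.tempFile;
        PENDING.add(this);
    }

    void cleanUp() {
        RELEASED.add(tempFile); // e.g. delete the file, close a connection
        clear();                // required on Java 8 and earlier
        PENDING.remove(this);
    }
}

// Step 2: the thread that waits on the queue and runs the clean-up.
class CleanupThread extends Thread {
    private final ReferenceQueue<Photo> queue;

    CleanupThread(ReferenceQueue<Photo> queue) {
        super("photo-cleanup");
        this.queue = queue;
        setDaemon(true); // do not keep the JVM alive
    }

    @Override
    public void run() {
        try {
            while (true) {
                Reference<? extends Photo> ref = queue.remove(); // blocks until enqueued
                if (ref instanceof PhotoCleanupReference) {
                    ((PhotoCleanupReference) ref).cleanUp();
                }
            }
        } catch (InterruptedException e) {
            // interrupted: let the thread finish
        }
    }
}

class PhantomCleanupDemo {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Photo> queue = new ReferenceQueue<>();
        new CleanupThread(queue).start();

        Photo photo = new Photo("photo-1.tmp");
        new PhotoCleanupReference(photo, queue); // kept alive by PENDING
        photo = null;                            // drop the strong reference

        // System.gc() is only a request, so the outcome is not guaranteed.
        for (int i = 0; i < 50 && PhotoCleanupReference.RELEASED.isEmpty(); i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println("released: " + PhotoCleanupReference.RELEASED);
    }
}
```

Starting one shared clean-up thread avoids creating a thread per reference; if you prefer to follow step 3 literally, the constructor could start the thread lazily instead.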

A picture is worth a thousand words:

Please share your opinion in the comments below. Maybe you have your own point of view on the concept of phantom references: how and, most importantly, why to use them.
