Thursday, March 3, 2016

Java SE: ThreadLocal Variables

Q: What exactly does ThreadLocal mean?
A: Oh, this is a thread, which is local, well... a local thread, a light-weight one.
JVM can launch the thread very quickly and take less amount of memory...

by @antonarhipov

It is a strange reality: not every Java developer is aware of ThreadLocal variables, even though the topic is very important for concurrent programming. In this article I'm going to tell a short story about ThreadLocal and share some examples of using the technique.


Problem statement


Suppose we have to develop a multithreaded application that builds some reports. Each thread of the application builds a data structure using a builder class. The problem is to calculate how many records in the data structure are built by each thread.


Solution #1 - without any synchronization


Warning! This is a totally wrong solution.


The result:

Thread Thread-1 building a structure...
Thread Thread-0 building a structure...
Thread Thread-0 building a structure...
My name is Thread-1 and I've built 2 records
Thread Thread-0 building a structure...
Thread Thread-0 building a structure...
Thread Thread-0 building a structure...
Thread Thread-0 building a structure...
Thread Thread-0 building a structure...
My name is Thread-0 and I've built 8 records


Two threads are created from the SomeBuilderThread inner class, and the builder variable, which is an instance of the SomeBuilder class, is shared between them. The building logic is encapsulated in the SomeBuilder#build method (for simplicity, the logic is a single line of code that just writes a message to the console). The counter is incremented in that method. The counter is a field of the SomeBuilder class, and, because the class instance is shared between the threads, the counter is shared too. Access to the counter is not synchronized either, so there is an obvious bug: the output above shows only one line for Thread-1 and 7 lines for Thread-0, but the application has reported 2 and 8 records respectively.
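The listing for this solution is not shown here, so the following is only a minimal sketch reconstructed from the description above. SomeBuilder, SomeBuilderThread, builder and build come from the article; the class name SharedCounterDemo, the getCount method, the records parameter and the reported field are assumptions made for illustration.

```java
// Hypothetical reconstruction: only SomeBuilder, SomeBuilderThread, builder and
// build() are named in the article; every other name here is an assumption.
class SharedCounterDemo {

    static class SomeBuilder {
        private int counter; // one field, shared by every thread using this instance

        void build() {
            System.out.println("Thread " + Thread.currentThread().getName()
                    + " building a structure...");
            counter++; // unsynchronized update of shared state
        }

        int getCount() {
            return counter;
        }
    }

    static class SomeBuilderThread extends Thread {
        private final SomeBuilder builder;
        private final int records;
        int reported; // the number this thread claims to have built

        SomeBuilderThread(SomeBuilder builder, int records) {
            this.builder = builder;
            this.records = records;
        }

        @Override
        public void run() {
            for (int i = 0; i < records; i++) {
                builder.build();
            }
            reported = builder.getCount(); // reads the total of ALL threads
            System.out.println("My name is " + getName()
                    + " and I've built " + reported + " records");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SomeBuilder builder = new SomeBuilder(); // shared on purpose
        SomeBuilderThread first = new SomeBuilderThread(builder, 7);
        SomeBuilderThread second = new SomeBuilderThread(builder, 1);
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```

The exact numbers depend on how the threads interleave, but a thread can never report fewer records than it built, and it usually reports more because the counter also includes the other thread's work.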

Solution #2 - Hashtable


Let's fix the bug. The counter variable mustn't be shared between the threads, and access to the variable has to be isolated. Each thread should have its own instance of the counter variable. One option is the Hashtable class, with the thread name used as the key.
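The listing for this solution is not shown here either. A minimal sketch of the idea, assuming one counter per thread name, could look as follows; the counters field, getCount, reported and runAll are hypothetical names.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical sketch: counters, getCount, reported and runAll are assumed names.
class HashtableCounterDemo {

    static class SomeBuilder {
        // thread name -> number of records built by that thread
        private final Map<String, Integer> counters = new Hashtable<>();

        void build() {
            String name = Thread.currentThread().getName();
            System.out.println("Thread " + name + " building a structure...");
            counters.merge(name, 1, Integer::sum); // touches only this thread's entry
        }

        int getCount() {
            return counters.getOrDefault(Thread.currentThread().getName(), 0);
        }
    }

    static class SomeBuilderThread extends Thread {
        private final SomeBuilder builder;
        private final int records;
        int reported;

        SomeBuilderThread(SomeBuilder builder, int records) {
            this.builder = builder;
            this.records = records;
        }

        @Override
        public void run() {
            for (int i = 0; i < records; i++) {
                builder.build();
            }
            reported = builder.getCount();
            System.out.println("My name is " + getName()
                    + " and I've built " + reported + " records");
        }
    }

    // Starts all threads, then waits for all of them to finish.
    static void runAll(Thread... threads) {
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        SomeBuilder builder = new SomeBuilder();
        runAll(new SomeBuilderThread(builder, 3), new SomeBuilderThread(builder, 4));
    }
}
```

This approach relies on thread names being unique; two threads with the same name would silently share one counter.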



Now the result is exactly as expected:

Thread Thread-1 building a structure...
Thread Thread-0 building a structure...
Thread Thread-0 building a structure...
Thread Thread-1 building a structure...
Thread Thread-0 building a structure...
Thread Thread-1 building a structure...
Thread Thread-0 building a structure...
My name is Thread-1 and I've built 3 records
My name is Thread-0 and I've built 4 records


Here are 3 records built by Thread-1 and 4 records built by Thread-0.

The above solution works fine, but the standard library contains a class that can do this work for us. As you have probably guessed, that class is ThreadLocal.

Solution #3 - ThreadLocal variables


A ThreadLocal variable isolates only references to objects, not the objects themselves! There will be a collision if several threads hold references to the same object.
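The listing for this solution is not shown here. A minimal sketch of a ThreadLocal-based counter, consistent with the description in this section, might look like this; counterHolder is the field name used later in the article, while initCounter, getCount, reported and runAll are assumptions. Each thread stores its own Integer, so the warning above does not apply to this sketch.

```java
// Hypothetical sketch: initCounter, getCount, reported and runAll are assumed names.
class ThreadLocalCounterDemo {

    static class SomeBuilder {
        // One Integer slot per thread; the slot is empty (null) until set() is called.
        private final ThreadLocal<Integer> counterHolder = new ThreadLocal<>();

        // Must be called by the thread that is going to build.
        void initCounter() {
            counterHolder.set(0);
        }

        void build() {
            System.out.println("Thread " + Thread.currentThread().getName()
                    + " building a structure...");
            counterHolder.set(counterHolder.get() + 1);
        }

        int getCount() {
            return counterHolder.get();
        }
    }

    static class SomeBuilderThread extends Thread {
        private final SomeBuilder builder;
        private final int records;
        int reported;

        SomeBuilderThread(SomeBuilder builder, int records) {
            this.builder = builder;
            this.records = records;
        }

        @Override
        public void run() {
            builder.initCounter(); // initialization happens on the worker thread
            for (int i = 0; i < records; i++) {
                builder.build();
            }
            reported = builder.getCount();
            System.out.println("My name is " + getName()
                    + " and I've built " + reported + " records");
        }
    }

    // Starts all threads, then waits for all of them to finish.
    static void runAll(Thread... threads) {
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        SomeBuilder builder = new SomeBuilder(); // still one shared builder
        runAll(new SomeBuilderThread(builder, 4), new SomeBuilderThread(builder, 7));
    }
}
```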


The result:

Thread Thread-1 building a structure...
Thread Thread-0 building a structure...
Thread Thread-1 building a structure...
Thread Thread-0 building a structure...
Thread Thread-0 building a structure...
Thread Thread-1 building a structure...
Thread Thread-0 building a structure...
Thread Thread-1 building a structure...
Thread Thread-1 building a structure...
My name is Thread-0 and I've built 4 records
Thread Thread-1 building a structure...
Thread Thread-1 building a structure...
My name is Thread-1 and I've built 7 records


Here are 7 records built by Thread-1 and 4 records built by Thread-0.

A ThreadLocal variable is used here instead of the Hashtable class, and the variable is initialized from inside each worker thread.


Because ThreadLocal values are isolated per thread, a variable has to be initialized in the thread that will use it. The #set method mustn't be invoked from the main application thread, since a value assigned there is held for the main thread only, and the #get method will return null when it is invoked from a worker thread.
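The pitfall described above can be seen in a small sketch. The class name SetOnWrongThreadDemo and the helper valueSeenByWorker are hypothetical; only ThreadLocal#set and ThreadLocal#get are standard API.

```java
// Hypothetical demo: a value set on the main thread is invisible to a worker thread.
class SetOnWrongThreadDemo {

    static final ThreadLocal<Integer> counterHolder = new ThreadLocal<>();

    // Returns what a freshly started worker thread finds in counterHolder.
    static Integer valueSeenByWorker() {
        final Integer[] seen = new Integer[1];
        Thread worker = new Thread(() -> seen[0] = counterHolder.get());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        counterHolder.set(0); // stored for the main thread only
        System.out.println("main sees:   " + counterHolder.get());  // prints 0
        System.out.println("worker sees: " + valueSeenByWorker());  // prints null
    }
}
```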

The ThreadLocal class contains the protected #initialValue method to simplify initialization. If the variable has no value for the current thread, it is first initialized to the value returned by an invocation of the initialValue method. The default implementation returns null; to supply a different initial value, ThreadLocal must be subclassed and this method overridden. Typically, an anonymous inner class is used. Warning! The method is called in a multithreaded environment, so if it touches any state shared between threads, it must be implemented in a thread-safe way.

Well, the SomeBuilder class could be rewritten in the following way:
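The rewritten listing is not shown here. A minimal sketch of such a rewrite, assuming the counter should start at 0 in every thread, could be the following; getCount and the buildOnNewThread helper are hypothetical names.

```java
// Hypothetical sketch: getCount and buildOnNewThread are assumed names.
class InitialValueDemo {

    static class SomeBuilder {
        private final ThreadLocal<Integer> counterHolder = new ThreadLocal<Integer>() {
            @Override
            protected Integer initialValue() {
                // Called on a thread's first get(), unless set() was called before.
                return 0;
            }
        };

        void build() {
            System.out.println("Thread " + Thread.currentThread().getName()
                    + " building a structure...");
            counterHolder.set(counterHolder.get() + 1);
        }

        int getCount() {
            return counterHolder.get();
        }
    }

    // Builds the given number of records on a new thread and returns that thread's count.
    static int buildOnNewThread(final SomeBuilder builder, final int records) {
        final int[] counted = new int[1];
        Thread worker = new Thread(() -> {
            for (int i = 0; i < records; i++) {
                builder.build();
            }
            counted[0] = builder.getCount();
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counted[0];
    }

    public static void main(String[] args) {
        SomeBuilder builder = new SomeBuilder();
        System.out.println(buildOnNewThread(builder, 3)); // prints 3
        System.out.println(buildOnNewThread(builder, 5)); // prints 5
    }
}
```

No explicit initialization call is needed in the worker any more. On Java 8 and later, ThreadLocal.withInitial(() -> 0) is a shorter way to get the same behaviour.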


Using a thread-local object is a good way to avoid synchronization bottlenecks, because only one thread can ever use that object.
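That guarantee holds only when every thread stores its own object. The sketch below illustrates the earlier warning that ThreadLocal isolates references rather than objects: two threads put the same mutable object into their slots and therefore still share state. Counter, holder and incrementOnNewThread are hypothetical names.

```java
// Hypothetical demo: the same object stored in two thread-local slots is still shared.
class SharedReferenceDemo {

    static class Counter {
        int value;
    }

    static final ThreadLocal<Counter> holder = new ThreadLocal<>();

    // Stores the counter in a new thread's slot, increments it there,
    // and returns the value that thread observed.
    static int incrementOnNewThread(final Counter counter) {
        final int[] seen = new int[1];
        Thread worker = new Thread(() -> {
            holder.set(counter);   // the slot is private to this thread...
            holder.get().value++;  // ...but the object it refers to is not
            seen[0] = holder.get().value;
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        Counter shared = new Counter();
        System.out.println(incrementOnNewThread(shared)); // prints 1
        System.out.println(incrementOnNewThread(shared)); // prints 2, not 1
    }
}
```

The threads run one after another here to keep the output predictable; with truly concurrent threads the shared object would additionally be exposed to data races.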

How does it work?


There is a table of ThreadLocal variables in each instance of the Thread class. The table uses ThreadLocal objects as keys and objects held by the corresponding ThreadLocal variables as values.


For example, let's declare the following variable:

ThreadLocal<Object> locals = new ThreadLocal<Object>();

and assign a value to the variable:

locals.set(myObject);

A reference to the locals object is used as the key, and a reference to the myObject object is the value. Another thread has its own instance of the table, so its own value is stored there under the locals key.
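The mechanism can be modelled with a toy implementation. This is not the JDK's code (the real table is a specialized hash map inside Thread with weakly referenced keys); it is only a sketch of the idea, and ToyThread, ToyThreadLocal and runOnToyThread are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the idea, not the JDK implementation.
class ToyThreadLocalDemo {

    // A thread that carries its own table, as described in the text.
    static class ToyThread extends Thread {
        final Map<ToyThreadLocal<?>, Object> table = new HashMap<>();

        ToyThread(Runnable body) {
            super(body);
        }
    }

    // The variable object holds no value itself; it only serves as a key.
    static class ToyThreadLocal<T> {
        void set(T value) {
            currentTable().put(this, value);
        }

        @SuppressWarnings("unchecked")
        T get() {
            return (T) currentTable().get(this);
        }

        private Map<ToyThreadLocal<?>, Object> currentTable() {
            // Limitation of the toy: works only when called on a ToyThread.
            return ((ToyThread) Thread.currentThread()).table;
        }
    }

    // Runs the body on a fresh ToyThread and waits for it to finish.
    static void runOnToyThread(Runnable body) {
        ToyThread t = new ToyThread(body);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        final ToyThreadLocal<String> locals = new ToyThreadLocal<>();
        runOnToyThread(() -> {
            locals.set("first");
            System.out.println("first thread sees " + locals.get());  // first
        });
        // Same key, different table: nothing has been stored for this thread.
        runOnToyThread(() -> System.out.println("second thread sees " + locals.get())); // null
    }
}
```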

What about InheritableThreadLocal?


Sometimes data has to be passed through a number of application layers, e.g. a transactional context, a user id, an HTTP session and so on. As we have seen, a ThreadLocal variable may be used here. But what about passing data to a set of child threads? For example, a servlet could receive a large user request and hand the request, the associated user id and other data to a number of newly launched threads for processing. Since each thread has its own value of a ThreadLocal variable, the processing threads see no data in the variable held by the parent. Java SE provides a solution: the InheritableThreadLocal variable.

An InheritableThreadLocal variable works in a similar way to a usual ThreadLocal one, but the value in the corresponding map inside the child's Thread object is initialized from the parent's value. The InheritableThreadLocal class contains the protected #childValue method, which computes the child's initial value as a function of the parent's value at the time the child thread is created. The remark about the time is very important: if the parent's value is not initialized when the child thread is created, the child thread inherits nothing. The InheritableThreadLocal#set method, or the #get method coupled with an overridden #initialValue, should be invoked before the child thread is created.

Let's modify the above example: any structures built by SomeBuilder should be validated in other threads. A #validate method is added to the SomeBuilder class; it simply reads the value from the counterHolder field. Since validation is performed by a child thread, the field type has been changed to InheritableThreadLocal. The field value is initialized during an invocation of the #get method because the class overrides the #initialValue method.
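The modified listing is not shown here. A minimal sketch consistent with this description might look as follows; counterHolder and validate come from the article, while getCount, reported, validated and runAll are assumptions, and the console messages are paraphrased.

```java
// Hypothetical sketch: getCount, reported, validated and runAll are assumed names.
class InheritableCounterDemo {

    static class SomeBuilder {
        private final InheritableThreadLocal<Integer> counterHolder =
                new InheritableThreadLocal<Integer>() {
                    @Override
                    protected Integer initialValue() {
                        return 0;
                    }
                };

        void build() {
            System.out.println("Thread " + Thread.currentThread().getName()
                    + " building a structure...");
            counterHolder.set(counterHolder.get() + 1);
        }

        int getCount() {
            return counterHolder.get();
        }

        // Called from a child thread: reads the value inherited from the parent.
        int validate() {
            int inherited = counterHolder.get();
            System.out.println("Child thread " + Thread.currentThread().getName()
                    + " is validating, my parent built " + inherited + " structures");
            return inherited;
        }
    }

    static class SomeBuilderThread extends Thread {
        private final SomeBuilder builder;
        private final int records;
        int reported;
        int validated;

        SomeBuilderThread(SomeBuilder builder, int records) {
            this.builder = builder;
            this.records = records;
        }

        @Override
        public void run() {
            for (int i = 0; i < records; i++) {
                builder.build();
            }
            reported = builder.getCount();
            System.out.println("My name is " + getName() + " and I've built "
                    + reported + " records, let me validate them in another thread...");
            // The child is created after the parent's value exists, so it gets a copy.
            final int[] seen = new int[1];
            Thread child = new Thread(() -> seen[0] = builder.validate());
            child.start();
            try {
                child.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            validated = seen[0];
        }
    }

    // Starts all threads, then waits for all of them to finish.
    static void runAll(Thread... threads) {
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        SomeBuilder builder = new SomeBuilder();
        runAll(new SomeBuilderThread(builder, 2), new SomeBuilderThread(builder, 4));
    }
}
```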


The result:

Thread Thread-6 building a structure...
Thread Thread-5 building a structure...
Thread Thread-6 building a structure...
Thread Thread-5 building a structure...
Thread Thread-6 building a structure...
My name is Thread-5 and I've built 2 records, let me validate them in another thread...
Child thread Thread-7 are validating the built by the parent structures, my parent built 2 structures
Thread Thread-6 building a structure...
My name is Thread-6 and I've built 4 records, let me validate them in another thread...
Child thread Thread-8 are validating the built by the parent structures, my parent built 4 structures


Here are 4 records built by Thread-6, validated by Thread-8, and 2 records built by Thread-5, validated by Thread-7. The child threads have access to parent's ThreadLocal variable values.

Concrete examples


JDBC

Let's have a look at the following concrete example of using ThreadLocal variables. Since a JDBC Connection is generally not safe to share between concurrently running threads, each thread interacting with a database has to have its own connection instance. The instance will be held by a ThreadLocal variable.
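The listing for this example is not shown here. One possible shape for such a holder is sketched below; the class name ThreadLocalConnections, the current method and the factory parameter are all hypothetical, and the way connections are actually opened is left to the caller.

```java
import java.sql.Connection;
import java.util.function.Supplier;

// Hypothetical helper: class name, current() and the factory parameter are assumptions.
class ThreadLocalConnections {

    private final ThreadLocal<Connection> holder;

    // The factory opens a new connection; it runs on a thread's first current() call.
    ThreadLocalConnections(Supplier<Connection> factory) {
        this.holder = ThreadLocal.withInitial(factory);
    }

    // Returns the calling thread's own connection, opening it on first use.
    Connection current() {
        return holder.get();
    }
}

// Possible usage (the JDBC URL is a placeholder and needs a matching driver):
//
//   ThreadLocalConnections connections = new ThreadLocalConnections(() -> {
//       try {
//           return java.sql.DriverManager.getConnection("jdbc:yourdb://host/db");
//       } catch (java.sql.SQLException e) {
//           throw new IllegalStateException("Cannot open connection", e);
//       }
//   });
//   Connection c = connections.current();
```

Repeated calls to current() on one thread return the same connection, while another thread gets a different one.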


ThreadLocalRandom

Continually creating new Random instances is wasteful, and sharing a single Random instance between threads leads to contention, so Java 7 introduced the ThreadLocalRandom class, which gives every thread its own generator.
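A short usage sketch follows; the DiceDemo class and the rollDie method are hypothetical, while ThreadLocalRandom.current() and nextInt(origin, bound) are standard API.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical demo of per-thread random number generation.
class DiceDemo {

    // Returns a value from 1 to 6 using the calling thread's own generator.
    static int rollDie() {
        // Call current() on the thread that needs the number; do not cache
        // the returned instance in a field shared between threads.
        return ThreadLocalRandom.current().nextInt(1, 7);
    }

    public static void main(String[] args) {
        System.out.println("Rolled " + rollDie());
    }
}
```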

SimpleDateFormat

A similar consideration applies to the SimpleDateFormat class. A SimpleDateFormat object is fairly expensive to create and is not thread-safe, so the idea is to hold one instance per thread in a ThreadLocal variable.
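The listing for this example is not shown here. A minimal sketch, assuming a yyyy-MM-dd pattern, the US locale and the UTC time zone (all three chosen only to make the output predictable), could be the following; DateFormatHolder and format are hypothetical names.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical holder: one SimpleDateFormat per thread, created lazily.
class DateFormatHolder {

    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            new ThreadLocal<SimpleDateFormat>() {
                @Override
                protected SimpleDateFormat initialValue() {
                    // Pattern, locale and time zone are assumptions for this sketch.
                    SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
                    f.setTimeZone(TimeZone.getTimeZone("UTC"));
                    return f;
                }
            };

    // Formats the date with the calling thread's own formatter.
    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L))); // prints 1970-01-01
    }
}
```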


