Monday, March 14, 2011

Java static initialization - part 2

This is a follow-up of Java static initialization - part 1. Part 1 went through §12.4.1 When Initialization Occurs of Java language specification.

This post will go throught the rest of 12.4 chapter starting with the section "12.4.2 Detailed Initialization Procedure". As name suggests this section provides detailed description of how a class or interface is initialized. It starts with rationale.

12.4.2 Detailed Initialization Procedure

Because the Java programming language is multithreaded, initialization of a class or interface requires careful synchronization, since some other thread may be trying to initialize the same class or interface at the same time. There is also the possibility that initialization of a class or interface may be requested recursively as part of the initialization of that class or interface; for example, a variable initializer in class A might invoke a method of an unrelated class B, which might in turn invoke a method of class A.

After explanation why this level of detail is needed it describes the procedure itself. This procedure can be rewritten in this Java pseudo-code (comments refer to procedure points):

void initializeClass(Clazz clazz) {
    boolean repeat = false;
    while (repeat) {
        repeat = false;
        synchronized (clazz) { // 1.
            InitializationStatus status = clazz.getInitializationStatus();
            switch (status) {
            case PREPARED: //6.
                clazz.setInitStatus(InitializationStatus.INITIALIZING);
                clazz.setInitThread(Thread.currentThread());
                break;
            case INITIALIZING:
                if (clazz.getInitThread() == Thread.currentThread()) { // 3.
                    return;
                } else {    // 2.
                    try {
                        clazz.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new NoClassDefFoundError();
                    }
                    repeat = true;
                }
                break;
            case INITIALIZED: // 4.
                return;
            case ERRONEOUS:  // 5.
                throw new NoClassDefFoundError();
            }
        }
    }
    try {
        Clazz superClass = clazz.getSuperclass();  // 7.
        if (superClass != null) {
            initializeClass(superClass);
        }
    } catch(Throwable t) {
        synchronized (clazz){
            clazz.setInitStatus(InitializationStatus.ERRONEOUS);
            clazz.notifyAll();
        }
        throw t;
    }
    try {
        clazz.getClassLoader().desiredAssertionStatus(clazz.getName()); // 8.
        clazz.<clinit>();  // 9.
        synchronized (clazz){  // 10.
            clazz.setInitStatus(InitializationStatus.INITIALIZED);
            clazz.notifyAll();
        }
    } catch(Throwable t) {
        if (!(t instanceof Error)){ // 11.
            try{
                t = new ExceptionInInitializerError(t);
            } catch (OutOfMemoryError oome){
                t = oome;
            }
        }
        synchronized (clazz){ // 12.
            clazz.setInitStatus(InitializationStatus.ERRONEOUS);
            clazz.notifyAll();
        }
        throw t;
    }
}

The implementation of the Java virtual machine is responsible for taking care of synchronization and recursive initialization by using the following procedure. It assumes that the Class object has already been verified and prepared, and that the Class object contains state that indicates one of four situations:

  • This Class object is verified and prepared but not initialized.
  • This Class object is being initialized by some particular thread T.
  • This Class object is fully initialized and ready for use.
  • This Class object is in an erroneous state, perhaps because initialization was attempted and failed.

The procedure for initializing a class or interface is then as follows:

  1. Synchronize (§14.19) on the Class object that represents the class or interface to be initialized. This involves waiting until the current thread can obtain the lock for that object (§17.1).
  2. If initialization is in progress for the class or interface by some other thread, then wait on this Class object (which temporarily releases the lock). When the current thread awakens from the wait, repeat this step.
  3. If initialization is in progress for the class or interface by the current thread, then this must be a recursive request for initialization. Release the lock on the Class object and complete normally.
  4. If the class or interface has already been initialized, then no further action is required. Release the lock on the Class object and complete normally.
  5. If the Class object is in an erroneous state, then initialization is not possible. Release the lock on the Class object and throw a NoClassDefFoundError.
  6. Otherwise, record the fact that initialization of the Class object is now in progress by the current thread and release the lock on the Class object.
  7. Next, if the Class object represents a class rather than an interface, and the superclass of this class has not yet been initialized, then recursively perform this entire procedure for the superclass. If necessary, verify and prepare the superclass first. If the initialization of the superclass completes abruptly because of a thrown exception, then lock this Class object, label it erroneous, notify all waiting threads, release the lock, and complete abruptly, throwing the same exception that resulted from initializing the superclass.
  8. Next, determine whether assertions are enabled (§14.10) for this class by querying its defining class loader.
  9. Next, execute either the class variable initializers and static initializers of the class, or the field initializers of the interface, in textual order, as though they were a single block, except that final class variables and fields of interfaces whose values are compile-time constants are initialized first (§8.3.2.1, §9.3.1, §13.4.9).
  10. If the execution of the initializers completes normally, then lock this Class object, label it fully initialized, notify all waiting threads, release the lock, and complete this procedure normally.
  11. Otherwise, the initializers must have completed abruptly by throwing some exception E. If the class of E is not Error or one of its subclasses, then create a new instance of the class ExceptionInInitializerError, with E as the argument, and use this object in place of E in the following step. But if a new instance of ExceptionInInitializerError cannot be created because an OutOfMemoryError occurs, then instead use an OutOfMemoryError object in place of E in the following step.
  12. Lock the Class object, label it erroneous, notify all waiting threads, release the lock, and complete this procedure abruptly with reason E or its replacement as determined in the previous step.

(Due to a flaw in some early implementations, a exception during class initialization was ignored, rather than causing an ExceptionInInitializerError as described here.)

As mentioned above this detail description allows us to reason about static initialization scenarios including circular dependencies.

Let's have 2 classes A and B:

class A {
    static int Y = B.Y;
    static int X = 10;
}

class B {
    static int X = A.X;
    static int Y = 25;
}

It seems quite straightforward what values these static fields are going to have - A.X and B.X will be initialized to 10 and A.Y and B.Y will be initialized to 25. But it is not so (obviously - otherwise it would not be worth mentioning). When running this program:

public class TestCircularInitialization {
    public static void main(String[] args) {
        int x = A.X;
        System.out.println("X = " + B.X);
        System.out.println("Y = " + A.Y);
    }
}

the output is:

X = 0
Y = 25

So what happens. int x = A.X will start A initialization. Initialization will try to execute line 2 in the class A. That will start B initialization and executing line 7. And that will cause another try to initialize A. But A is in the INITIALIZING state and the initializing thread is the same as the current one (point 3. of the procedure). Thread returns to line 7. and sets B.X with current value of A.X - the default value 0. Then it executes line 8 and ends B initialization where B.X = 0 and B.Y = 25. A initialization continues and A.Y is set to 25. And then A.X is set to 10 at last.

Interesting thing happens when TestCircularInitialization is modified a bit - int x = B.X:

public class TestCircularInitialization {
    public static void main(String[] args) {
        int x = B.X;
        System.out.println("X = " + B.X);
        System.out.println("Y = " + A.Y);
    }
}

the output is then:

X = 10
Y = 0

The explanation is equivalent to the previous example.

The example demonstrated why circular dependency is dangerous. The real behaviour is not obvious and depends on the execution path. The fix in this case would be reordering of static fields:

class A {
    static int X = 10;
    static int Y = B.Y;
}

class B {
    static int Y = 25;
    static int X = A.X;
}

In this form the real behaviour matches the expected behaviour. The best solution would be to avoid circular dependency altogether though.

The specification declares Class object is used for synchronization of class initialization state but it is not necessary so in reality. (Note: Following examples are JVM specific. Sun JDK 1.6.0_24 is used in the examples below.)

Let's run the following program:

class Foo {
    static {
        System.out.println("Foo initialized");
    }
}

public class TestStaticInitLockingBasic {
    public static void main(String[] args) throws InterruptedException {
        synchronized (Foo.class) {
            System.out.println("Foo locked");
            new Thread(new Runnable() {
                public void run() {
                    new Foo();
                }
            }).start();
            Thread.sleep(2000);
            System.out.println("Unlocking Foo");
        }
    }
}

This program locks on Foo.class object at first then it fires second thread. 1st thread then sleeps for 2s. 2nd thread cause Foo static initialization. At that moment 2nd thread should wait for 1st thread to unlock Foo.class and then continue with initialization. So at the end the console should contain:

Foo locked
Unlocking Foo
Foo initialized

But program prints:

Foo locked
Foo initialized
Unlocking Foo

It proves JVM does not use Class object as prescribed in specification. Now let's modify program a bit:

class Foo {
    static {
        synchronized (Foo.class) {
            System.out.println("Foo initialized");
        }
    }
}

public class TestStaticInitLocking {
    public static void main(String[] args) throws InterruptedException {
        synchronized (Foo.class) {
            System.out.println("Foo locked");
            new Thread(new Runnable() {
                public void run() {
                    new Foo();
                }
            }).start();
            Thread.sleep(2000);
            System.out.println("Main thread woke up - creating Foo instance");
            new Foo();
            System.out.println("Unlocking Foo");
        }
    }
}

There are 2 differences from previous code - Foo's static initialization has additional synchronization on Foo.class and main thread is creating the new Foo instance. Program will print:

Foo locked
Main thread woke up - creating Foo instance

and then program hangs. The program is using only a single lock and JVM is expected to use a synchronization for static initialization. It means JVM is using some lock other than Foo.class and program hangs because the threads are locking Foo.class and JVM's lock in different ordering. Another interesting thing is that thread dump does not detect any deadlock. This means this internal JVM's lock is not using standard JVM's lock mechanisms.

Existing JVM implementation apparently breaks JLS specification but this situation should be officially fixed soon. Surprisingly it won't happen by JVM implementation change but by JVM specification change. A new draft version of JVM specification (JSR‑000924 Java Virtual Machine Specification Java SE 7 Maintenance Review Draft 3) includes updated version of initialization procedure. This version already don't insist on using Class object for synchronization. The Draft changelog comments on this change:

The initialization procedure for a class or interface relaxes the requirement to lock on the user-visible Class object of the class or interface being initialized. Few JVM implementations actually lock on this object. Most use an internal lock invisible to user code, and this practice is now permitted by the initialization procedure.

There is the quote from the draft - the discussed part is highlighted.

Because the Java virtual machine is multithreaded, initialization of a class or interface requires careful synchronization, since some other thread may be trying to initialize the same class or interface at the same time. There is also the possibility that initialization of a class or interface may be requested recursively as part of the initialization of that class or interface. The implementation of the Java virtual machine is responsible for taking care of synchronization and recursive initialization by using the following procedure. It assumes that the Class object has already been verified and prepared, and that the Class object contains state that indicates one of four situations:

  • This Class object is verified and prepared but not initialized.
  • This Class object is being initialized by some particular thread.
  • This Class object is fully initialized and ready for use.
  • This Class object is in an erroneous state, perhaps because initialization was attempted and failed.

For each class or interface C, there is a unique initialization lock LC. The mapping from C to LC is left to the discretion of the Java virtual machine implementation. The procedure for initializing C is then as follows:

  1. Synchronize on the initialization lock, LC, for C. This involves waiting until the current thread can acquire LC.
  2. If the Class object for C indicates that initialization is in progress for C by some other thread, then release LC and block the current thread until informed that the in-progress initialization has completed, at which time repeat this procedure.
  3. If the Class object for C indicates that initialization is in progress for C by the current thread, then this must be a recursive request for initialization. Release LC and complete normally.
  4. If the Class object for C indicates that C has already been initialized, then no further action is required. Release LC and complete normally.
  5. If the Class object for C is in an erroneous state, then initialization is not possible. Release LC and throw a NoClassDefFoundError.
  6. Otherwise, record the fact that initialization of the Class object for C is in progress by the current thread, and release LC. Then, initialize each final static field of C with the constant value in its ConstantValue attribute (§4.7.2), in the order the fields appear in the ClassFile.
  7. Next, if C is a class rather than an interface, and its superclass SC has not yet been initialized, then recursively perform this entire procedure for SC. If necessary, verify and prepare SC first. If the initialization of SC completes abruptly because of a thrown exception, then acquire LC, label the Class object for C as erroneous, notify all waiting threads, release LC, and complete abruptly, throwing the same exception that resulted from initializing SC.
  8. Next, determine whether assertions are enabled for C by querying its defining class loader.
  9. Next, execute the class or interface initialization method of C.
  10. If the execution of the class or interface initialization method completes normally, then acquire LC, label the Class object for C as fully initialized, notify all waiting threads, release LC, and complete this procedure normally.
  11. Otherwise, the class or interface initialization method must have completed abruptly by throwing some exception E . If the class of E is not Error or one of its subclasses, then create a new instance of the class ExceptionInInitializerError with E as the argument, and use this object in place of E in the following step. If a new instance of ExceptionInInitializerError cannot be created because an OutOfMemoryError occurs, then use an OutOfMemoryError object in place of E in the following step.
  12. Acquire LC, label the Class object for C as erroneous, notify all waiting threads, release LC, and complete this procedure abruptly with reason E or its replacement as determined in the previous step.

An implementation may optimize this procedure by eliding the lock acquisition in step 1 (and release in step 4/5) when it can determine that the initialization of the class has already completed, provided that, in terms of the memory model, all happens-before orderings (JLS3 §17.4.5) that would exist if the lock were acquired, still exist when the optimization is performed.

Now back to static initialization specification. It continues with the last section 12.4.3.

12.4.3 Initialization: Implications for Code Generation

Code generators need to preserve the points of possible initialization of a class or interface, inserting an invocation of the initialization procedure just described. If this initialization procedure completes normally and the Class object is fully initialized and ready for use, then the invocation of the initialization procedure is no longer necessary and it may be eliminated from the code-for example, by patching it out or otherwise regenerating the code.

Compile-time analysis may, in some cases, be able to eliminate many of the checks that a type has been initialized from the generated code, if an initialization order for a group of related types can be determined. Such analysis must, however, fully account for concurrency and for the fact that initialization code is unrestricted.

This section allows JVM to eliminate synchronization on initialized classes. Please note the quote from JavaSE7 Draft provides the means for the same optimization at the end.

Overall the static initialization has some tricky areas but these can be usually avoided. It is just another area where "keep it simple, ..." simply works.

Wednesday, March 2, 2011

Deadlock despite consistent lock acquisition ordering

Consistent lock acquisition ordering (further as lock ordering) is one of the techniques of deadlock prevention. Deadlock is caused by circular dependency of threads trying to acquire locks. The consistent ordering prevents such dependency.

There is a common example of code which is vulnerable to deadlock:

public void someMethod() {
    synchronized (lock1) {
        synchronized (lock2) {
            // do something here
        }
    }
}

public void someOtherMethod() {
    synchronized (lock2) {
        synchronized (lock1) {
            // do something here
        }
    }
}

If 2 threads concurrently execute these methods they can end up in deadlock where thread1 holds lock1 while waiting for lock2 and thread2 holds lock2 while waiting for lock1. More detailed explanation of deadlock can be found for example here.

This article will use even simpler model code:

public void someMethod() {
    synchronized (lock1) {
        synchronized (lock2) {
            // do something here
        }
    }
}

This code can be executed concurrently by more threads without the risk of deadlock as threads acquire the locks in the same order. It is not 100% deadlock proof though because the lock ordering can be broken by wait(). This program:

public class BrokenOrderingBasic implements Runnable {
    private Object lock1 = new Object();
    private Object lock2 = new Object();

    public static void main(String[] args) throws InterruptedException {
        BrokenOrderingBasic runnable = new BrokenOrderingBasic();
        Thread thread1 = new Thread(runnable, "thread1");
        Thread thread2 = new Thread(runnable, "thread2");
        thread1.start();
        Thread.sleep(500);
        thread2.start();
    }

    @Override
    public void run() {
        try {
            String threadName = Thread.currentThread().getName();
            synchronized (lock1) {
                System.out.println(threadName + " has lock1");
                synchronized (lock2) {
                    System.out.println(threadName + " has lock2");
                    lock1.wait(1000);
                }
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

prints:

thread1 has lock1
thread1 has lock2
thread2 has lock1

and hangs. Thread dump confirms a deadlock:

Found one Java-level deadlock:
=============================
"thread2":
  waiting to lock monitor 0x092807b0 (object 0x9f182210, a java.lang.Object),
  which is held by "thread1"
"thread1":
  waiting to lock monitor 0x09281fe8 (object 0x9f182208, a java.lang.Object),
  which is held by "thread2"

The reason is wait() will break the lock ordering. The program's lock ordering is lock1 -> lock2. thread1 acquires lock1 and lock2 at first. But then it releases lock1 by lock1.wait() while still holding lock2. When thread1 later timeouts from wait() it will try to acquire lock1 again and that way it practically establishes lock ordering lock2 -> lock1. As lock1 was acquired by thread2 in the meantime we have standard deadlock scenario here.

The above example is not quite realistic scenario as calling wait() on outer lock does not make much sense. If it appeared in real code, it would be very likely a bug.

Let us look on variation of the first scenario. This program:

public class BrokenOrderingReentredLock implements Runnable {
    private Object lock1 = new Object();
    private Object lock2 = new Object();

    public static void main(String[] args) throws InterruptedException {
        BrokenOrderingReentredLock runnable = new BrokenOrderingReentredLock();
        Thread thread1 = new Thread(runnable, "thread1");
        Thread thread2 = new Thread(runnable, "thread2");
        thread1.start();
        Thread.sleep(500);
        thread2.start();
    }

    @Override
    public void run() {
        try {
            String threadName = Thread.currentThread().getName();
            synchronized (lock1) {
                System.out.println(threadName + " has lock1");
                synchronized (lock2) {
                    System.out.println(threadName + " has lock2");
                    synchronized (lock1) {
                        System.out.println(threadName + " reenters lock1");
                        lock1.wait(1000);
                    }
                }
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

prints:

thread1 has lock1
thread1 has lock2
thread1 reenters lock1
thread2 has lock1

and it hangs too:

Found one Java-level deadlock:
=============================
"thread2":
  waiting to lock monitor 0x02b7fdcc (object 0x229bf910, a java.lang.Object),
  which is held by "thread1"
"thread1":
  waiting to lock monitor 0x02b816fc (object 0x229bf908, a java.lang.Object),
  which is held by "thread2"

The reasoning is the same as for the previous scenario. The program's lock ordering is lock1 -> lock2. thread1 acquires lock1 and lock2 and reenters lock1. It releases lock1 by lock1.wait() while still holding lock2. When it later timeouts from wait() it will try to acquire lock1 again and that way it practically establishes lock ordering lock2 -> lock1. lock1 was acquired by thread2 in the meantime.

This scenario does not seem to be realistic either but the lock acquisitions can spread over multiple methods. In case of the complex code base + 3rd party libraries/frameworks is such situation more likely. On the other hand it also strongly suggests some redesign is needed.

The deadlock "property" of wait() is not limited to Java built-in synchronization and monitors. It is inherent with any wait-like feature (suspend the current thread and release the associated lock). The following program is equivalent to the previous one but using the java.util.concurrent.locks package:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BrokenOrderingReentrantLock implements Runnable {
    private ReentrantLock lock1 = new ReentrantLock();
    private ReentrantLock lock2 = new ReentrantLock();
    private Condition condition = lock1.newCondition();

    public static void main(String[] args) throws InterruptedException {
        BrokenOrderingReentrantLock runnable = new BrokenOrderingReentrantLock();
        Thread thread1 = new Thread(runnable, "thread1");
        Thread thread2 = new Thread(runnable, "thread2");
        thread1.start();
        Thread.sleep(500);
        thread2.start();
    }

    @Override
    public void run() {
        try {
            String threadName = Thread.currentThread().getName();
            lock1.lock();
            try{
                System.out.println(threadName + " has lock1");
                lock2.lock();
                try{
                    System.out.println(threadName + " has lock2");
                    lock1.lock();
                    try{
                        System.out.println(threadName + " reenters lock1");
                        condition.await(1, TimeUnit.SECONDS);
                    } finally {
                        lock1.unlock();
                    }
                } finally {
                    lock2.unlock();
                }
            } finally {
                lock1.unlock();
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
When executed it behaves the same way. It prints:
thread1 has lock1
thread1 has lock2
thread1 reenters lock1
thread2 has lock1

and it hangs too:

Found one Java-level deadlock:
=============================
"thread2":
  waiting for ownable synchronizer 0x9f142860,
  (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "thread1"
"thread1":
  waiting for ownable synchronizer 0x9f142838,
  (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "thread2"

To summarize - the consistent lock acquisition ordering helps to prevent deadlock but one has to be aware that wait() can mess up with seemingly obvious correct ordering. But if such situation happens it is probably a bug or a superficial code design.