A Comfortable Couch

Monday, February 07, 2005

Double-checked locking in Java is broken

Double-checked locking and the Singleton pattern

The following code is broken in Java:
public static Singleton getInstance()
{
    // First check, made without holding the lock
    if (instance == null) {
        synchronized (Singleton.class) {
            // Second check, made while holding the lock
            if (instance == null)
                instance = new Singleton();
        }
    }
    return instance;
}
The purpose of the above code is to ensure that only one Singleton instance is constructed and that every caller of getInstance() gets that instance, with as little overhead as possible: once the instance exists, callers skip the lock entirely. I've used the exact same technique in C++.

Apparently, many (most?) Java JIT compilers are allowed to reorder the writes so that the object reference is stored after the object's memory is allocated but before its constructor has finished running. With the code above, that means a caller of getInstance() can see a non-null instance on the first, unsynchronized check and receive a reference to an object that isn't fully constructed yet. That's bad enough on its own, but it also seems like it would cause all kinds of problems in all kinds of situations, and I'm not sure what happens when you invoke a method on an uninitialized object.
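For what it's worth, here is a rough sketch of the usual repair, assuming Java 5 or later, where the revised memory model gives volatile fields stronger ordering guarantees. The class name VolatileSingleton and its value field are made up for illustration. On older JVMs I would not trust this.

```java
// Sketch only: assumes Java 5+ (revised memory model). Names are hypothetical.
final class VolatileSingleton {
    // volatile prevents the reference from becoming visible to other
    // threads before the constructor's writes are visible.
    private static volatile VolatileSingleton instance;

    private final int value; // hypothetical state set by the constructor

    private VolatileSingleton() {
        value = 42;
    }

    static VolatileSingleton getInstance() {
        VolatileSingleton local = instance; // one volatile read on the fast path
        if (local == null) {
            synchronized (VolatileSingleton.class) {
                local = instance; // re-check while holding the lock
                if (local == null) {
                    local = new VolatileSingleton();
                    instance = local; // publish the fully constructed object
                }
            }
        }
        return local;
    }

    int value() {
        return value;
    }
}
```

The only changes from the broken version are the volatile keyword and the local variable, which just avoids reading the volatile field twice on the common path.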

It's stuff like this that makes me wonder if critical software should be written in anything other than C. It seems like the more advanced (read: complicated) a language is, the more likely you are to get bitten by some odd, unexpected behaviour. How do we build highly reliable software if the semantics of the programming language aren't completely known?

Update: Digging a little further, it looks like nearly all imperative multithreaded languages suffer from this problem on multi-processor machines. One processor's cache can be out of date with respect to main memory if two threads running on different processors are reading and writing the same memory at the same time. Yikes!

5 Comments:

Bob said...

This post has been removed by the author.

11:49 PM
Bob said...

This post has been removed by the author.

11:51 PM
Bob said...

It's too bad that the Double-Checked Locking Pattern (DCLP) idiom became popular, but the fix is simple, as described in the referenced article: "The best solution to this problem is to accept synchronization or use a static field." Since static initializers aren't run until a class is loaded, this isn't as costly as it is in C++, where such code always runs at program startup. Also, code analysis tools such as PMD can spot usage of the double-checked locking idiom so that it can be found and fixed.
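As a rough sketch of the two alternatives in that quote (the class names SyncSingleton and StaticSingleton are mine, not the article's):

```java
// Option 1: accept synchronization. Every call takes the class lock,
// so the null check and the construction can never race.
final class SyncSingleton {
    private static SyncSingleton instance;

    private SyncSingleton() {}

    static synchronized SyncSingleton getInstance() {
        if (instance == null) {
            instance = new SyncSingleton();
        }
        return instance;
    }
}

// Option 2: use a static field. The JVM runs the static initializer
// once, when the class is initialized, and handles the locking itself.
final class StaticSingleton {
    private static final StaticSingleton INSTANCE = new StaticSingleton();

    private StaticSingleton() {}

    static StaticSingleton getInstance() {
        return INSTANCE;
    }
}
```

Option 2 has no locking cost on the call path at all; the trade-off is that the instance is created when the class is initialized rather than on the first call to getInstance().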

Don't assume that this issue is unique to Java. C++ has its own issues with DCLP. See Scott Meyers's article:
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

His conclusion? "...though DCLP isn’t intrinsically tied to Singleton, use of Singleton tends to lead to a desire to "optimize" thread-safe access via DCLP. You should therefore be sure to avoid implementing Singleton with DCLP." Hmm.

Some people would also take issue with your comment about only "writing critical software in C". Just think of all of the breakage potential in C/C++ with memory issues, stack overwrites, unbounded writes via strcpy and friends, unsafe casts, etc. And don't forget that some of these do more than just cause code to crash. They allow the sort of "code injection" exploits that have harmed critical software systems around the world.

I don't have anything against C/C++ per se. Bad programmers can write bad code in any programming language. C/C++ has its place. For example, I don't think we're at the stage yet where low-level code, such as drivers, can be effectively written in Java, C#, etc.

11:53 PM
Damien said...

Yes, digging a little further, it looks like nearly all imperative multithreaded languages suffer from this problem on multi-processor machines. One processor's cache can be out of date with respect to main memory if two threads running on different processors are reading and writing the same memory at the same time. I think the issue here is that SMP designers decided to break backwards compatibility in order to gain a performance edge. Not that that was the wrong decision, but it had a cost.

BTW, this guy claims Python is friendly to double-checked locking. But I wonder if he means on SMP systems too?

As far as C being better for highly reliable systems (I definitely didn't mean C++ though), the only reason I think that is because it is such a stable language, both in syntax and semantics. And as long as you adhere to a particular style of coding and are very strict about checking buffer lengths, failed allocations, and other error conditions, you can be sure that your code does exactly what it is supposed to.

But would I want to use Python in a critical system, like avionics software? Nope. Would I want to use it to build a database engine? Probably not, but I'm not sure. This is actually a decision I need to make soon.

2:20 AM
Bob said...

C can be used effectively with very careful coding. It was originally intended as a system programming language and, for that, it's been extremely successful. But I believe that many programmers can't (or won't) exercise the amount of care and attention that is required to write quality code in C. C started out as, and continues to be, a higher-level assembly language. Some language systems use it in this way. For example, the Smalltalk system called Squeak is bootstrapped from C code that's generated from Smalltalk.

9:39 AM
