I've been working through the Java APIs for Lift, and one of the things I've been struck by is the lack of immutable data structures in the core Java libraries. Today I made a Tweet:

"No wonder Java multithreaded code is hard to write... no immutable collections in the Java std library. What are these people thinking?" (David Pollak, @dpp, February 8, 2011)

and got a bunch of replies from people pointing me to java.util.Collections.unmodifiableList, which is the worst of all possible worlds.

Let's start with an immutable data structure in Java that we're all familiar with: String. When you have a reference to a String, you can call any method on that String without synchronizing. You can pass the String to other threads and retain your original reference (no need to make a defensive copy). Strings work so well in Java that many APIs take Strings as parameters and return Strings rather than more complex data structures. For example, most web frameworks are String-based: you emit Strings and hope that they result in valid HTML. Compare this with Lift, which is DOM-based and always has a well-formed DOM that it transforms.
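As a small illustration of the point about String, here is a sketch in Java. The class and method names are made up for this example; the only library calls are standard String and Thread methods.

```java
// Hypothetical demo class; only java.lang APIs are used.
final class StringSharingDemo {

    // "Modifying" a String always yields a new String; the argument is untouched.
    public static String shout(String s) {
        return s.toUpperCase() + "!";
    }

    public static void main(String[] args) throws InterruptedException {
        final String greeting = "hello";

        // Hand the same reference to another thread: no lock, no defensive copy.
        Thread worker = new Thread(() -> System.out.println(shout(greeting)));
        worker.start();
        worker.join();

        // The original reference still sees exactly what it saw before.
        System.out.println(greeting);
    }
}
```

Nothing the worker thread does can change what `greeting` refers to or what it contains, which is why no synchronization appears anywhere in the sketch.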

I've been doing Scala coding for almost 4 1/2 years and I've really come to appreciate Scala's "default to immutability." Basically, the collections you get by default in Scala are immutable (mutable ones exist, but you have to ask for them). It's super easy to define immutable classes:

case class Person(name: String, age: Int)



And create them:

val p = Person("David", 46) // Person(David, 46)



And give them a birthday:

val p1 = p.copy(age = 47) // Person(David, 47)



If I've got a reference to the object originally assigned to p, I can access the fields and do whatever else I want with that instance without worrying about synchronization or anyone else changing the instance out from under me.
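For comparison, here is roughly what the same Person looks like when written by hand in Java. This is only a sketch: the `withAge` method is a name I picked to stand in for Scala's `copy(age = ...)`, and the `equals` and `hashCode` methods that a case class generates for free are left out to keep it short.

```java
// Hypothetical Java counterpart of the Scala case class above.
final class Person {
    private final String name; // final fields: set once in the constructor
    private final int age;

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String name() { return name; }
    public int age() { return age; }

    // Stands in for Scala's copy(age = ...): returns a new instance,
    // leaving this one exactly as it was.
    public Person withAge(int newAge) {
        return new Person(name, newAge);
    }

    @Override
    public String toString() {
        return "Person(" + name + ", " + age + ")";
    }

    public static void main(String[] args) {
        Person p = new Person("David", 46);
        Person p1 = p.withAge(47);
        System.out.println(p);
        System.out.println(p1);
    }
}
```

It can be done, but every class needs this boilerplate, and nothing in the language nudges you toward writing it.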

Scala also has immutable collections so that I can create a List:

val x = List(1,2,3)



And I can access that reference from any thread without synchronizing. This is very powerful because I can pass "the state at a particular time" to another thread or keep it around on my current thread without worry. Without making a defensive copy.

There are no built-in Java collections that allow me to do this. Java does have java.util.Collections.unmodifiableList. This is the worst of all possible worlds. Let me illustrate:

scala> import java.util.Collections
import java.util.Collections

scala> import java.util.concurrent.CopyOnWriteArrayList
import java.util.concurrent.CopyOnWriteArrayList

scala> val l2 = new CopyOnWriteArrayList[String]()
l2: java.util.concurrent.CopyOnWriteArrayList[String] = []

scala> l2.add("Hello")
res10: Boolean = true

scala> l2.add("Hello2")
res11: Boolean = true

scala> l2
res12: java.util.concurrent.CopyOnWriteArrayList[String] = [Hello, Hello2]

scala> val l3 = Collections.unmodifiableList(l2)
l3: java.util.List[String] = [Hello, Hello2]

scala> l2.add("THree")
res13: Boolean = true

scala> l2
res14: java.util.concurrent.CopyOnWriteArrayList[String] = [Hello, Hello2, THree]

scala> l3
res15: java.util.List[String] = [Hello, Hello2, THree]

The receiver (in this case, just the variable l3) cannot mutate the collection, but there are no guarantees that someone else cannot mutate it. There is no way to ensure that you have a copy that cannot be mutated... even if you make a defensive copy of the collection the moment you get it, there's no guarantee that some other thread isn't mutating it while you're making your copy. This makes multi-threaded code very difficult to write.
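Seen from plain Java, the two halves of that problem look like this. The sketch below uses only standard java.util calls; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; all library calls are standard java.util APIs.
final class UnmodifiableViewDemo {

    // Tries to write through the reference the receiver was handed.
    // Returns false when the list rejects the write.
    public static boolean receiverCanAdd(List<String> received) {
        try {
            received.add("from receiver");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> owned = new ArrayList<>();
        owned.add("Hello");

        // A read-only view over the same backing list, not a copy of it.
        List<String> view = Collections.unmodifiableList(owned);

        // Half one: the receiver cannot mutate through the view.
        System.out.println("receiver can add: " + receiverCanAdd(view));

        // Half two: whoever holds the backing list still can,
        // and the change shows up in the "unmodifiable" view.
        int before = view.size();
        owned.add("Hello2");
        System.out.println("view size: " + before + " -> " + view.size());
    }
}
```

The wrapper protects the owner from the receiver. It does nothing to protect the receiver from the owner.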

Let's see the difference between Scala code with immutable collections:



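import net.liftweb.actor.LiftActor
import net.liftweb.http.ListenerManager

object ChatServer extends LiftActor with ListenerManager {
  // the messages, held in an immutable Vector
  private var msgs = Vector("Welcome")

  // what we send to listeners on update; safe to share as-is
  def createUpdate = msgs

  override def lowPriority = {
    // build a new Vector with the message appended, then notify the listeners
    case s: String => msgs :+= s; updateListeners()
  }
}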

And Java code that has mutable collections:



import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
// (Lift imports for LiftActorJWithListenerManager and @Receive omitted)

/**
 * The chat server
 */
public class ChatServer extends LiftActorJWithListenerManager {
    // private state. The messages
    private ArrayList<String> msgs = new ArrayList<String>(); // the private data

    private ChatServer() {
        msgs.add("Welcome");
    }

    /**
     * A String is sent as a message to this Actor. Process it
     */
    @Receive
    private void gotAString(String s) {
        // add to the messages. No need to synchronize because
        // the method will only be invoked by the Actor thread and
        // only one message will be processed at once
        msgs.add(s);

        // cap the length of our messages at 20 messages
        if (msgs.size() > 20) {
            msgs = new ArrayList<String>(msgs.subList(msgs.size() - 20, msgs.size()));
        }

        // update the listeners
        updateListeners();
    }

    // what we send to listeners on update
    public Object createUpdate() {
        // make a copy of the list and send it to the listeners as unmodifiable
        // Note, if we used Scala's immutable collections or even Functional
        // Java's immutable List, we would not have to make a defensive copy
        // of the messages
        return Collections.unmodifiableList((List<String>) msgs.clone());
    }
}



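Note that when we send the "current state" message to our listeners, we have to clone the collection and mark it unmodifiable. The clone costs time and memory proportional to the number of messages on every update, whereas the Scala version just hands out a reference to its immutable Vector.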
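To make the copy-then-wrap step easy to try outside of Lift, here is a stripped-down sketch of just the message-holding part. MessageLog, its method names, and the MAX_MESSAGES constant are all invented for this example, and the cap is enforced by dropping the oldest entry rather than by rebuilding the list from subList as the actor does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the message-holding part of ChatServer, with no Lift dependency.
final class MessageLog {
    private static final int MAX_MESSAGES = 20;
    private final List<String> msgs = new ArrayList<>();

    // Not synchronized: meant to be called from one thread only, as in the actor above.
    public void add(String msg) {
        msgs.add(msg);
        if (msgs.size() > MAX_MESSAGES) {
            msgs.remove(0); // drop the oldest message
        }
    }

    // Copy first (work proportional to the number of messages),
    // then wrap the copy so receivers cannot write to it.
    public List<String> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(msgs));
    }

    public static void main(String[] args) {
        MessageLog log = new MessageLog();
        log.add("Welcome");

        List<String> sent = log.snapshot();
        log.add("Hi");

        System.out.println(sent);           // the earlier snapshot is unaffected
        System.out.println(log.snapshot()); // a fresh snapshot sees both messages
    }
}
```

Because nobody else holds a reference to the copy, wrapping it really does give the receiver a list that will not change. The price is a full copy on every call to snapshot().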

When you think about code that is thread-safe, immutable data structures are the best mechanism for writing it. Clojure has the best immutable abstractions and a tremendous set of immutable data structures. Scala has great immutable data structures and reasonable library-based support for multi-threaded coding. Java has none of this. Further, every person who thinks java.util.Collections.unmodifiableList gives you immutability should not be writing multi-threaded code.