We’ve been doing a lot of Hibernate work at Terracotta recently and that naturally includes a fair amount of performance testing. In Hibernate, you can grab those stats using the Statistics object for a SessionFactory via getStatistics(). There are all sorts of tasty morsels inside this class to get factory-wide counts and per-entity/query/cache/etc stats. Cool stuff.

We noticed however while doing some perf testing that the code inside the StatisticsImpl is a bit problematic from a concurrency point of view. The basic gist of the class is just a set of stat counters. I can simplify the pattern to this without much loss of detail:

public class StatisticsImpl implements Statistics, StatisticsImplementor { private long queryExecutionCount; public synchronized void queryExecuted(String hql, int rows, long time) { queryExecutionCount++; // ... bunch of other stat collection } public long getQueryExecutionCount() { return queryExecutionCount; } public synchronized void clear() { queryExecutionCount = 0; // ... clear all other stats } }

That’s basically what’s in this class but it’s repeated a couple dozen times for other counters and there are Maps used for per-item counts of a few kinds of items. There are several problems here from a concurrency point of view:

Coarse-grained lock – the entire class shares a single lock via synchronized methods. That means that every counter in every thread is contending for that same lock. The impact of this is that when you turn on statistics collection in Hibernate, you immediately introduce a big old stop-the-world contention point across all threads. This will have an impact (possibly a very significant one) on the very stats you are trying to collect. It’s entirely possible that the scale of your application or other bottlenecks mean that your application is not actually seeing this as a bottleneck, but it should scare you at least a little.

At the very least here, we could be used fine-grained locking to avoid creating contention between different kinds of statistics. There are also collections in here that collect stats on a per-query, per-entity, per-collection, etc basis. Those could actually have fine-grained locks per-entity as well (but they don’t).

Dirty reads – you’ll notice that while writes to the counters are synchronized, the reads are not. Presumably this was done for performance. Unfortunately, it’s also quite broken from a memory model point of view. These reads are not guaranteed to see writes from any other thread, so the values you’re seeing are possibly stale. In fact, it’s possible that you are never seeing any of the counter updates. In practice, the synchronization on puts is probably causing the local caches to get flushed and on the hardware we’re running, you do seem to see values that are in the ballpark at least. But the Java memory model makes no guarantee that this will work on all architectures or at any point in the future. Race condition on clear() – the common way that the stats are used is with some gui or other monitor sitting in a loop and periodically reading some (or all) of the stats, then calling clear(). Because time passes between the read of the first stat and the clear, you will lose all updates to the stats during the course of the reads.

You may be willing to neglect a few lost updates, but consider that in many cases the monitor thread may iterate through every entity, collection, and query updated since the last loop (potentially hundreds of reads). In the cases where per-item stats are looked up, the gets() are actually synchronized as well when finding the stat in a Map. Those gets are synchronized against all other puts happening in the system. So the scope of that “read all stats” part of the monitor code may actually be quite large and you will lose all updates made between the beginning of that and the clear(), which distorts the next set of stats to an unknown degree (more activity == more distortion).

[UPDATE] Dirty long read – As Peter mentioned in the comments, the values being read here are longs and since longs and doubles are 64-bit values, dirty reads of them are not guaranteed to see atomic writes from other threads. So you could see different 32-bit chunks that were not written together. Reads/writes of shared doubles and longs should always be done with synchronized or volatile to address this issue.

I certainly understand that stats are seen as best-effort and that the existing Hibernate code supports pre-1.5 Java and does not have access to Java 5 concurrency utilities like Atomic classes and concurrent collections, but even so I think there are things that could be done here to make stats collection a lot more accurate and less invasive. Things like fine-grained or more concurrent locking (with volatile, AtomicInteger, or ReentrantReadWriteLock) would go a long way towards fixing visibility while increasing concurrency.

In our Terracotta integration, I suspect we will be using some byte-code instrumentation to clean up these classes (we already assume Java 1.5+ so that is a constraint that we are happy to break). As it is, we don’t currently trust the stats in the first place and second we are actually seeing the single lock showing up as a hot spot in performance tests.

I hope that the next version of Hibernate (which I think is dropping pre-1.5 support?) can make some improvements as well.