To blog Previous post | Next post

Locking and Logging

Plumbr is mostly known as the tool to tackle memory leaks. Just two months ago we released GC optimization features. However, we have not been sitting idle after that – already for months we have also been working on lock contention detection.

From our test runs we have discovered many awkward concurrency issues in hundreds of different applications. Many of those issues are unique to the applications at hand, but one particular type of issues stands out.

What we found out was that almost every Java application out there is using either Log4j or Logback. As a matter of fact, from the data we had available, it appears to be that more than 90% of the applications are using either of these frameworks for logging. But this is not the interesting part. Interesting is the fact that about a third of these applications experience rather significant lock wait times during logging calls. As it stands, more than 10% of Java applications seem to halt for more than 5,000 milliseconds every once in a while during the innocent-looking log.debug() call.

Why so? The default choice of an appender for any server environment is some sort of File appender, such as RollingFileAppender. What is important is the fact that these appenders are synchronized. This is an easy way to guarantee that the sequence of log entries from different threads is being preserved.

To demonstrate the side effects for this approach, we set up a simple JMH test (MyBenchmark) which does nothing more than calls log.debug(). This benchmark was ran on a quad-core MacBook Pro with 1, 2, and 50 threads. 50 threads was chosen to simulate a typical setup for a servlet application with 50 HTTP worker threads.

@State(Scope.Benchmark) public class LogBenchmark { static final Logger log = LoggerFactory.getLogger(LogBenchmark.class); AtomicLong counter; @Benchmark public void testMethod() { log.debug(String.valueOf(counter.incrementAndGet())); } @Setup public void setup() { counter = new AtomicLong(0); } @TearDown public void printState() { System.out.println("Expected number of logging lines in debug.log: " + counter.get()); } }

From the test results we see a dramatic decrease in throughput 278,898 ops/s -> 84,630 ops/s -> 73,789 ops/s we can see that going from 1 to 2 threads throughput of the system decreases 3.3x.

So how can you avoid this kind of locking issues? The solution is simple – for more than a decade there has been an appender called AsyncAppender present in logging frameworks. The idea behind this appender is to store the log message in the queue and return the flow back to the application. In such a way the framework can deal with storing the log message asynchronously in a separate thread.

Did you know that 16% of Java applications face degraded user experience due to lock contention? Don’t blame the locks – detect them with Plumbr instead.

Let’s see how AsyncAppender can cope with multithreaded applications. We set up a similar simple benchmark, but configured the logger for that class to use AsyncAppender. When we ran the benchmark with the same 1, 2 and 50 threads we got stunning results: 4,941,874 ops/s -> 6,608,732 ops/s -> 5,517,848 ops/s. The improvement in throughput was so good that we started to suspect that there was something fishy going on.

Let’s look at the documentation of the AsyncAppender. It says that AsyncAppender is by default a lossy logger, meaning that when the logger queue gets full, the appender will start dropping trace, debug and info level messages, so that warnings and errors would surely be written. This behavior can be configured using 2 parameters – discardingThreshold and queueSize. The first specifies how full should the queue be when it begins to be drop messages, and the second one obviously specifies how big the queue itself is.

The default queue size set to 256 can for example be configured to 0 disabling discarding altogether so that the appender would start blocking when the queue gets full. To better understand the results, let’s count the expected number of messages in the log file (as the number of benchmark invocations by the JMH is non-deterministic) and then compare how many were actually written to see how many messages are actually discarded to get such brilliant throughput.

We ran the benchmark with 50 threads, varied the queue size and turned discarding on and off. The results were as follows:

Queue size Discard No discard Ops/s Expected msg Actual msg Lost msg Ops/s 256 4,180,312 184,248,790 1,925,829 98.95% 118340 4,096 4,104,997 182,404,138 694,902 99.62% 113534 65,536 3,558,543 157,385,651 1,762,404 98.88% 137583 1,048,576 3,213,489 141,409,403 1,560,612 98.90% 117820 2,000,000 3,306,476 141,454,871 1,527,133 98.92% 108603

What can we conclude from this? There are no free lunches and there is no magic. Either we discard 98% of the log messages to get such massive throughput gain, or when the queue fills, we start blocking and fall back to performance comparable to synchronous appender. Interestingly the queue size does not affect the results much. In case you can sacrifice debug logs, using AsyncAppender makes a lot of sense.

In case you enjoyed the content, we recommend to subscribe to either our RSS or Twitter feeds. We keep publishing interesting content around Java performance tuning domain. Or, if you are eager to get your hands on to the lock contention detection functionality – let us know and we will add you to the private beta program.