Over a million developers have joined DZone.

Locking and Logging

Bitbucket is for the code that takes us to Mars, decodes the human genome, or drives your next car. What will your code do? Get started with Bitbucket today, it's free.

Plumbr has been known as the tool to tackle memory leaks. As little as two months ago we released GC optimization features. But we have not been sitting idle after this – for months we have been working on lock contention detection.

From the test runs we have discovered many awkward concurrency issues in hundreds of different applications. Many of those issues are unique to the application at hand, but one particular type of issues stands out.

What we found out was that almost every Java application out there is using either Log4j or Logback. As a matter of fact, from the data we had available, it appears to be that more than 90% of the applications are using either of those frameworks for logging. But this is not the interesting part. Interesting is the fact that about third of those applications are facing rather significant lock wait times during logging calls. As it stands, more than 10% of the Java applications seem to halt for more than 5,000 milliseconds every once in a while during the innocent-looking log.debug() call.

Why so? Default choice of an appender for any server environment is some sort of File appender, such as RollingFileAppender for example. What is important is the fact that these appenders are synchronized. This is an easy way to guarantee that the sequence of log entries from different threads is preserved.

To demonstrate the side effects for this approach, we setup a simple JMH test (MyBenchmark) which is doing nothing besides calling log.debug(). This benchmark was ran on a quad-core MacBook Pro with 1,2 and 50 threads. 50 threads was chosen to simulate a typical setup for a servlet application with 50 HTTP worker threads.

public class LogBenchmark {

  static final Logger log = LoggerFactory.getLogger(LogBenchmark.class);

  AtomicLong counter;

  public void testMethod() {

  public void setup() {
    counter = new AtomicLong(0);

  public void printState() {
    System.out.println("Expected number of logging lines in debug.log: " + counter.get());

From the test results we see a dramatic decrease in throughput 278,898 ops/s -> 84,630 ops/s -> 73,789 ops/s we can see that going from 1 to 2 threads throughput of the system decreases 3.3x.

So how can you avoid this kind of locking issues?  The solution is simple – for more than a decade there has been an appender called AsyncAppender present in logging frameworks. The idea behind this appender is to store the log message in the queue and return the flow back to the application. In such a way the framework can deal with storing the log message asynchronously in a separate thread.

Let’s see how AsyncAppender can cope with multithreaded application. We set up similar simple benchmark but configure the logger for that class to use AsyncAppender. Now, when we run the benchmark with the same 1, 2 and 50 threads we get stunning results: 4,941,874 ops/s -> 6,608,732 ops/s -> 5,517,848 ops/s. The improvement in throughput is so good, that it raises suspicion that there’s something fishy going on.

Let’s look at the documentation of the AsyncAppender. It says AsyncAppender is by default a lossy logger, meaning that when the logger queue will get full, appender will start dropping trace, debug and info level messages, so that warnings and errors would surely get written. This behavior is configured using 2 parameters – discardingThreshold and queueSize. First specifies how full should be the queue when messages will start to be dropped, second obviously specifies how big is the queue.

The default queue size set to 256 can for example configured to 0 disabling discarding altogether so that the appender becomes blocking when the queue will get full. To better understand the results, let’s count the expected number of messages in the log file (as the number of benchmark invocations by the JMH is non-deterministic) and then compare how many were actually written to see how many messages are actually discarded to get such brilliant throughput.

We run the benchmark with 50 threads, varied the queue size and turned discarding on and off. The results are as follows:

Queue size Discard No discard
Ops/s Expected msg Actual msg Lost msg Ops/s
256 4,180,312 184,248,790 1,925,829 98.95% 118340
4,096 4,104,997 182,404,138 694,902 99.62% 113534
65,536 3,558,543 157,385,651 1,762,404 98.88% 137583
1,048,576 3,213,489 141,409,403 1,560,612 98.90% 117820
2,000,000 3,306,476 141,454,871 1,527,133 98.92% 108603

What can we conclude from them? There’s no free lunches and no magic. Either we discard 98% of the log messages to get such massive throughput gain, or when the queue fills, we start blocking and fall back to performance comparable to synchronous appender. Interestingly the queue size doesn’t affect much., In case you can sacrifice the debug logs, using AsyncAppender does make sense.

In case you enjoyed the content, we recommend to subscribe to either our RSS or Twitter feeds. We keep publishing interesting content around Java performance tuning domain. Or, if you are eager to get your hands on to the lock contention detection functionality – let us know and we will add you to the private beta program.

Bitbucket is the Git solution for professional teams who code with a purpose, not just as a hobby. Get started today, it's free.


Published at DZone with permission of Nikita Salnikov-tarnovski, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}