Microbenchmarking with JMH: Measure, Don't Guess!
I’m sure you’ve all heard that assigning a variable to null helps the garbage collector, or that not declaring a method final improves inlining. But you also know that JVMs have evolved drastically, and what was true yesterday may not be true today. So how do we know that our code performs? Well, we don’t, because we are not supposed to guess what the JVM does. We just measure!
Measure, Don’t Guess!
As my friend Kirk Pepperdine once said, “Measure, don’t guess”. We’ve all faced performance problems in our projects and been asked to tune random bits of our source code, hoping that performance would improve. Instead, we should set up a stable performance environment (operating system, JVM, application server, database…), measure continuously, set some performance goals, and take action when those goals are not met. Continuous delivery and continuous testing are one thing, but continuous measuring is another step.
Anyway, performance is a dark art and it’s not the goal of this blog post. No, I just want to focus on microbenchmarking and show you how to use JMH on a real use case: logging.
Microbenchmarking Logging
I’m sure that, like me, you’ve spent the last decades moving from one logging framework to another, and you’ve seen different ways to write debug logs:
    logger.debug("concatenating strings " + x + y + z);

    logger.debug("using variable arguments {} {} {}", x, y, z);

    if (logger.isDebugEnabled())
        logger.debug("using the if debug enabled {} {} {}", x, y, z);
These are all debug messages, and we usually don’t care about them because in production we run at the INFO or WARNING level. But debug logs can have an impact on performance, even at the WARNING level. To prove it, we can use the Java Microbenchmark Harness (JMH) to write a quick microbenchmark and measure the performance of the three logging mechanisms: concatenating strings, using variable arguments, and using the isDebugEnabled check.
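Why would a disabled debug log cost anything at all? The sketch below illustrates the reasoning with a hypothetical stand-in logger (StubLogger is made up for this example and is not the SLF4J API): it counts how many times the message arguments are rendered to text while the debug level is off.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration only: a stand-in logger with DEBUG disabled, not the SLF4J API. */
final class LoggingCostSketch {

    /** Wraps a value and counts how often it is rendered to text. */
    static final class Counted {
        static final AtomicInteger renders = new AtomicInteger();
        private final String value;

        Counted(String value) { this.value = value; }

        @Override
        public String toString() {
            renders.incrementAndGet();
            return value;
        }
    }

    /** Hypothetical minimal logger: both overloads do nothing when debug is off. */
    static final class StubLogger {
        private final boolean debugEnabled;

        StubLogger(boolean debugEnabled) { this.debugEnabled = debugEnabled; }

        boolean isDebugEnabled() { return debugEnabled; }

        void debug(String message) {
            if (debugEnabled) System.out.println(message);
        }

        void debug(String format, Object... args) {
            if (!debugEnabled) return;           // arguments are never rendered
            StringBuilder out = new StringBuilder();
            int from = 0;
            for (Object arg : args) {            // replace each "{}" in order
                int at = format.indexOf("{}", from);
                if (at < 0) break;
                out.append(format, from, at).append(arg);
                from = at + 2;
            }
            System.out.println(out.append(format.substring(from)));
        }
    }

    /** Returns the number of toString() calls for: concatenation, varargs, guarded. */
    static int[] renderCounts() {
        StubLogger logger = new StubLogger(false);
        Counted x = new Counted("x"), y = new Counted("y"), z = new Counted("z");
        int[] counts = new int[3];

        Counted.renders.set(0);
        logger.debug("concatenating strings " + x + y + z);   // built before the call
        counts[0] = Counted.renders.get();

        Counted.renders.set(0);
        logger.debug("using variable arguments {} {} {}", x, y, z);
        counts[1] = Counted.renders.get();

        Counted.renders.set(0);
        if (logger.isDebugEnabled())
            logger.debug("using the if debug enabled {} {} {}", x, y, z);
        counts[2] = Counted.renders.get();

        return counts;
    }

    public static void main(String[] args) {
        int[] c = renderCounts();
        System.out.println("concatenation=" + c[0] + " varargs=" + c[1] + " guarded=" + c[2]);
    }
}
```

With concatenation, the three arguments are rendered and joined before the logger is even called, so the work is done and then thrown away. The other two styles skip the rendering. Note that the variable-arguments call still has to create an array for its arguments, which is one plausible reason for it to be slightly slower than the guarded call.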
Setting Up JMH
JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM. It’s really easy to set up and, thanks to the Maven archetype, we can quickly get a JMH project skeleton and get going. To do that, execute the following Maven command:
    $ mvn archetype:generate -DinteractiveMode=false -DarchetypeGroupId=org.openjdk.jmh \
          -DarchetypeArtifactId=jmh-java-benchmark-archetype -DarchetypeVersion=1.4.1 \
          -DgroupId=org.agoncal.sample.jmh -DartifactId=logging -Dversion=1.0
This Maven archetype creates the following project structure:
- A pom.xml file with the JMH dependencies and a customized maven-shade-plugin to build an uber-jar
- A MyBenchmark class with an empty method annotated with @Benchmark
At this point we haven’t written any code yet, but the microbenchmark project is already up and running. Packaging the code with Maven creates an uber-jar called benchmarks.jar.
    $ mvn clean install
    $ java -jar target/benchmarks.jar
When we execute the uber-jar, we see some funny output in the console: JMH goes into a loop, warms up the JVM, executes the code inside the method annotated with @Benchmark (an empty method for now), and gives us the number of operations per second:
    # Run progress: 30,00% complete, ETA 00:04:41
    # Fork: 4 of 10
    # Warmup Iteration 1: 2207650172,188 ops/s
    # Warmup Iteration 2: 2171077515,143 ops/s
    # Warmup Iteration 3: 2147266359,269 ops/s
    # Warmup Iteration 4: 2193541731,837 ops/s
    # Warmup Iteration 5: 2195724915,070 ops/s
    # Warmup Iteration 6: 2191867717,675 ops/s
    # Warmup Iteration 7: 2143952349,129 ops/s
    # Warmup Iteration 8: 2187759638,895 ops/s
    # Warmup Iteration 9: 2171283214,772 ops/s
    # Warmup Iteration 10: 2194607294,634 ops/s
    # Warmup Iteration 11: 2195047447,488 ops/s
    # Warmup Iteration 12: 2191714465,557 ops/s
    # Warmup Iteration 13: 2229074852,390 ops/s
    # Warmup Iteration 14: 2221881356,361 ops/s
    # Warmup Iteration 15: 2240789717,480 ops/s
    # Warmup Iteration 16: 2236822727,970 ops/s
    # Warmup Iteration 17: 2228958137,977 ops/s
    # Warmup Iteration 18: 2242267603,165 ops/s
    # Warmup Iteration 19: 2216594798,060 ops/s
    # Warmup Iteration 20: 2243117972,224 ops/s
    Iteration 1: 2201097704,736 ops/s
    Iteration 2: 2224068972,437 ops/s
    Iteration 3: 2243832903,895 ops/s
    Iteration 4: 2246595941,792 ops/s
    Iteration 5: 2241703372,299 ops/s
    Iteration 6: 2243852186,017 ops/s
    Iteration 7: 2221541382,551 ops/s
    Iteration 8: 2196835756,509 ops/s
    Iteration 9: 2205740069,844 ops/s
    Iteration 10: 2207837588,402 ops/s
    Iteration 11: 2192906907,559 ops/s
    Iteration 12: 2239234959,368 ops/s
    Iteration 13: 2198998566,646 ops/s
    Iteration 14: 2201966804,597 ops/s
    Iteration 15: 2215531292,317 ops/s
    Iteration 16: 2155095714,297 ops/s
    Iteration 17: 2146037784,423 ops/s
    Iteration 18: 2139622262,798 ops/s
    Iteration 19: 2213499245,208 ops/s
    Iteration 20: 2191108429,343 ops/s
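The warm-up and measurement phases in the output above can be pictured with a hand-rolled loop. The sketch below is only an illustration of the idea, using nothing but the JDK. It is not how JMH is implemented, and the names WarmupLoopSketch, opsPerSecond and run are made up for this example.

```java
import java.util.function.IntSupplier;

/** Illustration of warm-up and measurement iterations; not JMH's implementation. */
final class WarmupLoopSketch {

    /** Results are written here so the JIT cannot discard the measured work entirely. */
    static volatile int sink;

    /**
     * Runs the task repeatedly for roughly durationNanos (must be positive)
     * and returns the observed operations per second.
     */
    static double opsPerSecond(IntSupplier task, long durationNanos) {
        long ops = 0;
        long start = System.nanoTime();
        long elapsed;
        do {
            sink = task.getAsInt();
            ops++;
            // Reading the clock on every operation adds overhead; fine for a sketch.
            elapsed = System.nanoTime() - start;
        } while (elapsed < durationNanos);
        return ops * 1_000_000_000.0 / elapsed;
    }

    /** Runs warm-up iterations (results discarded), then returns the measured ones. */
    static double[] run(IntSupplier task, int warmups, int iterations, long durationNanos) {
        for (int i = 1; i <= warmups; i++) {
            double warm = opsPerSecond(task, durationNanos);
            System.out.printf("# Warmup Iteration %d: %.3f ops/s%n", i, warm);
        }
        double[] results = new double[iterations];
        for (int i = 0; i < iterations; i++) {
            results[i] = opsPerSecond(task, durationNanos);
            System.out.printf("Iteration %d: %.3f ops/s%n", i + 1, results[i]);
        }
        return results;
    }

    public static void main(String[] args) {
        // Three warm-up and three measured iterations of 100 ms each.
        run(() -> Integer.toString(42).length(), 3, 3, 100_000_000L);
    }
}
```

A naive loop like this is easy to get wrong, which is exactly why a harness is worth using: JMH runs the benchmark in forked JVMs (the "Fork: 4 of 10" line above) and takes care of many pitfalls that this sketch ignores.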
Adding SLF4J to the Benchmark
Remember that the use case is to microbenchmark logging. In the generated project I use SLF4J with Logback, so I need to add those dependencies to the pom.xml:
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>1.7.7</version>
    </dependency>
    <dependency>
        <groupId>ch.qos.logback</groupId>
        <artifactId>logback-classic</artifactId>
        <version>1.0.11</version>
    </dependency>
Then I add a logback.xml file that outputs only INFO logs (so I’m sure that the DEBUG-level traces are not logged):
    <configuration>
        <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
            <encoder>
                <pattern>%highlight(%d{HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n)</pattern>
            </encoder>
        </appender>
        <appender name="stdout" class="ch.qos.logback.core.ConsoleAppender">
            <encoder><pattern>%msg%n</pattern></encoder>
        </appender>
        <root level="info">
            <appender-ref ref="console" />
        </root>
    </configuration>
The good thing about the maven-shade-plugin is that when I package the application, all the dependencies, configuration files, and so on get flattened into the uber-jar target/benchmarks.jar.
Using String Concatenation in the Logs
Let’s write the first microbenchmark: logging with string concatenation. The idea is to take the MyBenchmark class, add the needed code to the method annotated with @Benchmark, and let JMH do the rest. So we add a logger, create a few strings (x, y, z), loop, and log a debug message with string concatenation. It looks like this:
    import org.openjdk.jmh.annotations.Benchmark;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class MyBenchmark {

        private static final Logger logger = LoggerFactory.getLogger(MyBenchmark.class);

        @Benchmark
        public void testConcatenatingStrings() {
            String x = "", y = "", z = "";
            for (int i = 0; i < 100; i++) {
                x += i;
                y += i;
                z += i;
                // The message is concatenated before debug() is called,
                // even though the DEBUG level is disabled.
                logger.debug("concatenating strings " + x + y + z);
            }
        }
    }
To execute this microbenchmark, we package and run it as before, and we see the iteration output:
    $ mvn clean install
    $ java -jar target/benchmarks.jar
Using Variable Arguments in the Logs
The second microbenchmark uses variable arguments in the logs instead of string concatenation. Just change the code, repackage, and execute it.
    @Benchmark
    public void testVariableArguments() {
        String x = "", y = "", z = "";
        for (int i = 0; i < 100; i++) {
            x += i;
            y += i;
            z += i;
            // The {} placeholders are only filled in if DEBUG is enabled.
            logger.debug("variable arguments {} {} {}", x, y, z);
        }
    }
Using an if Statement in the Logs
Last but not least, the good old isDebugEnabled() check that is “supposed to optimize things”.
    @Benchmark
    public void testIfDebugEnabled() {
        String x = "", y = "", z = "";
        for (int i = 0; i < 100; i++) {
            x += i;
            y += i;
            z += i;
            // The guard skips the debug() call entirely when DEBUG is disabled.
            if (logger.isDebugEnabled())
                logger.debug("if debug enabled {} {} {}", x, y, z);
        }
    }
Results of the Microbenchmarks
After running the three microbenchmarks we get what we expected (remember: don’t guess, measure). The more operations per second, the better. If we look at the last line of the following table, we see that the best performance comes from isDebugEnabled and the worst from string concatenation. Variable arguments without isDebugEnabled are not bad either, and we gain in readability (less boilerplate code). So I’ll go with variable arguments!
Iteration | String concatenation | Variable arguments | if isDebugEnabled |
1 | 57108,635 ops/s | 97921,939 ops/s | 104993,368 ops/s |
2 | 58441,293 ops/s | 98036,051 ops/s | 104839,216 ops/s |
3 | 58231,243 ops/s | 97457,222 ops/s | 106601,803 ops/s |
4 | 58538,842 ops/s | 100861,562 ops/s | 104643,717 ops/s |
5 | 57297,787 ops/s | 100405,656 ops/s | 104706,503 ops/s |
6 | 57838,298 ops/s | 98912,545 ops/s | 105439,939 ops/s |
7 | 56645,371 ops/s | 100543,188 ops/s | 102893,089 ops/s |
8 | 56569,236 ops/s | 102239,005 ops/s | 104730,682 ops/s |
9 | 57349,754 ops/s | 94482,508 ops/s | 103492,227 ops/s |
10 | 56894,075 ops/s | 101405,938 ops/s | 106790,525 ops/s |
Average | 57491,4534 ops/s | 99226,5614 ops/s | 104913,1069 ops/s |
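To put the averages in perspective, the arithmetic behind the last line of the table can be sketched in a few lines of Java. The numbers are copied from the table (with a dot as the decimal separator), and the class name ResultSummarySketch is made up for this example.

```java
import java.util.Arrays;

/** Recomputes the averages of the results table and compares the three approaches. */
final class ResultSummarySketch {

    // Throughput in ops/s for iterations 1 to 10, copied from the table.
    static final double[] CONCATENATION = {
        57108.635, 58441.293, 58231.243, 58538.842, 57297.787,
        57838.298, 56645.371, 56569.236, 57349.754, 56894.075
    };
    static final double[] VARIABLE_ARGUMENTS = {
        97921.939, 98036.051, 97457.222, 100861.562, 100405.656,
        98912.545, 100543.188, 102239.005, 94482.508, 101405.938
    };
    static final double[] IF_DEBUG_ENABLED = {
        104993.368, 104839.216, 106601.803, 104643.717, 104706.503,
        105439.939, 102893.089, 104730.682, 103492.227, 106790.525
    };

    /** Arithmetic mean of the given values (NaN for an empty array). */
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        double concat = mean(CONCATENATION);
        double varargs = mean(VARIABLE_ARGUMENTS);
        double guarded = mean(IF_DEBUG_ENABLED);

        System.out.printf("String concatenation: %.4f ops/s%n", concat);
        System.out.printf("Variable arguments:   %.4f ops/s%n", varargs);
        System.out.printf("if isDebugEnabled:    %.4f ops/s%n", guarded);

        // Relative throughput, higher is better.
        System.out.printf("Variable arguments vs concatenation: %.2fx%n", varargs / concat);
        System.out.printf("isDebugEnabled vs variable arguments: %.2fx%n", guarded / varargs);
    }
}
```

In these runs, variable arguments reach roughly 1.7 times the throughput of string concatenation, while the isDebugEnabled guard adds only about 6% on top of that.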
Conclusion
In the last decades JVMs have evolved drastically. Design patterns that would have optimized our code ten years ago are no longer accurate. The only way to be sure that one piece of code is better than another is to measure it. JMH is the perfect tool to easily and quickly microbenchmark pieces of code or, as in this post, an external framework (logging, utility classes, date manipulation, Apache Commons…). Of course, reasoning about a small section of code is only one step, because we usually need to analyze the overall application performance. Thanks to JMH, this first step is easy to take.
And remember to check the JMH examples; they are full of interesting ideas.
Published at DZone with permission of Antonio Goncalves, DZone MVB. See the original article here.