JMH Performance Testing InfinityDB
The Java Microbenchmark Harness (JMH) from OpenJDK is a widely used, precise tool for testing performance-critical code. It is simple and fair. Let's write a test!
We will test InfinityDB, which is a performance-oriented extended persistent ConcurrentNavigableMap used for example in database caching, time-series data capture, and text indexes. It is the DBMS of the Atlassian Fisheye repository browser.
In this article we will show:
How to write simple but fair performance benchmarks for critical code
Code for the tests for InfinityDB
Results of the InfinityDB test
Simple Code and Convenience
JMH simplifies test code greatly by observing annotations on the test class, state variables, and test methods. It provides parameterization; per-trial, per-benchmark, and per-thread state setup; forking; overhead minimization; warmup iterations; thread control; and data gathering. It is a simple Maven-driven tool, requiring only a few commands. To start, install the jar containing the code to be tested into your local Maven repository, ~/.m2/repository:
mvn install:install-file -DgroupId=com.infinitydb -DartifactId=infinitydb \
    -Dversion=4.1.0 -Dpackaging=jar -Dfile=../../infinitydb.jar
Then compile and prepare:
mvn clean install
Finally, run the tests, for example using one fork, two measurement iterations per trial, five warmup iterations per trial, and eight threads:
java -Xmx4g -jar target/benchmarks.jar -f 1 -i 2 -wi 5 -t 8
At the end of this command, you can include a regex that selects a subset of the test or ‘benchmark’ methods to run by method name. Creative method naming can make this very convenient.
Note that setting up a JMH project from scratch requires an additional mvn command, shown on the OpenJDK JMH page, but the InfinityDB download has already done this for you.
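To make the regex selection mentioned above concrete, here is a small sketch in plain Java that filters the benchmark names from this article by pattern. It does not call JMH; the class name BenchmarkSelector and the method select() are hypothetical, and the sketch assumes the pattern only has to match somewhere inside the benchmark name rather than the whole name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class BenchmarkSelector {
    // Benchmark names as they appear in this article's report.
    static final List<String> NAMES = Arrays.asList(
            "com.infinitydb.InfinityDBJMHTest.testForEach",
            "com.infinitydb.InfinityDBJMHTest.testGet",
            "com.infinitydb.InfinityDBJMHTest.testIterateEntrySet",
            "com.infinitydb.InfinityDBJMHTest.testIterateKeySet",
            "com.infinitydb.InfinityDBJMHTest.testIterateValues",
            "com.infinitydb.InfinityDBJMHTest.testPutAndRemove",
            "com.infinitydb.InfinityDBJMHTest.testPutAndRemoveItemSpace",
            "com.infinitydb.InfinityDBJMHTest.testStreams");

    // Assumption: a name is selected when the regex matches anywhere in it
    // (find() semantics), not only when it matches the whole name.
    static List<String> select(String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<>();
        for (String name : NAMES) {
            if (pattern.matcher(name).find()) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // A shared prefix selects a whole family of benchmarks.
        System.out.println(select("testIterate"));
        // Anchoring with $ excludes testPutAndRemoveItemSpace.
        System.out.println(select("testPutAndRemove$"));
    }
}
```

Under that assumption, giving related benchmarks a common prefix such as testIterate lets one short pattern run the whole group, which is the convenience that creative method naming buys.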
Annotation-Based Configuration
At the class level, annotations set the default global parameters of the run, and these can be overridden on the command line. Adding @State(Scope.Benchmark) to the class lets each report line, or "trial", be parameterized by a different Map size; the parameter variable here is static:
@Fork(1)
@Warmup(iterations = 2, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@State(Scope.Benchmark)
public class InfinityDBJMHTest {
…
}
A "benchmark" corresponds to a test method, which examines a particular aspect of performance. The test methods are all annotated with @Benchmark.
Next, we declare a static method annotated with @Setup(Level.Trial), which initializes the static variables before each trial. Annotating a static variable with @Param makes JMH repeat the trial once for each listed value, giving multiple trials per benchmark. We use this to span a range of database sizes logarithmically, from 1 to 1,000,000, and we also include 0.
The Simple "ItemSpace" Data Model
Because InfinityDB also has a faster, more flexible lower-level "ItemSpace" API, we test that specifically in testPutAndRemoveItemSpace(), where it is able to avoid a retrieval of the previous value. The "ItemSpace" model is an extremely simple one: a database is solely an ordered set of "Items," each Item being a character array from 0 to 1665 characters long. Unlike a String, an Item is never constructed as an object; it contains a sequence of binary, self-delimiting, compressed, strongly typed primitive "components" in a standard format. Thus an Item corresponds to an encoded tuple of variable arity. Logically, an ItemSpace is a variable-arity ordered Tuple space. The ConcurrentNavigableMap wrapper exposes the Tuple space as nestable multi-Maps and Sets with composite keys and values. InfinityDB implements the ItemSpace using a B-Tree.
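The idea of an ordered, variable-arity tuple space can be sketched with standard java.util classes. This is only an analogy and not InfinityDB's API or implementation: the class TupleSpaceSketch is hypothetical, it stores boxed List<Long> tuples in a skip list rather than encoded Items in a B-Tree, and it handles only long components. Its insert() and deleteSubspace() methods are named after the two database calls used in the benchmark code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class TupleSpaceSketch {
    // Lexicographic order by component; a tuple sorts before any longer
    // tuple that it is a prefix of.
    static final Comparator<List<Long>> ORDER = (a, b) -> {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = Long.compare(a.get(i), b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    };

    // The whole "database" is one ordered set of tuples.
    final NavigableSet<List<Long>> items = new ConcurrentSkipListSet<>(ORDER);

    // Add one tuple, e.g. (key, value).
    void insert(Long... components) {
        items.add(new ArrayList<>(Arrays.asList(components)));
    }

    // Remove every tuple that starts with the given prefix, e.g. (key).
    // Tuples sharing a prefix are contiguous in the ordering, so the scan
    // can stop at the first tuple that does not start with the prefix.
    int deleteSubspace(Long... prefix) {
        List<Long> p = Arrays.asList(prefix);
        List<List<Long>> doomed = new ArrayList<>();
        for (List<Long> t : items.tailSet(p, true)) {
            if (t.size() < p.size() || !t.subList(0, p.size()).equals(p)) {
                break;
            }
            doomed.add(t);
        }
        items.removeAll(doomed);
        return doomed.size();
    }

    public static void main(String[] args) {
        TupleSpaceSketch space = new TupleSpaceSketch();
        space.insert(5L, 50L);   // like map.put(5, 50)
        space.insert(5L, 51L);   // a second value under the same key
        space.insert(6L, 60L);
        System.out.println(space.deleteSubspace(5L) + " tuples removed");
        System.out.println(space.items);
    }
}
```

Viewed this way, a Map entry is a two-component tuple, and removing a key is a prefix deletion, which is why the lower-level API does not need to fetch the old value first.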
The Code
The code here tests InfinityDB as a ConcurrentNavigableMap with a benchmark method for each of get(), put(), remove(), iterators, forEach(), and streams. The code and infinitydb.jar are in the InfinityDB trial download at https://boilerbay.com.
By declaring a static nested class LocalRandom annotated with @State(Scope.Thread), we avoid sharing a Random between threads; a shared Random is very slow in critical code and creates a bottleneck. JMH instantiates this class once per thread and passes it as a parameter to the benchmark methods.
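Outside JMH, the same "one Random per thread" idea can be sketched with a plain ThreadLocal. This is an illustration of the principle only, not of how JMH injects state objects; the class PerThreadRandomSketch and the method distinctGenerators() are hypothetical names.

```java
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class PerThreadRandomSketch {
    // Each thread lazily gets its own Random on first use.
    static final ThreadLocal<Random> LOCAL_RANDOM =
            ThreadLocal.withInitial(() -> new Random(System.nanoTime()));

    // Runs the given number of worker threads and returns how many
    // distinct Random instances they used between them.
    static int distinctGenerators(int threads) {
        // Random does not override equals(), so this set compares by identity.
        Set<Random> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                Random random = LOCAL_RANDOM.get();
                for (int n = 0; n < 1_000; n++) {
                    random.nextLong(); // no contention with other threads
                }
                seen.add(random);
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctGenerators(8) + " generators for 8 threads");
    }
}
```

Because no generator is shared, the threads never compete to update the same seed, which is the bottleneck the benchmark is trying to keep out of its measurements.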
We return a long value from each benchmark so that the computation has a "tangible" result and the methods are not optimized away. JMH implicitly consumes the value returned from a benchmark method through its Blackhole mechanism, so returning the result is enough to keep the computation alive. JMH generates wrapper code around the benchmark methods and compiles it, in order to minimize overhead and to do the instrumentation.
Below the code are the results.
//Copyright (C) 2018 Roger L. Deran, All rights reserved
// The package declaration must come before the imports.
package com.infinitydb;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOError;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.file.Files;
import java.util.Date;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Random;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;
import java.util.stream.LongStream;
import java.util.stream.Stream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

import com.infinitydb.map.db.InfinityDBMap;

/**
 * Test the performance of InfinityDB using JMH.
 */
// Class-level defaults, as shown in the Annotation-Based Configuration section.
@Fork(1)
@Warmup(iterations = 2, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@State(Scope.Benchmark)
public class InfinityDBJMHTest {
    // Maximum size in bytes. It increases as needed, then spills to disk.
    // For this in-cache testing, we make it big.
    static final long CACHE_SIZE = 100_000_000;

    // Different database sizes
    @Param({ "0", "1", "10", "100", "1000", "10000", "100000", "1000000" })
    static long dbSize;

    // The database itself. This presents an ItemSpace model
    static InfinityDB db;

    // Optional wrapper for the db that presents a ConcurrentNavigableMap model.
    static ConcurrentMap<Long, Long> map;

    static boolean isParallelStreams = System.getProperty("parallel") != null;

    @Setup(Level.Trial)
    static public void setup() throws IOException {
        File infinityDBFile =
                Files.createTempFile("InfinityDBJMHTest_", "").toFile();
        infinityDBFile.deleteOnExit();
        db = InfinityDB.create(infinityDBFile.toString(), true, CACHE_SIZE);
        // Wrap the ItemSpace model to access as a standard
        // ConcurrentNavigableMap.
        map = new InfinityDBMap(db);
        Random random = new Random(System.nanoTime());
        // Load up the Map
        for (long i = 0; i < dbSize; i++) {
            long v = random.nextLong();
            map.put(v, v);
        }
        // Show whether the "parallel" system property was set (-Dparallel=...)
        if (isParallelStreams)
            System.out.println("using parallel streams");
    }

    // Randomize between invocations.
    // A Thread-local Random is fast.
    @State(Scope.Thread)
    public static class LocalRandom {
        Random random = new Random(System.nanoTime());
    }

    /**
     * Modify the database, adding a key/value association and removing it. We
     * have to do the removes too, to keep the Map the same size.
     */
    @Benchmark
    public static long testPutAndRemove(LocalRandom localRandom) {
        long k = localRandom.random.nextLong();
        long v = localRandom.random.nextLong();
        map.put(k, v);
        map.remove(k);
        return k;
    }

    /**
     * Modify the database adding a key/value association and removing it.
     *
     * This uses the low-level 'ItemSpace' API that underlies the Map-Based API
     * for speed. We have to do the deletes too, to keep the Map the same size.
     * The Map-based API is slower because it has to retrieve the old value on
     * each iteration to return.
     */
    @Benchmark
    public static long testPutAndRemoveItemSpace(LocalRandom localRandom)
            throws IOException {
        long k = localRandom.random.nextLong();
        long v = localRandom.random.nextLong();
        // Allocate a temporary cursor from an internal pool.
        try (Cu cu = Cu.alloc()) {
            /*
             * Set the Cu cursor to contain the key and value. A Cu is a
             * sequence of 0 to 1665 chars, like a StringBuffer, but binary.
             * Each appended 'component' is a binary self-delimiting char
             * sequence. Nothing is constructed or GC'ed.
             */
            cu.clear().append(k).append(v);
            // This is the equivalent of map.put(k, v). The Map wraps the db.
            db.insert(cu);
            // Have the Cu contain just the key component
            cu.clear().append(k);
            // Remove all Items starting with k
            // This is the equivalent of map.remove(k);
            db.deleteSubspace(cu);
        }
        return k;
    }

    @Benchmark
    public static long testGet(LocalRandom localRandom) {
        long k = localRandom.random.nextLong();
        // v is almost always null.
        Long v = map.get(k);
        if (v != null)
            System.out.println("v != null");
        return v == null ? 0 : v.longValue();
    }

    @Benchmark
    public static long testIterateKeySet() {
        long sum = 0;
        for (Long k : map.keySet()) {
            sum += k.longValue();
        }
        return sum;
    }

    @Benchmark
    public static long testIterateEntrySet() {
        long sum = 0;
        for (Entry<Long, Long> e : map.entrySet()) {
            sum += e.getKey().longValue();
        }
        return sum;
    }

    @Benchmark
    public static long testIterateValues() {
        long sum = 0;
        for (Long v : map.values()) {
            sum += v.longValue();
        }
        return sum;
    }

    /**
     * Multiply the ops/sec by Map size to get iterations/sec.
     */
    @Benchmark
    public static long testForEach() {
        // the normal way
        class SummingBiConsumer implements BiConsumer<Long, Long> {
            long sum = 0;

            public void accept(Long k, Long v) {
                sum += v.longValue();
            }
        }
        SummingBiConsumer summingBiConsumer = new SummingBiConsumer();
        map.forEach(summingBiConsumer);
        return summingBiConsumer.sum;
    }

    @Benchmark
    public static long testStreams() {
        if (true) {
            Stream<Long> stream = map.keySet().stream();
            // Stream<Long> stream = map.values().stream();
            if (isParallelStreams)
                stream = stream.parallel();
            long sum = stream.reduce(0L, (x, y) -> x + y).longValue();
            return sum;
        } else {
            // Use a LongStream for the reduce.
            // This is apparently the best case for long streams.
            LongStream stream = map.values().stream()
                    .mapToLong(v -> ((Long) v).longValue());
            if (isParallelStreams)
                stream = stream.parallel();
            // The code for sum() is just a reduce, giving the same
            // performance.
            long sum = stream.sum();
            // sum = stream.reduce(0L, (x, y) -> x + y);
            return sum;
        }
    }
}
Here is the final summary of the run results on a 3GHz x86 quad-core, which has 8 virtual cores due to hyperthreading. An individual trial output is shown below it. All of the tests except testGet() and the two put()/remove() tests scan the entire database on each "operation", so their scores decrease in proportion to the database size. For those tests, multiply dbSize by Score to get the true per-entry speed. Note that testPutAndRemove() does both a put and a remove on each invocation. forEach() is faster than iteration, as is true of most Maps.
As can be seen, InfinityDB generally reaches millions of operations per second. The effect of contention on individual blocks can be seen in testPutAndRemove, which shows a performance jump between 1K and 10K entries, when the database grows to multiple blocks in the cache.
# Run complete. Total time: 00:13:43
Benchmark (dbSize) Mode Cnt Score Error Units
InfinityDBJMHTest.testForEach 0 thrpt 2 368667.732 ops/s
InfinityDBJMHTest.testForEach 1 thrpt 2 634144.250 ops/s
InfinityDBJMHTest.testForEach 10 thrpt 2 214008.986 ops/s
InfinityDBJMHTest.testForEach 100 thrpt 2 17423.518 ops/s
InfinityDBJMHTest.testForEach 1000 thrpt 2 2020.207 ops/s
InfinityDBJMHTest.testForEach 10000 thrpt 2 677.259 ops/s
InfinityDBJMHTest.testForEach 100000 thrpt 2 45.526 ops/s
InfinityDBJMHTest.testForEach 1000000 thrpt 2 3.717 ops/s
InfinityDBJMHTest.testGet 0 thrpt 2 1956460.272 ops/s
InfinityDBJMHTest.testGet 1 thrpt 2 1809225.246 ops/s
InfinityDBJMHTest.testGet 10 thrpt 2 1712550.418 ops/s
InfinityDBJMHTest.testGet 100 thrpt 2 1102420.633 ops/s
InfinityDBJMHTest.testGet 1000 thrpt 2 1207971.899 ops/s
InfinityDBJMHTest.testGet 10000 thrpt 2 3266049.588 ops/s
InfinityDBJMHTest.testGet 100000 thrpt 2 2648304.018 ops/s
InfinityDBJMHTest.testGet 1000000 thrpt 2 2726656.997 ops/s
InfinityDBJMHTest.testIterateEntrySet 0 thrpt 2 367123.806 ops/s
InfinityDBJMHTest.testIterateEntrySet 1 thrpt 2 484646.862 ops/s
InfinityDBJMHTest.testIterateEntrySet 10 thrpt 2 121755.481 ops/s
InfinityDBJMHTest.testIterateEntrySet 100 thrpt 2 11345.324 ops/s
InfinityDBJMHTest.testIterateEntrySet 1000 thrpt 2 1374.924 ops/s
InfinityDBJMHTest.testIterateEntrySet 10000 thrpt 2 414.319 ops/s
InfinityDBJMHTest.testIterateEntrySet 100000 thrpt 2 32.797 ops/s
InfinityDBJMHTest.testIterateEntrySet 1000000 thrpt 2 2.810 ops/s
InfinityDBJMHTest.testIterateKeySet 0 thrpt 2 371764.136 ops/s
InfinityDBJMHTest.testIterateKeySet 1 thrpt 2 457224.769 ops/s
InfinityDBJMHTest.testIterateKeySet 10 thrpt 2 120137.571 ops/s
InfinityDBJMHTest.testIterateKeySet 100 thrpt 2 10874.475 ops/s
InfinityDBJMHTest.testIterateKeySet 1000 thrpt 2 1383.806 ops/s
InfinityDBJMHTest.testIterateKeySet 10000 thrpt 2 441.785 ops/s
InfinityDBJMHTest.testIterateKeySet 100000 thrpt 2 35.322 ops/s
InfinityDBJMHTest.testIterateKeySet 1000000 thrpt 2 2.937 ops/s
InfinityDBJMHTest.testIterateValues 0 thrpt 2 374969.317 ops/s
InfinityDBJMHTest.testIterateValues 1 thrpt 2 479385.263 ops/s
InfinityDBJMHTest.testIterateValues 10 thrpt 2 119844.074 ops/s
InfinityDBJMHTest.testIterateValues 100 thrpt 2 10432.263 ops/s
InfinityDBJMHTest.testIterateValues 1000 thrpt 2 1452.275 ops/s
InfinityDBJMHTest.testIterateValues 10000 thrpt 2 421.415 ops/s
InfinityDBJMHTest.testIterateValues 100000 thrpt 2 33.711 ops/s
InfinityDBJMHTest.testIterateValues 1000000 thrpt 2 2.835 ops/s
InfinityDBJMHTest.testPutAndRemove 0 thrpt 2 210620.615 ops/s
InfinityDBJMHTest.testPutAndRemove 1 thrpt 2 201967.304 ops/s
InfinityDBJMHTest.testPutAndRemove 10 thrpt 2 198272.620 ops/s
InfinityDBJMHTest.testPutAndRemove 100 thrpt 2 147114.722 ops/s
InfinityDBJMHTest.testPutAndRemove 1000 thrpt 2 195863.534 ops/s
InfinityDBJMHTest.testPutAndRemove 10000 thrpt 2 488445.748 ops/s
InfinityDBJMHTest.testPutAndRemove 100000 thrpt 2 523575.919 ops/s
InfinityDBJMHTest.testPutAndRemove 1000000 thrpt 2 496160.074 ops/s
InfinityDBJMHTest.testPutAndRemoveItemSpace 0 thrpt 2 415157.833 ops/s
InfinityDBJMHTest.testPutAndRemoveItemSpace 1 thrpt 2 427640.348 ops/s
InfinityDBJMHTest.testPutAndRemoveItemSpace 10 thrpt 2 401591.174 ops/s
InfinityDBJMHTest.testPutAndRemoveItemSpace 100 thrpt 2 309627.808 ops/s
InfinityDBJMHTest.testPutAndRemoveItemSpace 1000 thrpt 2 383137.626 ops/s
InfinityDBJMHTest.testPutAndRemoveItemSpace 10000 thrpt 2 899923.410 ops/s
InfinityDBJMHTest.testPutAndRemoveItemSpace 100000 thrpt 2 939347.803 ops/s
InfinityDBJMHTest.testPutAndRemoveItemSpace 1000000 thrpt 2 999958.337 ops/s
InfinityDBJMHTest.testStreams 0 thrpt 2 187775.359 ops/s
InfinityDBJMHTest.testStreams 1 thrpt 2 239705.873 ops/s
InfinityDBJMHTest.testStreams 10 thrpt 2 74142.603 ops/s
InfinityDBJMHTest.testStreams 100 thrpt 2 6298.659 ops/s
InfinityDBJMHTest.testStreams 1000 thrpt 2 854.418 ops/s
InfinityDBJMHTest.testStreams 10000 thrpt 2 247.787 ops/s
InfinityDBJMHTest.testStreams 100000 thrpt 2 20.221 ops/s
InfinityDBJMHTest.testStreams 1000000 thrpt 2 1.586 ops/s
Here is the output of one trial of one benchmark: testStreams over a 1M-entry database. Because the test iterates the entire database on each "operation", the streams are scanning at about 1.6M entries/sec (1.586 ops/s × 1,000,000 entries).
# JMH 1.13 (released 543 days ago, please consider updating!)
# VM version: JDK 1.8.0_131, VM 25.131-b11
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_131.jdk/Contents/Home/jre/bin/java
# VM options: -Xmx4g
# Warmup: 5 iterations, 1 s each
# Measurement: 2 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 8 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.infinitydb.InfinityDBJMHTest.testStreams
# Parameters: (dbSize = 1000000)
# Run progress: 98.44% complete, ETA 00:00:11
# Fork: 1 of 1
# Warmup Iteration 1: 1.633 ops/s
# Warmup Iteration 2: 1.612 ops/s
# Warmup Iteration 3: 1.621 ops/s
# Warmup Iteration 4: 1.627 ops/s
# Warmup Iteration 5: 1.609 ops/s
Iteration 1: 1.607 ops/s
Iteration 2: 1.566 ops/s
Result "testStreams":
1.586 ops/s
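The conversion from a reported score to a per-entry scan rate is a single multiplication, sketched below with two rows taken from the summary table. The class ScanRate and the method entriesPerSecond() are hypothetical helper names, and the figures are the aggregate throughput of all 8 benchmark threads.

```java
public class ScanRate {
    // Entries visited per second = full scans per second * entries per scan.
    static double entriesPerSecond(double opsPerSecond, long dbSize) {
        return opsPerSecond * dbSize;
    }

    public static void main(String[] args) {
        // Scores for dbSize = 1000000, copied from the summary table.
        System.out.printf("testStreams: %.0f entries/s%n",
                entriesPerSecond(1.586, 1_000_000));
        System.out.printf("testForEach: %.0f entries/s%n",
                entriesPerSecond(3.717, 1_000_000));
    }
}
```

The same multiplication applies to every scanning benchmark in the table; it does not apply to testGet or the put/remove tests, which touch one key per operation.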