Micro-Benchmarking in Java - Revisited

By Jakub Kubrynski · Aug. 01, 12

Many of us, when faced with a performance challenge, have run into the problem of correctly measuring the execution time of a method invocation, a loop iteration, etc. The first solution that comes to mind is to call System.nanoTime() before and after execution; after subtracting the two values we have our result. But are we sure that the simplest answer is also the best one? What about comparing multiple measurements? What about the impact of the JIT?
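
For illustration, here is a minimal sketch of that naive approach (methodUnderTest() and ITERATIONS are placeholders, not code from any particular project):

// The naive approach: measure a loop with System.nanoTime() and divide.
// There is no warm-up, no repeated measurement, and nothing stops the JIT
// from optimizing the measured work away - exactly the problems discussed below.
final int ITERATIONS = 1000000;            // placeholder iteration count
long start = System.nanoTime();
for (int i = 0; i < ITERATIONS; i++) {
    methodUnderTest();                     // placeholder for the code being measured
}
long elapsed = System.nanoTime() - start;
System.out.println("avg: " + (elapsed / ITERATIONS) + " ns/op");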


Fortunately, a new king is in town - Caliper. It solves all of these problems and protects us from other common mistakes. I don't intend to write yet another tutorial (those available in Caliper's source repository are good enough to start using the framework). Instead, let's go into the details of how to interpret the results we obtain and how to choose the right parameters for a benchmark.


The first important thing Caliper does is build the full Cartesian product of scenarios from:

  • defined virtual machines
  • benchmark methods (for time measurements the method should be public and its name should start with “time”)
  • user parameters (fields annotated by @Param)

Then, for each element of that set, Caliper starts a new JVM - that's how it prevents the test execution order from influencing the results.
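
To make this concrete, here is a minimal benchmark sketch written against the old Caliper 0.x API that this article refers to; the class, method, and parameter names are mine, purely for illustration:

import com.google.caliper.Param;
import com.google.caliper.Runner;
import com.google.caliper.SimpleBenchmark;

public class StringConcatBenchmark extends SimpleBenchmark {

    // User parameter: one scenario per value; how the values are supplied
    // (annotation vs. -Dlength=10,100 on the command line) depends on the Caliper version
    @Param private int length;

    // Time measurement method: public and starting with "time"; 'reps' is supplied by Caliper
    public int timeStringBuilder(int reps) {
        int dummy = 0;
        for (int i = 0; i < reps; i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < length; j++) {
                sb.append('x');
            }
            dummy += sb.length();
        }
        return dummy;  // returning a value helps prevent dead-code elimination
    }

    public static void main(String[] args) throws Exception {
        Runner.main(StringConcatBenchmark.class, args);
    }
}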

 

The second thing is the warm-up. There are two reasons for doing it:

  • allow the JIT to apply all its optimizations before the actual measurement starts
  • estimate the time needed to execute a single iteration of the test method

We can define the warm-up time by specifying the --warmupMillis parameter.
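
For example (again only a sketch, on top of the illustrative benchmark class above), the flag can simply be forwarded to Caliper's Runner:

// Sketch: forwarding the warm-up flag discussed in this article to Caliper's Runner.
Runner.main(StringConcatBenchmark.class, new String[] { "--warmupMillis", "2000" });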

 

Another important start parameter is --runMillis. It's very important to understand how Caliper treats a “run” in this case. First, Caliper performs three measurement cycles, each with a different duration scale:

  • the first takes 1.0 * runMillis
  • the second takes 0.5 * runMillis
  • the third takes 1.5 * runMillis

After these cycles, the framework verifies the quality of the results using a simple rule: the standard deviation has to be less than 1% of the average. As long as this tolerance is not met, Caliper can perform up to 7 additional measurements to reach an acceptable threshold. That explains why 2000 ms of warm-up plus 5000 ms of configured run time can add up to 30 seconds of total test time :)
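
A rough back-of-the-envelope check (my own arithmetic, not taken from the Caliper documentation): with --warmupMillis 2000 and --runMillis 5000, the fixed part alone is 2000 + (1.0 + 0.5 + 1.5) * 5000 = 17000 ms, i.e. about 17 seconds. The rest of the quoted 30 seconds comes from the additional measurements triggered when the distribution is not tight enough, plus JVM startup overhead.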

When the test runs are finished, it's time to interpret the outcome. Omitting some obvious output, we get this:

21042,13 ns; σ=568,81 ns @ 10 trials

The first element is the median of the results (here 21042.13 ns). The second part is the standard deviation (568.81 ns). And finally, after the @ sign, we have the number of measurement runs performed (here 10 => 3 base runs + 7 additional ones because of the poor distribution). We should know that these trials have nothing in common with those defined by the --trials parameter; the latter can be compared to the number of full suite runs.

Last but not least, and very useful especially for determining the proper duration of each execution phase, is the combination of the --captureVmLog and --saveResults filename.json parameters. This results in the creation of the filename.json file, which contains the raw measurements and (most importantly) the JIT compilation messages emitted during the different phases of test execution (the warm-up run and the proper run).
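
In the same illustrative style as the earlier sketches, these two flags can be forwarded to the Runner (the file name results.json is arbitrary):

// Sketch: capture the VM log and save the raw results to a JSON file.
Runner.main(StringConcatBenchmark.class, new String[] {
        "--captureVmLog",
        "--saveResults", "results.json"
});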

I hope this article proves useful and saves you some time when investigating how Caliper works.

Some useful links:

Caliper's home page: https://code.google.com/p/caliper/

Examples: http://code.google.com/p/caliper/source/browse/examples/src/main/java/examples/ 

 
