How to Utilize Java Benchmarks With Arm Processors
I have seen a lot of speculation surrounding Arm processors, especially since Apple announced its plan to switch to Arm-based processors. Many people assume that the performance will be similar to a Raspberry Pi, but this is incorrect. Java on Arm is not uncommon, and interest has spiked recently due to increased Arm investment from cloud vendors. Amazon has updated its Arm offerings, and Microsoft is porting the JVM to Arm64 for Windows, which will be helpful for future Azure support.
In this article, I will show the Java benchmarks I took on different AWS EC2 instances, and for fun on my laptop.
Amazon a1.large (ARMv8 Cortex-A72, 2 Cores, 4GB RAM).
Amazon m6g.medium (ARMv8 Neoverse-N1, 1 Core, 4GB RAM).
Amazon t3.medium (Intel Xeon Platinum 8259CL, 1 Core / 2 Threads, 4GB RAM).
Apple MacBook Pro (Intel i9 2.4GHz, 8 Core / 16 Threads, 64GB RAM).
A note on naming: the trademark was previously written “ARM” but is now “Arm”, and that is the spelling used in this article.
A Note About Benchmarks
Benchmarks are just numbers. They serve as a starting point when you are figuring out the compute power you need for your own application. All applications are different, and your workload will likely have different characteristics than these benchmarks. The only way to find out how your application will perform on a different system is to test it there!
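As a starting point for testing your own workload, here is a minimal hand-rolled timing sketch. It is not the setup used for the results in this article, and a dedicated harness such as JMH is usually the better choice for serious measurements. The class name `QuickBench` and the `workload` method are hypothetical placeholders for your own code.

```java
// Hypothetical minimal timing harness; a sketch, not the article's benchmark setup.
final class QuickBench {

    /** Runs the task `warmups` times untimed, then returns the mean nanoseconds over `runs` timed runs. */
    static double averageNanos(Runnable task, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmups; i++) {
            task.run(); // give the JIT a chance to compile the hot path
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            total += System.nanoTime() - start;
        }
        return (double) total / runs;
    }

    static long sink; // holds the result so the loop below is not optimized away

    // Stand-in workload: replace with a call into your own application code.
    static void workload() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += (long) i * i % 7;
        }
        sink = sum;
    }

    public static void main(String[] args) {
        double nanos = averageNanos(QuickBench::workload, 20, 50);
        System.out.printf("%s: %.3f ms per run%n",
                System.getProperty("os.arch"), nanos / 1_000_000.0);
    }
}
```

Running the same class on each candidate instance type gives a rough first comparison for that one workload.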
For these tests, I tried to compare three different AWS offerings that are similar and have comparable on-demand pricing. There are some differences, though. The a1.large instance is from Amazon’s first-generation Arm processors, whereas the m6g.medium is the current Arm series, and the t3.medium is an Intel x86_64 processor.
To keep things consistent, all of these benchmarks used the Amazon Corretto 220.127.116.11.1 JVM with the default GC configuration, and the tests were run through the Phoronix Test Suite.
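When repeating tests like these, it can help to record which JVM, architecture, and garbage collector each run actually used. The following sketch prints that information using standard system properties and the platform management beans. The class name `EnvInfo` is hypothetical, and the values it reports depend on the machine and JVM you run it on.

```java
// Hypothetical helper that summarizes the runtime environment of a benchmark run.
final class EnvInfo {

    static String describe() {
        // Collect the names of the garbage collectors the JVM selected.
        StringBuilder gcNames = new StringBuilder();
        for (java.lang.management.GarbageCollectorMXBean gc
                : java.lang.management.ManagementFactory.getGarbageCollectorMXBeans()) {
            if (gcNames.length() > 0) {
                gcNames.append(", ");
            }
            gcNames.append(gc.getName());
        }
        return String.format("arch=%s, vm=%s %s, cpus=%d, gc=[%s]",
                System.getProperty("os.arch"),          // e.g. aarch64 or x86_64/amd64
                System.getProperty("java.vm.name"),
                System.getProperty("java.vm.version"),
                Runtime.getRuntime().availableProcessors(),
                gcNames);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```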
In almost all cases the a1.large instance performed the worst, and my MacBook the best. This isn’t particularly interesting, so I’m going to focus my analysis on the differences between the t3.medium and the m6g.medium instances.
First up, we have a test based on Apache Spark’s MLlib which uses a random forest algorithm.
In this test, the t3.medium is 15% faster than the m6g.medium.
The Spark alternating least squares (ALS) benchmark is one of the few tests where both the Arm servers outpaced the t3.medium.
In the Spark Naive Bayes algorithm test, the m6g.medium was 8% faster than the t3.
Winner: m6g.medium. This was almost too close to call, but in the words of Meat Loaf, “Two Out of Three Ain’t Bad.”
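The article does not spell out how its “X% faster” figures were derived. One common convention, assumed here, is to compare run times where lower is better and report how much longer the slower system took relative to the faster one. The sketch below uses that convention with made-up run times; the class name `Speedup` and the numbers in `main` are purely illustrative and are not measurements from these tests.

```java
// Illustrative only: one possible way to turn two run times into a "percent faster" figure.
final class Speedup {

    /** Percent by which system A is faster than system B, given run times (lower is better). */
    static double percentFaster(double timeA, double timeB) {
        if (timeA <= 0 || timeB <= 0) {
            throw new IllegalArgumentException("times must be positive");
        }
        return (timeB / timeA - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Hypothetical run times in milliseconds, not results from this article.
        double fasterSystemMs = 1000.0;
        double slowerSystemMs = 1150.0;
        System.out.printf("%.1f%% faster%n", percentFaster(fasterSystemMs, slowerSystemMs));
    }
}
```

Other conventions exist (for example, dividing the difference by the slower time instead), so it is worth stating which one you use when you publish your own comparisons.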
This batch of benchmarks focuses on compute-heavy operations. The first two use functional “actors” programming from the Savina Actors Benchmark Suite, and the rest focus on math-based operations.
This first test shows the t3.medium has a 20% lead over the m6g.medium.
Interestingly, in this test, the m6g and the t3 perform about the same.
The Spark PageRank test shows the m6g performs 65% faster than the t3.
The m6g also outperformed the t3 by 22% when calculating Fourier transforms.
The t3.medium narrowly wins the sparse matrix multiplication tests by 3%.
Threads and Concurrency
For many of us building web applications, concurrency is critical as your web server handles many different requests at once. This set of tests highlights the differences between the number of vCPUs in each system—the m6g.medium only has one, the a1.large and the t3.medium both have two, and the MacBook Pro has 16.
This first test uses two threads, so naturally, the t3 is about 34% faster than the m6g.
The Twitter HTTP Finagle test starts a small HTTP server and creates a number of clients equal to the number of vCPU cores plus one. The HTTP server, in turn, uses twice as many threads as there are CPUs. This is going to create a bit of thread contention, which likely explains these results.
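To make the contention concrete, the sketch below applies the sizing rule described above to the vCPU counts of the four test systems. It reads “the number of CPUs*2” as the server's thread count and assumes each client occupies one thread; both are assumptions, as is the class name `FinagleThreads`. It does not run Finagle, it only does the arithmetic.

```java
// Arithmetic sketch of the thread counts implied by the Finagle test's sizing rule.
final class FinagleThreads {

    static int clients(int vcpus) {
        return vcpus + 1; // number of vCPUs plus one
    }

    static int serverThreads(int vcpus) {
        return vcpus * 2; // assumed reading of "number of CPUs*2"
    }

    /** Total client and server threads competing for each vCPU (assumes one thread per client). */
    static double threadsPerVcpu(int vcpus) {
        return (double) (clients(vcpus) + serverThreads(vcpus)) / vcpus;
    }

    public static void main(String[] args) {
        String[] names = {"m6g.medium", "a1.large", "t3.medium", "MacBook Pro"};
        int[] vcpus = {1, 2, 2, 16}; // vCPU counts as listed in this article
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-12s vCPUs=%2d clients=%2d serverThreads=%2d threadsPerVcpu=%.2f%n",
                    names[i], vcpus[i], clients(vcpus[i]),
                    serverThreads(vcpus[i]), threadsPerVcpu(vcpus[i]));
        }
    }
}
```

Under these assumptions the single-vCPU m6g.medium ends up with four threads per vCPU while the two-vCPU t3.medium has 3.5, so the smaller instance is the most oversubscribed.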
Winner: t3.medium (This one was not a fair fight.)
At the end of the day, which system you pick may come down to a balance of price and performance. The on-demand pricing for an m6g instance (medium, large, xlarge, 2xlarge) was about 8.5% cheaper than the corresponding t3 instance.
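One way to weigh price against performance is to divide relative performance by relative price. The sketch below does this with figures quoted in this article: the roughly 8.5% price difference, the random forest test (t3 15% faster), and the PageRank test (m6g 65% faster). It assumes that “X% faster” means a performance ratio of 1 + X/100, which the article does not confirm, and the class name `ValuePerDollar` is hypothetical.

```java
// Sketch: performance per dollar of the m6g relative to the t3 (t3 = 1.0 on both axes).
final class ValuePerDollar {

    /** Relative performance divided by relative price; values above 1.0 favor the system being rated. */
    static double relativeValue(double relativePerformance, double relativePrice) {
        if (relativePrice <= 0) {
            throw new IllegalArgumentException("relativePrice must be positive");
        }
        return relativePerformance / relativePrice;
    }

    public static void main(String[] args) {
        double m6gPrice = 1.0 - 0.085;          // m6g is about 8.5% cheaper than t3
        double randomForestPerf = 1.0 / 1.15;   // t3 was 15% faster (assumed ratio convention)
        double pageRankPerf = 1.65;             // m6g was 65% faster (assumed ratio convention)

        System.out.printf("random forest: %.2f%n", relativeValue(randomForestPerf, m6gPrice));
        System.out.printf("PageRank:      %.2f%n", relativeValue(pageRankPerf, m6gPrice));
    }
}
```

A result below 1.0 means the lower price does not fully make up for the slower run on that test, and a result above 1.0 means the m6g delivers more work per dollar.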
The overall winner of these benchmarks is my MacBook Pro! Joking aside, the difference between Amazon’s second-generation Arm processors and the equivalent Intel processor wasn’t what I expected when I started writing this post. If I had to pick between t3.medium and the m6g.medium, I’d say the overall winner of this showdown is the Arm m6g.medium.
As I mentioned at the start of this post, all of this info needs to be taken with a grain of salt. Your Java applications will perform differently than these benchmarks, so you will need to run your own tests and draw your own conclusions about whether switching to Arm is right for you. The biggest challenge in switching from x86_64 to Arm64 is making sure your native dependencies are available, but this is much less of an issue nowadays, as both Java and Linux distros have been supporting Arm for years.
If you enjoyed this blog post and want to see more like it, follow @oktadev on Twitter, subscribe to our YouTube channel, or follow us on LinkedIn. As always, please leave your questions and comments below—we love to hear from you!
Published at DZone with permission of Brian Demers, DZone MVB. See the original article here.