
Benchmarking Kafka Write Throughput Performance — 2019 Update

Check out this post to learn more about performance benchmarking with Kafka.


Back in 2017, we published a performance benchmark to showcase the vast volumes of events Apache Kafka can process. Since Aiven services run on multiple clouds, we evaluated performance across each of the public cloud providers we supported.

Since then, both Kafka and the resources offered by the public cloud providers have changed considerably. We felt it was high time for a refresher and decided to remeasure write performance in this test.

We also made some small changes to the benchmark setup so that it better reflects real-world workloads, and this time we calculated the monthly throughput cost for each plan on each cloud. So, let’s jump in!

2019 Aiven Kafka Benchmark Setup

As with the previous test, we really wanted to estimate the true performance you’d expect from using Aiven Kafka. That is, we used standard Aiven plans and configurations, standard client configurations, and ran the test over the external network interfaces.

But this time around, we’re using a replication factor of three to match the regular use case. With replication, this test accounts for the network traffic between the brokers as well.
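To get a rough feel for what replication adds, the sketch below estimates per-broker network traffic for a given producer ingress rate. It assumes partition leaders and replicas are spread evenly across brokers and ignores protocol overhead, TLS framing, and consumer traffic. The function name and the 60 MB/s example figure are illustrative only and are not taken from the benchmark.

```python
def per_broker_traffic(ingress_mb_s, brokers, replication_factor):
    """Estimate per-broker network traffic in MB/s, assuming an even spread."""
    if replication_factor > brokers:
        raise ValueError("replication factor cannot exceed broker count")
    # Each broker leads an equal share of partitions and receives
    # that share of the producer traffic.
    from_producers = ingress_mb_s / brokers
    # Every produced byte is copied to (replication_factor - 1) followers.
    replication_total = ingress_mb_s * (replication_factor - 1)
    # Evenly spread, each broker sends and receives an equal share of it.
    replication_share = replication_total / brokers
    return {
        "inbound": from_producers + replication_share,
        "outbound": replication_share,
    }


# Hypothetical example: 60 MB/s of producer traffic, 3 brokers, RF 3.
traffic = per_broker_traffic(60.0, brokers=3, replication_factor=3)
print(traffic)
```

Under these assumptions, with three brokers and a replication factor of three, every broker ends up receiving the full producer volume, which is why replication matters for network-bound clusters.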

We used a single topic for our write operations with a partition count set to either 3 or 6, depending on the number of brokers in each test cluster. As the test clusters were regular Aiven services, the partitions and replicas were spread out across availability zones.

Messages were produced via librdkafka’s rdkafka_performance tool with a message size of 512 bytes, the default batch size of 10,000 messages, and no compression. Continuing our quest to simulate real-world use, client connections were made over TLS.
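The producer settings described above map onto a handful of librdkafka configuration properties. The sketch below collects them in a plain dictionary. The property names are standard librdkafka ones, but the host name, file paths, and helper name are placeholders, the use of client certificates is an assumption, and this is not the configuration used in the benchmark itself.

```python
MESSAGE_SIZE_BYTES = 512  # message size stated in the article


def build_producer_config(bootstrap_servers, ca_path, cert_path, key_path):
    """Return librdkafka-style producer settings mirroring the test setup."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # TLS-encrypted client connections.
        "security.protocol": "ssl",
        "ssl.ca.location": ca_path,
        # Assumption: the service authenticates clients with certificates.
        "ssl.certificate.location": cert_path,
        "ssl.key.location": key_path,
        # Batch size of 10,000 messages, as described in the article.
        "batch.num.messages": 10000,
        # No compression.
        "compression.type": "none",
    }


# Placeholder host and file names, for illustration only.
config = build_producer_config(
    "kafka.example.com:9092", "ca.pem", "service.cert", "service.key"
)
payload = b"x" * MESSAGE_SIZE_BYTES  # one 512-byte test message
print(config["batch.num.messages"], len(payload))
```

A dictionary like this can be handed to a librdkafka-based client, for example `confluent_kafka.Producer(config)` in Python.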

We used Kafka version 2.1 running with Java 8; as a side note, it’ll be interesting to benchmark Aiven Kafka running with Java 11 in future tests because we expect Java improvements to positively impact its performance.

During the test, we kept increasing the number of producing clients until we reached the maximum throughput rate each plan tier’s cluster could accept. To verify our readings, we left the load running for some time.
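The ramp-up procedure can be sketched as a simple loop that keeps adding clients until the measured rate stops improving. This is only an illustration of the idea, not the actual test harness: `measure` is a hypothetical callable, the 2% threshold is an arbitrary choice, and the simulated cluster uses made-up numbers.

```python
def find_saturation(measure, max_clients, min_gain=0.02):
    """Add clients one at a time until throughput stops improving.

    measure(n) is a hypothetical callable returning the messages/second
    observed with n producing clients.
    """
    best_clients, best_rate = 1, measure(1)
    for clients in range(2, max_clients + 1):
        rate = measure(clients)
        # Stop once an extra client adds less than min_gain (relative).
        if rate < best_rate * (1 + min_gain):
            break
        best_clients, best_rate = clients, rate
    return best_clients, best_rate


def simulated_cluster(clients):
    # Stand-in for a real measurement: linear growth with a hard cap.
    # Both numbers are invented for the example.
    return min(clients * 30_000, 135_000)


clients, rate = find_saturation(simulated_cluster, max_clients=20)
print(clients, rate)
```

In a real run, each measurement would also be held for a while at the plateau, as described above, to confirm the reading is stable.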

If you’re interested in verifying our results, you can get the test code here. In our tests, we used Google’s managed Kubernetes service to easily scale the number of load-generating nodes up and down.

Aiven Kafka Business-4 Benchmark Results

We first tested the performance of our Business-4 plan. That’s a three-broker cluster with 1-2 CPUs (depending on the cloud) and 4 GB RAM per instance. On Amazon Web Services, this plan handled about 135,000 messages per second, while the same plan on Google Cloud Platform and Azure handled around 70,000.

[Figure: Aiven Kafka Business-4 message throughput per second on AWS, GCP, and Azure (2019)]

The previous test ran without replication, so the somewhat lower GCP and Azure numbers can be attributed to the replication traffic included this time. Surprisingly, AWS’s performance jumped from 50,000 messages/second in the previous test to about 135,000. We attribute this to the more recent instance types and network improvements AWS has been rolling out in their cloud.

Aiven Kafka Business-4 Performance in MB/Second

[Figure: Aiven Kafka Business-4 throughput in MB per second on AWS, GCP, and Azure (2019)]

We then used the message rates and the 512-byte message size to derive throughput numbers, which were over 65 MB/second for AWS and just under 35 MB/second for GCP and Azure. Pretty impressive! But what does that performance cost?
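The conversion from message rate to throughput is simple arithmetic, sketched below. One assumption to flag: the published figures appear consistent with 1 MB = 1,048,576 bytes. With decimal megabytes, the AWS figure would come out closer to 69 MB/second. The function name is ours.

```python
MESSAGE_SIZE_BYTES = 512
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def messages_to_mb_per_second(messages_per_second, message_size=MESSAGE_SIZE_BYTES):
    """Convert a message rate into MB/second for a fixed message size."""
    return messages_per_second * message_size / BYTES_PER_MB


# Approximate Business-4 message rates quoted in the article.
rates = {"AWS": 135_000, "GCP/Azure": 70_000}
for cloud, rate in rates.items():
    print(f"{cloud}: {messages_to_mb_per_second(rate):.1f} MB/s")
```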

Business-4 Monthly Throughput Cost

Those plans are priced at $660/month on AWS (us-east-1), $500/month on GCP (us-east1), and $550/month on Azure (eastus2). That works out to just over $10 per MB/s per month for this plan size on AWS, and roughly $15 and $16 per MB/s for GCP and Azure respectively.
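As a rough check, the cost-per-throughput figures can be reproduced from the plan prices and the approximate message rates quoted earlier. The sketch below again assumes 512-byte messages and binary megabytes. Because the message rates are rounded, the results only approximately match the published numbers.

```python
MESSAGE_SIZE_BYTES = 512
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes

# (monthly price in USD, approximate messages/second) for Business-4.
plans = {
    "AWS": (660, 135_000),
    "GCP": (500, 70_000),
    "Azure": (550, 70_000),
}


def cost_per_mb_s(monthly_price, messages_per_second):
    """Monthly price divided by sustained write throughput in MB/s."""
    throughput = messages_per_second * MESSAGE_SIZE_BYTES / BYTES_PER_MB
    return monthly_price / throughput


costs = {cloud: cost_per_mb_s(price, rate) for cloud, (price, rate) in plans.items()}
for cloud, cost in costs.items():
    print(f"{cloud}: ${cost:.2f} per MB/s per month")
```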

[Figure: Aiven Kafka Business-4 monthly throughput cost on AWS, GCP, and Azure (2019)]

Aiven Business-8 Benchmark Results

Next, we increased the size of the brokers. This test was based on the Business-8 plan tier, essentially doubling the resources to 2-4 CPUs and 8 GB RAM per instance. This time, AWS showed only a slight increase to 137,000 messages per second, while Azure and GCP saw larger gains to 120,000 and 95,000 respectively.

[Figure: Aiven Kafka Business-8 message throughput per second on AWS, GCP, and Azure (2019)]

AWS performance barely moved from the previous plan size. A look into our monitoring revealed that both AWS tests were capped by the available network bandwidth on the broker instances.

Aiven Kafka Business-8 Performance in MB/Second

Converted to throughput, those message rates come to 67 MB/s for AWS, 46 MB/s for GCP, and 59 MB/s for Azure.

[Figure: Aiven Kafka Business-8 throughput in MB per second on AWS, GCP, and Azure (2019)]

Business-8 Monthly Throughput Cost

With the Business-8 plans, estimated monthly costs rose across the board, to $19 per MB/s for AWS and Azure and around $22 per MB/s for GCP. However, it’s important to note that throughput is only one way to measure value. For example, Business-8 plans come with double the storage of Business-4 plans, which allows for longer retention times.

[Figure: Aiven Kafka Business-8 monthly throughput cost on AWS, GCP, and Azure (2019)]

Aiven Kafka Premium-6x-8 Benchmark Results

Our last test doubled the number of brokers. We wanted to verify how well Kafka scales horizontally, since it scaled almost perfectly in our previous round of testing. We therefore ran this one on a six-broker Premium-6x-8 plan tier, with instances sized the same as in Business-8.

And the message rates? An impressive 270,000 messages per second on AWS, 238,000 on Azure and 167,000 on GCP; well in line with the expected results.
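To put a number on how well throughput tracked the doubled broker count, the sketch below divides the six-broker message rates by the three-broker Business-8 rates quoted earlier. A factor of 2.0 would be perfectly linear scaling. The variable and function names are ours.

```python
# Message rates (messages/second) quoted in the article.
business_8 = {"AWS": 137_000, "Azure": 120_000, "GCP": 95_000}
premium_6x_8 = {"AWS": 270_000, "Azure": 238_000, "GCP": 167_000}


def scaling_factor(before, after):
    """Throughput ratio after doubling the broker count (2.0 is linear)."""
    return after / before


factors = {
    cloud: scaling_factor(business_8[cloud], premium_6x_8[cloud])
    for cloud in business_8
}
for cloud, factor in factors.items():
    print(f"{cloud}: {factor:.2f}x with twice the brokers")
```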

[Figure: Aiven Kafka Premium-6x-8 message throughput per second on AWS, GCP, and Azure (2019)]

Aiven Kafka Premium-6x-8 Performance in MB/Second

Expressed as throughput, that’s 132 MB/s on AWS, 116 MB/s on Azure, and 82 MB/s on GCP.

[Figure: Aiven Kafka Premium-6x-8 throughput in MB per second on AWS, GCP, and Azure (2019)]

Premium-6x-8 Monthly Throughput Cost

From the plan pricing, estimated monthly costs are around $19 per MB/s for AWS, $18 for Azure, and $23 for GCP.

[Figure: Aiven Kafka Premium-6x-8 monthly throughput cost on AWS, GCP, and Azure (2019)]

Wrapping Up

Apache Kafka continues to perform just as well as we’ve come to expect and scales nicely with both added resources and increased cluster sizes. It’s performant, scalable, and cost-effective — a solid centerpiece of the modern data architecture.

Again, we’d like to stress that monthly throughput cost should not be considered in isolation when comparing plans. Although important, there are additional factors that come into play when pricing plans, such as storage.

Additionally, we can’t stress enough that workloads vary and you should definitely benchmark your own representative event flows. To make the benchmark more robust, we’ll be addressing combined read/write tests in the near future.


Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
