The best way to test an infrastructure before going into production is to mimic production load and solve the problems that arise. One of the main challenges with this approach is having a load generator that matches production as closely as possible in both message rate and message size. Some companies have the luxury of routing a percentage of their current production load into the infrastructure or technology being tested and tuning it on real traffic. Many others are building the infrastructure without prior live production experience with the product, and this is where most of them fail to prepare for production. There is a plethora of load testing tools, and some technologies ship with a load or stress test tool of their own, but most do not.
Load testing Kafka today is fairly easy with a tool shipped with Kafka called ProducerPerformance. It lets you produce messages of a configurable size at a fixed rate and measure the throughput of your Kafka cluster. I find it really useful for quick initial testing, but for any in-depth tuning or production readiness work, we need something better. Sangrenel is somewhat outdated but it can still throw a punch at your Kafka cluster. Gatling, one of the most widely used load generators, has Kafka producer plugins. It scales and it’s a battle-tested tool, but it has the same issue as all the others: stress testing is not production-like testing. The main problem when moving from lab environments into production is that production data has context, while stress tests use randomly generated data.
To get over this issue, we have built a context data generator called Ranger and a load generator called Berserker. The most important part of Ranger is its data generator, which either generates contextual data from a configuration or collects it from a data source. Using a simple configuration (for now), we can run a load test with data that creates a production-like scenario and makes sense for our use case. Different message types and sizes give a better understanding of how Kafka will behave in production, so we can prepare for what we are actually going to see.
Most of the time, load testing Kafka is limited by the NIC throughput of the machine running the producer, so to really stress test your Kafka cluster, you need multiple load generator instances on separate machines. Deploying multiple instances on the same box doesn’t help because they all share the same NIC. We are currently working on a feature to enable effortless multi-instance deployment and orchestration, so that we can fully saturate the Kafka cluster and test its capabilities. When going into production, it’s extremely useful to understand the capabilities and limits of your setup.
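To make the NIC limit concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of Berserker; the function names and the `overhead_fraction` parameter are our own illustrative assumptions, and the estimate ignores batching, compression, and replication traffic.

```python
import math


def max_messages_per_second(nic_gbps: float, message_bytes: int,
                            overhead_fraction: float = 0.0) -> float:
    """Upper bound on messages/sec a single host can push through its NIC.

    nic_gbps          -- nominal NIC speed in gigabits per second
    message_bytes     -- average message size in bytes
    overhead_fraction -- assumed share of bandwidth lost to protocol overhead
    """
    if message_bytes <= 0:
        raise ValueError("message_bytes must be positive")
    usable_bytes_per_second = nic_gbps * 1e9 / 8 * (1 - overhead_fraction)
    return usable_bytes_per_second / message_bytes


def generators_needed(target_msgs_per_second: float, nic_gbps: float,
                      message_bytes: int, overhead_fraction: float = 0.0) -> int:
    """Number of separate generator hosts needed to reach the target rate."""
    per_host = max_messages_per_second(nic_gbps, message_bytes, overhead_fraction)
    return math.ceil(target_msgs_per_second / per_host)


if __name__ == "__main__":
    # A 1 Gbit/s NIC with 1000-byte messages and no assumed overhead.
    print(max_messages_per_second(1.0, 1000))
    print(generators_needed(500_000, 1.0, 1000))
```

Under these assumptions, a single 1 Gbit/s box tops out at roughly 125,000 messages of 1000 bytes per second, no matter how many generator processes run on it.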
With Ranger, we can generate semi-random or contextual data described by a schema, or retrieve data from any kind of database or file storage. One of the main advantages of Ranger is that we can define a certain percentage of the data that will have different values. This is really useful when testing corner cases of the whole infrastructure, and it also lets us mimic anomalies when generating measurement data.
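The idea of giving a fixed percentage of messages different values can be sketched in a few lines of Python. This is only an illustration of the concept, not Ranger's configuration format or API; the field names, value ranges, and the `make_reading_generator` helper are all hypothetical.

```python
import random
from typing import Callable, Dict, Optional


def make_reading_generator(anomaly_ratio: float,
                           seed: Optional[int] = None) -> Callable[[], Dict]:
    """Return a function that produces one sensor reading per call.

    Roughly `anomaly_ratio` of the readings fall in an out-of-range band,
    the rest fall in the normal band. All ranges here are made up.
    """
    if not 0.0 <= anomaly_ratio <= 1.0:
        raise ValueError("anomaly_ratio must be between 0 and 1")
    rng = random.Random(seed)

    def next_reading() -> Dict:
        is_anomaly = rng.random() < anomaly_ratio
        if is_anomaly:
            temperature = rng.uniform(150.0, 300.0)  # assumed anomalous band
        else:
            temperature = rng.uniform(18.0, 25.0)    # assumed normal band
        return {
            "sensor_id": f"sensor-{rng.randint(1, 10)}",
            "temperature": round(temperature, 2),
            "anomaly": is_anomaly,
        }

    return next_reading


if __name__ == "__main__":
    next_reading = make_reading_generator(anomaly_ratio=0.05, seed=42)
    for _ in range(5):
        print(next_reading())
```

Each generated dictionary could then be serialized and handed to whatever producer drives the load test, so the cluster sees mostly realistic values with a controlled share of outliers.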
This is Berserker’s high-level architecture:
While both projects are still under heavy development, we see huge value in generating data that makes sense rather than purely random data. We can also leverage any existing storage to replay or transform its data and create load from it. With its pluggable architecture, Berserker can consume any data source and target any endpoint.
We are preparing a set of blog posts on this subject that go into the details of Ranger and Berserker, and we will share test results from our Kafka cluster, so stay tuned...