Learn how you can maximize big data in the cloud with Apache Hadoop. Download this eBook now. Brought to you in partnership with Hortonworks.
Having spoken with many customers evaluating our product
I am noticing that a majority of folks evaluating in-memory computing,
whether it be data grid, map reduce, or streaming, do not know how to
appropriately perform benchmarking. The right approach to distributed
in-memory benchmarking is very different than benchmarking disk-based
products, like databases, and generally requires experience and
understanding of the delicate details of how network and garbage
collections behave under load. With that in mind, GridGain
will soon be releasing a benchmarking framework to help easily overcome
all these newbie issues, but until then here is a laundry list of
things to watch out for.
1. Did you allocate enough memory?
If you have not allocated enough memory, the performance of your
application may significantly degrade. From a user's stand point it may
not really be noticeable until your application gradually becomes slower
and slower, which is the first sign that you may be running out of
Also keep in mind that if you allocate more than 10GB of memory per JVM,
your application may start suffering from prolonged garbage
collections. To mitigate that, check if the product you are evaluating
has support for off-heap memory
which cheats Java garbage collection by utilizing off-heap memory space.
When running benchmarks, you should constantly monitor memory consumption. You can do it with any light weight profiler, like VisualVm which is shipped by default with JDK.
2. Are your tests multi-threaded?
Benchmarking should not be performed from a single thread or a small
number of threads, as generally such tests will not load the system.
This is especially true for synchronous API calls, as the next API call
has to wait for the previous one to complete. It is generally a good
practice to run benchmarks from about 60 to 100 threads (on boxes with
more than 16 cores, the number of threads can be higher).
When running benchmarks, you should monitor the CPU load on your system and add more threads if system is below 90% load.
3. Do you perform initial warm-up?
JVM HotSpot compiler will usually inline and precompile portions of the
code that get executed the most. However, usually in load tests, this
warm-up process takes about 20 or 30 seconds. It is generally a good
practice to allow benchmarks to warm up for about 30 seconds before you
start measuring performance.
When running benchmarks, you should do periodic print outs of your
throughput (number of operations per second). The warm-up is usually
finished when the throughput numbers stabilize.
4. Did you tune your garbage collection?
If you are seeing spikes in your throughput, it may be due to JVM
Garbage Collection (GC). In this case it is best to use concurrent sweep
GC which has proven to provide fairly smooth throughput without any
large spikes or long pauses.
Here is a good link, which describes several GC tuning parameters: Tune Garbage Collection.
5. Do you use bulk operations?
Often many developers will sequentially execute multiple single cache 'put(key, value)' operations
and then notice that performance is not ideal. The main reason is that
whenever network I/O is involved, you should always strive to send as
few messages as possible. To achieve this, a majority of data grid or
caching products provide support for bulk operations, such as 'putAll(...)'. Bulk updates can often improve performance by 100x magnitude.
When using bulk updates, make sure to experiment with batch sizes -
making them too big or too small can hurt performance. Also see if the
product you are evaluating has already automated bulk-updates for you.
Hortonworks DataFlow is an integrated platform that makes data ingestion fast, easy, and secure. Download the white paper now. Brought to you in partnership with Hortonworks.