SSD Performance Tips - Checked on Apache Ignite
Read on for details of how the cache coherence protocol can help you use Apache Ignite to increase the speed and performance of your platform.
As a software guy, I have always been curious about how things work at the hardware level and how to apply that knowledge to more advanced optimizations in applications. Take the Java Memory Model, for instance. The model grounds its memory consistency and visibility properties on keywords such as volatile and synchronized. But these are just language keywords, so you start looking at how JVM engineers brought the model to life. At some point, you breathe a sigh of relief on discovering that, at the very bottom of the software stack running on physical machines, the model relies on low-level instructions for mutexes and memory barriers. Nice: these are instructions a CPU understands. But curiosity drives you further, because it is still unclear how all the memory consistency guarantees can be satisfied on multi-CPU machines, where each CPU has its own registers and caches. Well, the hardware guys took care of this by supporting a cache coherence protocol. And finally you, as a software guy, can develop highly performant applications that stall CPUs and invalidate their caches only on purpose, with the volatile, synchronized, and final keywords.
Apache Ignite veterans tapped into this knowledge and, as a result, delivered one of the fastest in-memory database and computational platforms. Presently, the same people are optimizing Ignite Native Persistence, Ignite's distributed and transactional persistence layer. Being a part of that community, let me share some tips about solid-state drives (SSDs) that you, as a software guy, can exploit in Ignite or other disk-based database deployments.
SSD Level Garbage Collection
The term garbage collection (GC) is used not only by Java developers to describe the process of purging dead objects from the Java heap residing in RAM. Hardware guys use the same term for the same purpose, but in relation to SSDs.
In simple words, an SSD stores data in pages. Pages are grouped into blocks (typically 128 or 256 pages per block). The SSD controller can write data directly into an empty page but can erase only whole blocks. Thus, to reclaim the space occupied by invalid data, all the valid data from one block first has to be copied into empty pages of another block. Once this happens, the controller erases the first block, freeing space for new data arriving from your applications.
This process happens in the background and goes by a familiar name: garbage collection (GC).
So, if you suddenly observe a performance drop under a steady load, as shown in Figure 1 below, do not fall into the trap of blaming your application or Apache Ignite. The drop might be caused by SSD GC routines.
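The copy-then-erase cycle described above can be sketched with a toy model. This is only an illustration under simplifying assumptions, not how any real SSD firmware is written: the class names FlashBlock and SsdGcSketch are hypothetical, and a block is reduced to an array of page states.

```java
import java.util.Arrays;

/** Toy model of one flash block: pages are written one by one, but erased only as a whole block. */
final class FlashBlock {
    enum PageState { EMPTY, VALID, INVALID }

    final PageState[] pages;

    FlashBlock(int pagesPerBlock) {
        pages = new PageState[pagesPerBlock];
        erase();
    }

    /** Erasing is possible only for the whole block at once. */
    void erase() {
        Arrays.fill(pages, PageState.EMPTY);
    }

    /** Writes into the first empty page; returns false if the block has no empty page left. */
    boolean writePage() {
        for (int i = 0; i < pages.length; i++) {
            if (pages[i] == PageState.EMPTY) {
                pages[i] = PageState.VALID;
                return true;
            }
        }
        return false;
    }

    /** Counts the pages that are currently in the given state. */
    int count(PageState state) {
        int n = 0;
        for (PageState p : pages) {
            if (p == state) n++;
        }
        return n;
    }
}

final class SsdGcSketch {
    /**
     * Reclaims a victim block: copies its valid pages into the spare block,
     * then erases the victim. Returns the number of extra page writes the copy cost.
     */
    static int collect(FlashBlock victim, FlashBlock spare) {
        int copied = 0;
        for (FlashBlock.PageState p : victim.pages) {
            if (p == FlashBlock.PageState.VALID) {
                if (!spare.writePage()) {
                    throw new IllegalStateException("spare block is full");
                }
                copied++;
            }
        }
        victim.erase();
        return copied;
    }

    public static void main(String[] args) {
        FlashBlock victim = new FlashBlock(8);
        FlashBlock spare = new FlashBlock(8);
        for (int i = 0; i < 8; i++) victim.writePage();
        // Application overwrites make the old copies of 5 pages invalid.
        for (int i = 0; i < 5; i++) victim.pages[i] = FlashBlock.PageState.INVALID;

        int copied = collect(victim, spare);
        System.out.println("pages copied by GC: " + copied);
        System.out.println("empty pages in victim after erase: "
            + victim.count(FlashBlock.PageState.EMPTY));
    }
}
```

The point of the sketch is the cost: every valid page that has to be moved is an extra write the application never asked for, and that extra work competes with your own I/O.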
Let me give you several hints on how to decrease the impact of the SSD GC on the performance of applications.
Separate Disk Devices for WAL and Data/Index Files
Apache Ignite arranges data and indexes in special partition files on disk. This type of architecture does not require you to have all the data in RAM; if something is missing there, Apache Ignite will find the data on disk in these files.
However, as Figure 2 shows, every update (1) received by an Apache Ignite cluster node is stored in RAM and persisted (2) in a write-ahead log (WAL) first. This is done for performance reasons: once the update is in the WAL, your application gets the acknowledgment and can continue executing its logic. Then, in the background, the checkpointing process updates the partition files by copying dirty pages from RAM to disk (4). Over time, WAL files are archived and can be safely removed once all their data is already in the partition files.
So, what's the performance hint here? Consider using separate SSDs for the partition files and the WAL. Apache Ignite actively writes to both places, so by having a separate physical disk device for each you may double the overall write throughput. See the Apache Ignite documentation for how to tweak the configuration for that.
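To make the write path easier to follow, here is a toy, single-threaded model of it. It is a sketch under heavy simplifications and not Ignite's actual implementation: the class WalCheckpointSketch and its fields are hypothetical, plain maps stand in for memory pages and partition files, and the WAL is simply cleared after a checkpoint instead of being segmented and archived.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the write path: RAM and WAL first, partition files later. */
final class WalCheckpointSketch {
    final Map<String, String> ram = new HashMap<>();            // stands in for in-memory pages
    final List<String> wal = new ArrayList<>();                 // append-only log
    final Map<String, String> partitionFile = new HashMap<>();  // stands in for on-disk partition files
    final Set<String> dirtyKeys = new HashSet<>();              // updated in RAM, not yet checkpointed

    /** An update is acknowledged once it is in RAM and appended to the WAL. */
    boolean put(String key, String value) {
        ram.put(key, value);
        wal.add(key + "=" + value);
        dirtyKeys.add(key);
        return true; // acknowledgment returned to the application
    }

    /** Background step: copies dirty entries from RAM to the partition file. Returns how many were written. */
    int checkpoint() {
        int written = dirtyKeys.size();
        for (String key : dirtyKeys) {
            partitionFile.put(key, ram.get(key));
        }
        dirtyKeys.clear();
        // Simplification: records covered by the checkpoint are dropped right away.
        wal.clear();
        return written;
    }
}
```

Note that three updates to two keys produce three sequential WAL appends but only two writes at checkpoint time. The log and the partition files see different write patterns, which is one reason they compete when they share a device.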
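As a minimal sketch of what such a configuration might look like in Java, assuming Ignite 2.3 or later, where DataStorageConfiguration is available: the class name SeparateWalDiskConfig and all mount points below are hypothetical, and each mount point is assumed to be backed by its own physical SSD.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

final class SeparateWalDiskConfig {
    static IgniteConfiguration create() {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Enable Ignite Native Persistence for the default data region.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Hypothetical paths: /mnt/ssd1 and /mnt/ssd2 are assumed to be separate physical devices.
        storageCfg.setStoragePath("/mnt/ssd1/ignite/storage");        // partition (data and index) files
        storageCfg.setWalPath("/mnt/ssd2/ignite/wal");                // active WAL segments
        storageCfg.setWalArchivePath("/mnt/ssd2/ignite/wal-archive"); // archived WAL segments

        return new IgniteConfiguration().setDataStorageConfiguration(storageCfg);
    }
}
```

Pointing the paths at different directories on the same device gains nothing; the separation only helps when the devices are physically distinct.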
Like the Java heap, an SSD requires free space to perform efficiently and to avoid significant performance drops due to GC. All SSD manufacturers reserve some amount of space for that purpose. This is called over-provisioning.
Here is what you, as a software guy, should keep in mind: the performance of random writes on a 50% filled disk is much better than on a 90% filled disk because of SSD over-provisioning and GC. Consider buying SSDs with a higher over-provisioning rate, and make sure the manufacturer provides tools to adjust it.
That's enough for a start. If you are the sort of guy who wants to get the most out of the hardware by tweaking the page size or swapping settings, refer to the tuning page maintained by the Apache Ignite community.
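One practical way to act on this is to keep an eye on how full the storage volume is. The sketch below uses only the standard java.io.File API. The class name DiskFillCheck, the 0.9 warning threshold (taken from the 90% example above), and the over-provisioning formula (spare capacity as a percentage of user-visible capacity, a commonly used definition) are assumptions of this example and not something Ignite provides.

```java
import java.io.File;

final class DiskFillCheck {
    /** Fraction of the volume in use, in the range [0, 1]. */
    static double fillRatio(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be positive");
        }
        return 1.0 - (double) usableBytes / totalBytes;
    }

    /** Over-provisioning as a percentage of the user-visible capacity. */
    static double overProvisioningPercent(long physicalBytes, long userBytes) {
        if (userBytes <= 0) {
            throw new IllegalArgumentException("userBytes must be positive");
        }
        return 100.0 * (physicalBytes - userBytes) / userBytes;
    }

    public static void main(String[] args) {
        // Pass the storage directory as the first argument; defaults to the current directory.
        File storageDir = new File(args.length > 0 ? args[0] : ".");
        long total = storageDir.getTotalSpace(); // 0 if the path does not name a partition
        if (total == 0) {
            System.out.println("Cannot determine the size of the volume for " + storageDir);
            return;
        }
        double ratio = fillRatio(total, storageDir.getUsableSpace());
        System.out.printf("Volume is %.0f%% full%n", ratio * 100);
        if (ratio >= 0.9) { // hypothetical threshold
            System.out.println("Warning: random write performance may degrade on a nearly full SSD");
        }
    }
}
```

For instance, a drive with 128 units of physical flash that exposes 100 units to the user has 28% over-provisioning under this definition.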
Published at DZone with permission of Denis Magda. See the original article here.