Capacity planning in the enterprise is no easy task. In this post, we provide an overview of sizing VMware’s elastic, in-memory data management product, vFabric GemFire, along with a link to an in-depth technical article.
Setting the Stage for Memory Sizing
Enterprise applications today are distributed systems that must satisfy increasingly complex business requirements. Add the ever-growing demand to manage more data, and the task only gets harder.
One of the key factors in capacity planning for memory-intensive systems, such as in-memory data stores, is memory capacity. Even though the price of memory keeps falling, data capacity requirements keep growing, which makes memory as precious a resource as ever. As large systems grow even larger, managing this resource efficiently becomes more important. Beyond the obvious reasons, such as Total Cost of Ownership (TCO), large memory pools bring technical challenges of their own. For one, garbage collection (GC) takes more time, which can affect both latency and throughput. Determining memory requirements correctly is therefore both crucial and difficult.
That is why this post and the related technical article focus on memory sizing and provide concrete guidelines for determining the memory required for optimal performance, especially in large-scale vFabric GemFire deployments. GemFire has facilities that are very useful for memory sizing. The article not only explains those facilities, but also describes a method and guidelines that take the guesswork out of the memory sizing process.
Here is a brief overview of the technical article and its four main sections:
A. The Process: This section covers the primary process for sizing and capacity planning including estimating, testing, and adjusting the memory requirements in an iterative fashion:
• Data Sizing
• JVM and Application Sizing
• Scale-out Testing and Iteration
B. Data Sizing: This section provides a calculator/utility and explains how to use GemFire Statistics, the Visual Statistics Display, and the heap histogram to understand actual memory use. GemFire partitions data across a number of data nodes: a partitioned region consists of partitions, and partitions consist of buckets. A partition is what fits in a single JVM, and a bucket is the smallest unit of data that can be moved across JVMs during rebalancing. Calculating bucket counts and sizes is therefore one of the most important steps, and the article explains how to do it.
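As a back-of-the-envelope illustration of the data sizing step, the sketch below estimates total in-memory data size and the average size of a bucket. The per-entry overhead figure and the bucket count of 113 (GemFire's default number of buckets for a partitioned region) are assumptions for illustration; the article's method is to validate such estimates against actual GemFire statistics and heap histograms rather than trust the arithmetic alone.

```java
// Hypothetical back-of-the-envelope data sizing. The overhead constant
// is an assumption to be checked against real statistics, not a
// GemFire-published figure.
public class DataSizeEstimate {

    // Rough per-entry overhead (key object, region entry metadata, etc.).
    static final long PER_ENTRY_OVERHEAD_BYTES = 150;

    /** Total bytes needed to hold all entries in memory. */
    static long totalDataBytes(long entryCount, long avgKeyBytes, long avgValueBytes) {
        return entryCount * (avgKeyBytes + avgValueBytes + PER_ENTRY_OVERHEAD_BYTES);
    }

    /** Average bytes per bucket, given the region's total bucket count. */
    static long bytesPerBucket(long totalBytes, int totalNumBuckets) {
        return totalBytes / totalNumBuckets;
    }

    public static void main(String[] args) {
        // 10M entries, ~50-byte keys, ~1000-byte serialized values (assumed).
        long total = totalDataBytes(10_000_000L, 50, 1_000);
        System.out.println("Total data: " + total / (1L << 20) + " MiB");
        System.out.println("Per bucket: " + bytesPerBucket(total, 113) / (1L << 20) + " MiB");
    }
}
```

Because a bucket is the unit of rebalancing, keeping average bucket size modest (relative to JVM heap) is what allows GemFire to redistribute data smoothly as nodes are added.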
C. Sizing the JVM Heap: This section shows you how to calculate the storage in a single JVM and figure out the number of JVMs needed. It also explains GC pauses, compressed oops, JVM headroom and heap size, data serialization, and application overhead.
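The heap-sizing arithmetic can be sketched as follows. The headroom fraction and redundancy level here are illustrative inputs, not GemFire recommendations; the 32 GiB figure reflects the HotSpot JVM's limit for compressed oops, which the article discusses.

```java
// Hypothetical JVM-count estimate: how many data-node JVMs are needed
// to hold a data set, given per-JVM heap and a headroom reserve.
public class JvmCountEstimate {

    /** Heap bytes actually usable for data, after reserving headroom
     *  for GC, temporary objects, and application overhead. */
    static long usableHeapBytes(long heapBytes, double headroomFraction) {
        return (long) (heapBytes * (1.0 - headroomFraction));
    }

    /** JVMs needed to store the data plus redundant copies, rounded up. */
    static int jvmsNeeded(long totalDataBytes, int redundantCopies, long usableHeapPerJvm) {
        long required = totalDataBytes * (1 + redundantCopies);
        return (int) ((required + usableHeapPerJvm - 1) / usableHeapPerJvm);
    }

    public static void main(String[] args) {
        // Keep heap below ~32 GiB so HotSpot can use compressed oops.
        long heap = 30L << 30;                      // 30 GiB per JVM
        long usable = usableHeapBytes(heap, 0.45);  // ~45% headroom (assumed)
        long data = 200L << 30;                     // 200 GiB of data (assumed)
        System.out.println("JVMs needed: " + jvmsNeeded(data, 1, usable));
    }
}
```

The per-JVM usable capacity estimated this way is exactly what the scale-out testing step then verifies empirically, one JVM at a time.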
D. Scale-out Testing: An overview of the steps for scale-out testing is provided, with a single JVM as the unit of scale.