In-memory database sizing – throw out conventional wisdom

John Piekos · Jun. 05, 14

Sizing an in-memory database does not follow conventional database sizing rules.

For traditional databases, you buy a decent server machine, likely one with many CPU cores and reasonable memory, and then focus on application IOPS (I/O Operations per Second). If you are really going to stress the database, you must choose disks that can support the I/O needs of your application, today and in the future. Because these systems often use many disks to achieve high I/O performance, capacity is usually an afterthought.

With in-memory databases, throw out everything you know about sizing databases. Performance is now the afterthought, but capacity must be carefully considered.

To size an in-memory clustered database application you must consider the following factors:

    • First, compute the size of the data your application expects to host. This means computing the size of all rows in all tables, plus the size of all table indexes. VoltDB provides an easy-to-use interactive sizing tool in the Catalog Report, described later in this post.
    • If you want a highly available database cluster, all data must be stored redundantly. You’ll need to multiply your logical data size by the replication factor to understand your memory requirements. For example, if you have 100 GB of table and index data, and you want your database to be able to withstand losing one of the nodes in the cluster, you will want each datum to be stored twice, on two different physical machines. This means that you need to size the data requirement of your database to be 200 GB (100 GB of data × 2 copies of the data).
    • Allow for some extra memory, perhaps an additional 25%, for database and operating system overhead. VoltDB uses extra memory for storing intermediate results, maintaining an undo log, caching SQL plans and buffering network data. Note that on a machine with 4GB, you’ll want to leave a larger fraction of memory free than on a machine with 256GB.
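The three factors above tally into a back-of-the-envelope calculation. The sketch below follows the arithmetic in the text; the 25% overhead default is the rule of thumb quoted above, not a VoltDB guarantee, so tune it for your machines.

```python
def cluster_memory_gb(data_gb, k_factor=1, overhead_fraction=0.25):
    """Estimate total cluster memory for an in-memory database.

    data_gb:           logical size of all rows plus all indexes.
    k_factor:          number of *extra* copies of each datum
                       (k_factor=1 means every datum is stored twice,
                       surviving the loss of one node).
    overhead_fraction: headroom for database and OS overhead (undo
                       log, plan cache, network buffers); 25% is the
                       article's rule of thumb, an assumption to tune.
    """
    replicated = data_gb * (k_factor + 1)   # redundant copies
    return replicated * (1 + overhead_fraction)

# The worked example from the text: 100 GB of table and index data,
# able to withstand losing one node -> 200 GB before overhead.
print(cluster_memory_gb(100, k_factor=1))  # prints 250.0
```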

This tallied memory total can now help you decide how many machines your system requires. Note that at no time did we need to determine how much CPU we required, or how many IOPS were needed. Because you are sizing your database based on data, memory capacity is likely the most important calculation you will make. Generally speaking, in-memory databases like VoltDB operate so fast that CPU utilization is not a problem – there is plenty of headroom for growth. As for disk I/O, most in-memory systems use sequential logs on disk, eschewing costly random I/O. With this reduced dependency on disk performance, VoltDB can achieve tremendous throughput, even on commodity spinning disks.

Often, you’ll be faced with a choice between many machines with less memory per machine or a small number of high-memory machines. There are lots of reasons to lean one way or the other. For example, you should try to run with at least three nodes if availability or redundancy is a concern. Additionally, some administrator operations run faster with less per-machine data. At some point, however, too many nodes can be a management hassle. Most VoltDB users run with 3 to 30 nodes per cluster for these reasons.
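The trade-off above (a few large-memory machines versus many smaller ones) can be sketched as a node-count calculation. The usable_fraction and min_nodes defaults here are illustrative assumptions, not VoltDB recommendations.

```python
import math

def nodes_needed(total_memory_gb, per_node_ram_gb,
                 usable_fraction=0.75, min_nodes=3):
    """How many machines to provision for a given cluster memory total.

    usable_fraction: portion of each machine's RAM handed to the
                     database, leaving the rest for the OS (an
                     assumption; smaller machines need a bigger reserve).
    min_nodes:       floor of three nodes, per the availability
                     advice in the text.
    """
    usable_per_node = per_node_ram_gb * usable_fraction
    return max(min_nodes, math.ceil(total_memory_gb / usable_per_node))

# 250 GB cluster requirement on 64 GB machines at 75% usable:
print(nodes_needed(250, 64))  # prints 6
```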

Planning for Data Capacity and Growth

Sizing VoltDB databases is fairly straightforward. Once you have settled on your application’s database schema, point your web browser to the database Catalog Report and choose the “Size Worksheet”, found at this URL: http://localhost:8080/#z (note the database must be running).

[Screenshot: Estimate Memory Used worksheet, 6-2-14]

This sizing worksheet is automatically built from your application’s schema. It allows you to enter your expected number of rows for each table and then computes the amount of memory required for the database cluster as a whole.

[Screenshot: Size Analysis Summary, 6-2-14]

Note that the footprint of your database is the size of all rows plus the size of all indexes and materialized views. You must also leave additional memory for the operating system. This assumes VoltDB is the only active application on these machines; if you plan on sharing them with other applications (something we do not recommend), your memory requirements will naturally be larger.
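As a rough illustration of what such a worksheet computes, here is a hypothetical per-table estimate. The byte widths and the example schema below are made up for illustration and are not VoltDB's actual storage layout; the Catalog Report's worksheet is the authoritative source for real numbers.

```python
# Illustrative fixed-width column sizes in bytes -- an assumption,
# NOT VoltDB's real on-heap representation.
ILLUSTRATIVE_BYTES = {"bigint": 8, "integer": 4, "float": 8, "timestamp": 8}

def table_footprint_gb(schema, row_count):
    """schema: list of (name, type) tuples, or (name, 'varchar', avg_len)
    for variable-length columns, using an assumed average payload."""
    row_bytes = 0
    for col in schema:
        if col[1] == "varchar":
            row_bytes += col[2]          # assumed average string length
        else:
            row_bytes += ILLUSTRATIVE_BYTES[col[1]]
    return row_bytes * row_count / 1e9   # decimal GB

# Hypothetical table: 500 million rows of (id, customer_id, amount, note).
orders = [("id", "bigint"), ("customer_id", "bigint"),
          ("amount", "float"), ("note", "varchar", 64)]
print(round(table_footprint_gb(orders, 500_000_000), 1))  # prints 44.0
```

Repeat the estimate per table, add index and materialized-view sizes, and feed the total into the replication and overhead arithmetic above.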

Expanding your Database Capacity

A well-designed IT environment will monitor the performance and capacity of the applications running the business.

Because VoltDB processes transactions incredibly fast, it is often the case that there is plenty of transaction throughput headroom, and lots of extra CPU cycles, available. Monitoring the memory usage of the database cluster becomes much more important. VoltDB provides monitoring interfaces, as well as Nagios and New Relic plug-ins, that make monitoring memory usage easy. These tools allow you to define capacity alerts, in much the same way you would define disk space usage alerts for legacy database systems. Should your in-memory database memory usage grow to a high level, signaling a capacity limit alert, it may be time to add more storage capacity to your database.
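A capacity alert of the kind described above boils down to a threshold check on memory usage. In the sketch below, the threshold logic is plain arithmetic; the polling URL is an assumption modeled on VoltDB's JSON HTTP interface and the @Statistics MEMORY selector, so verify the endpoint and response shape against your version's documentation.

```python
import json
import urllib.request

ALERT_THRESHOLD = 0.80  # alert at 80% of provisioned memory (a sample policy)

def over_capacity(used_bytes, provisioned_bytes, threshold=ALERT_THRESHOLD):
    """True when memory usage crosses the capacity-alert threshold,
    analogous to a disk-space alert on a legacy database system."""
    return used_bytes / provisioned_bytes >= threshold

def poll_voltdb_memory(host="localhost", port=8080):
    """Hypothetical poll of VoltDB's JSON HTTP interface for memory
    statistics; URL and response layout are assumptions to check."""
    url = (f"http://{host}:{port}/api/1.0/"
           '?Procedure=@Statistics&Parameters=["MEMORY",0]')
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# 205 GB used of 256 GB provisioned crosses the 80% threshold:
print(over_capacity(205 * 2**30, 256 * 2**30))  # prints True
```

The same predicate can back a Nagios or New Relic check: poll, evaluate, and page the operator before the cluster runs out of headroom.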

With VoltDB, adding capacity is easy: simply add more nodes. VoltDB supports adding nodes to a running cluster without interrupting ongoing operations or sacrificing ACID transactional guarantees.

If you do not wish to add more nodes to your database cluster, a rolling memory upgrade of the machines in a redundant VoltDB cluster is an alternative. Since VoltDB uses active replication within a cluster, removing and replacing a node can be done transparently to a user’s application, with no data loss and an often unnoticeable performance impact.

In-Memory Database Provisioning… A New Way of Thinking

If you are provisioning and rolling out a new application on an in-memory database, you will have to shift your thinking about how you size your deployment. With all data residing in memory, you will need to provision enough memory in your database cluster to hold all of your data, plus a bit more for regular operating system operations.

Disks are only used for durability, and most in-memory systems like VoltDB work great with commodity hardware, disks included. Sizing for IOPS is a thing of the past!


Published at DZone with permission of John Piekos, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
