Choosing Between an In-Memory and a Traditional DBMS
Choosing Between an In-Memory and a Traditional DBMS
When I deciding which DBMS to use — in-memory or traditional — the first thing you need to do is divide your options based of storage type: RAM, SDD, or HDD.
Join the DZone community and get the full member experience.Join For Free
Discover Tarantool's unique features which include powerful stored procedures, SQL support, smart cache, and the speed of 1 million ACID transactions on a single CPU core!
In this article, I’ll talk about when I use an in-memory database (IMDB) and when I prefer a traditional DBMS (RDBMS) and why.
When I need to decide which DBMS to use , I usually make a choice based on the type of storage where my data will be kept. I personally divide the numerous options out there into three groups: RAM, solid-state drives (SSDs) and hard disk drives (HDDs). So first I pick a type, or types, of storage to use, and then I start thinking about the database, or databases, I want to use on top.
Each of the three data storage types has its own distinct access time and speed, and each falls into a different price range. Both an IMDB and an RDBMS can work with all three types. An IMDB stores a copy of data in RAM and persists this data toan SSD or HDD. An RDBMS uses RAM for caches, write-back buffers and other optimizations, and also persists data toan SSD or HDD. Since both DBMSes can use all three storage types, before deciding on a DBMS, I also need to decide which data I’m going to keep in each type, and how I’m going to move it between types .
Let’s recall the characteristics of each storage type.
It’s well known that RAM is lightning fast, in terms of both sequential and random access (it was named “random-access memory” for a reason). What I mean by “lightning fast” is that random access time is a mere tens of nanoseconds (1e-8 seconds), i.e. RAM performs random reads/writes at 100 MB per second. Sequential reads/writes are even faster — 1 GB per second and higher. You can find out more information about how RAM works here.
SSDs are slower than RAM. Random access time is a fraction of a millisecond (1e-5 seconds) and you can randomly read around 10,000 blocks per second. In the worst case scenario, where you only need one byte from each block, the speed is 10 kB per second. That’s 1,000 times slower than RAM! However, sequential reads/writes are performed at around200–300 MB per second, which is somewhat on par with RAM. Details about SSDs are here and here.
HDDs are theslowest of the three storage types in question. Random access time is 10 milliseconds (1e-3 seconds), which is 100 times slower than an SSD. You can randomly read/write 100 blocks per second, so if you need to read random bytes, the speed is only 100 bytes per second in the worst case! That’s because the whole block needs to be read for the sake of one random byte. Randomly writing bytes is, on average, even slower: we first read a random block, wait for one full disk rotation, and then write this block back with a changed byte. Details can be found here and here.
There are other storage types, such as flash memory, SAS, magnetic tape and many more, that I’ve never used for certain reasons. SAS is slower than SSDs but costs almost the same (at least givenour purchasing volumes at Mail.Ru Group). Flash memory, on the other hand, is more expensive than SSDs, but in my use cases, the speed gain it provides is negligible. As for magnetic tape, in terms of storage volume in my use cases, a tape drive is more expensive than a high capacity HDD, but at the same time slower with respect to access time. Your mileage may vary, and HDDs, SSDs and RAM may not be your main storage types.
Let’s talk price now. To be honest, up to a certain point in my career, I believed that I didn’t need to think about price, only about technology. I would talk to systems administrators to find out the hardware needed for my programs, then I would ask management for it. They would or wouldn’t allocate the funds, and if they did, I would place an order, receive the hardware, and start using it (and if no funds were allocated, I had to make do with whatever was available at the time). What a life that was!
However, at some point, I found out, to my surprise, that it can be the other way around, as in “Here’s a fixed budget — make everything work.” (It doesn’t strike me as surprising anymore, but back in the day it was like taking a cold shower.). More challenging tasks were met with “Given our current spending patterns, make everything work faster.” There was also “Here’s less money but performance must remain the same.” But the most hardcore one was “Cut costs and make everything work faster.” It’s even more fun, though, to execute ideas regarding how to save more money or squeeze more performance out of hardware without ever involving senior management.
So back to the price. At first glance, the situation is pretty straightforward: on average, each subsequent storage type is ten times more expensive than the previous one. HDDs are the cheapest storage type, SSDs are ten times more expensive than HDDs, and RAM is ten times more expensive than SSDs. Note that these are average figures and based entirely on my own experience (given our purchasing prices).
Another thing to keep in mind is that storage doesn’t exist on its own — it needs to be plugged into servers, and they aren’t free. Servers can only hold a certain amount of data from each storage type. On top of that, a server takes up units in a server rack, which also costs money for electricity bills and rental fees (even in your own data center, renting a server rack isn’t free and must be accounted for in your business model; for example, there’s the loss of expected profit since you’re not leasing the server to others, and so on).
Now that we’ve discussed the speed and the price relationship of the three storage types, let’s turn to when and what to store in each of them.
If I need access time of 1 ms or less, my only choice is RAM. SSDs also match this criterion, but only in terms of seek time. However, in my typical workflow, to process a single user request, not one, but several seek operations are often needed, and then, due to some additional requirements, there may be extra requests to disk. In anticipation of this scenario, I usually budget for RAM right from the start.
If I need access time of up to 100 ms, I opt for an SSD.
If an access time over 100 ms is acceptable, I consider using an HDD. Recall that its seek time is 10 ms, but in real-life use cases, random access leads to multiple seeks. Moreover, disks are busy doing other work, which diminishes the maximum number of IOPS, seek times and seeks per second that can be performed.
Let’s now dwell on sequential access a little. While in terms of random access time, an HDD is 100 times slower than an SSD, which is 1,000 times slower than RAM, the situation with sequential access time is hugely different: an HDD is two to three times slower than an SSD, which is three to five times slower than RAM. And to refresh your memory, RAM is 10 times more expensive than an SSD, which is 10 times more expensive than an HDD.
What does that tell us? Random access to slow storage is more costly (the slower the storage, the higher the cost), and sequential access to slow storage is cheaper (the slower the storage, the lower the cost). Thus, slow storages aremade for sequential access (otherwise, they would have faded into oblivion a long time ago). To reiterate, if we need to sequentially access massive datasets and are satisfied with aspeed of 100 MB/s, the most cost-effective storage is an HDD. It’s 10 times slower than RAM with respect to sequential access time, but 100 times cheaper as well. This means that in terms of bytes, we could buy 10 times more free HDD space than RAM with the same total throughout. In most cases, I’m OK with 100 MB/s, because I also need to take into account the network interface and the internet connection speed of an end user. So I store everything that needs to be accessed sequentially on HDDs. It’s important to note that I’m talking only about my situation here. If you operate on a smaller scale, the price may have a lower priority.
To sum up, I group my data into the four categories (items 1–4 above) and keep each category in separate storage. Now, the question is whichDBMS(es) to use on top of it all.
Let me first share my observations of some advantages and disadvantages of IMDBes and RDBMSes:
- In terms of writes and reads from disk, an IMDB is faster than an RDBMS, even if a machine hosting an RDBMS has so much memory on board that all data fits into it.
- An IMDB can efficiently persist data on an HDD. An RDBMS doesn’t usually work with an HDD as efficiently and requires an SSD (or anotherfast storage type like SAS or flash memory).
- An IMDB usually scales poorly to multiple CPU cores.
Now, we can move on to making a choice between an IMDB and an RDBMS.
a. As we’ve already discussed, the data that requires online access (less than 1 ms) must be stored in RAM.
I usually put this kind of data in an IMDB, because, from my experience, it’s the fastest and most hardware-efficient solution for RAM-resident data. An RDBMS with fully cached data is tens, if not hundreds, of times slower.
As an example, let’s take the benchmark results of MariaDB, which was designed by ex-MySQL developers and is currently one of the most high-performing RDBMSes out there, at least on the open-source market. The benchmark was run by MariaDB developers themselves, so one would hope the RDBMS had been fine tuned to demonstrate peak performance. Benchmark details can be found here. The results show a million requests per second on a very expensive machine called an IBM Power8, with 20 cores and 160 hardware threads (which is equal to having 160 cores on board). Compare that to the benchmark results of an IMDB named Tarantool, which performs a million transactions on a virtual single-core AWS server. By the way, the Tarantool code is open source, so you can use it yourself to reproduce the results, for example, on a t2.micro AWS instance. The Benchmarks of MariaDB and Tarantool are different, but the purpose is the same : to squeeze every last drop of performance out of a database.
So the difference is roughly a factor of 160. One could go on and on about whether Power8 has truly parallel cores (it does have at least 20 cores), or the hardware that Amazon services run on, but, on the whole, the difference is pretty evident. It’s an honest benchmark in the sense that both vendors have polished their products to deliver top performance.
But let me reiterate one of the main disadvantages of an IMDB : usually it scales poorly to multiple CPU cores. For example, Tarantool uses three cores at best. If you want to squeeze more out of it, you need to shard it the same way as if you were distributing its workload across multiple servers. On the other hand, if Tarantool performs better on a single core than an RDBMS does on a whole server (I’m not talking about Power8 here, but a regular Supermicro server with an 8- or 16-core processor), it’s possible to keep the other cores idle or use them to run separate Tarantool instances for other services.
b. All the data that doesn’t need to be accessed online (that is with allowable access time of up to 100 ms) is stored either in an RDBMS or in an IMDB, or elsewhere (more on that later) , depending on a number of factors.
For example, if I’ve already decided to store online data in an IMDB and I’ve determined that keeping non-online data there as well doesn’t influence the system’s total cost of ownership (TCO) very much, I’m going to store this non-online data in my IMDB in order to keep the number of different DBMSes as low as possible. Otherwise, if the impact on the TCO is significant, I’ll opt for an RDBMS or something else (see item d).
c. Sometimes, seemingly illogically, I put non-online data into an IMDB, even though there’s no online data in it. Why would I do that? It’s because an IMDB uses the processor more efficiently than an RDBMS does when accessing data (it performs fewer copy operations and has more advanced in-memory data structures since it’s optimized for working with memory).
From time to time, I have a choice to make — store a 1 TB database on 10 machines having an IMDB and a 200 GB HDD each, with areplication factor of two, or use eight machines that have an RDBMS and an SSD holding a 1 TB database each, with a replication factor of eight (because the RDBMS consumes four times more CPU resources in this case). 10 machines with low-capacity HDDs may cost less than eight machines with high-capacity SSDs (if you want to know why an IMDB can make do with HDDs, whereas an RDBMS usually requires SSDs, check my article ). Moreover, compared to an RDBMS, an IMDB may have cheaper and less powerful processors on board: if I have a lot of machines when using an IMDB, it’s not because a single processor can’t handle the workload — it’s that usually we just don’t install more than 200–300 GB of RAM on a machine to keep its restart time manageable.
d. Previously, I mentioned storing data “elsewhere.” So, what are the alternatives to an RDBMS and an IMDB? One option is a simple file system.
For example, in our service Cloud@Mail.Ru, users’ files are kept in a regular file system (XFS), with copies stored in several data centers. Why is that? These files are quite large (4 MB, on average), so storing them in a DBMS creates extra workload on the processor and disks. Moreover, the files are unstructured and are always read and written in one piece. Given this pattern, any additional workload on the DBMS is regarded as superfluous and even harmful. There are a number of alternative storage types. For example, the ideal storage for historical analytical data that is accessed sequentially a lot is Hadoop on top of HDDs. But I’m digressing, let’s not forget we’re talking IMDBs vs RDBMSes here.
e. If I don’t have online data, i.e. access time of around 100 ms is fine, and I know I’ll mainly be using SSDs, then an RDBMS is the best option (barring what I said previously).
In practice, this usually comes out to 1 TB or more of data under moderate workloads and without strict requirements for access time.
f. I’d like to conclude this article by touching upon what I try not to do: I try not to mix online and non-online data in one RDBMS.
For example, I wouldn’t store users’ profiles and their authentication history in the same RDBMS with fine tuned caches, whereby users’ profiles (frequently requested data) are cache resident. I’d rather use two DBMSes — an IMDB for profiles and an RDBMS for historical data — which will solve the cold start issue, consume less memory resources, and ensure faster access to the profiles.
Thanks, and stay tuned for more articles.
Published at DZone with permission of Dennis Anikin , DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.