Curator's Note: The content of this article was originally written by Adam Abrevaya over at the NuoDB blog.
Sharding is generally done for one or both of two reasons:
- To increase I/O bandwidth
- To partition in-memory data for more efficient usage of distributed caching
The former is simply about accelerating I/O-bound workloads, such as large scans or logging-style workloads. To take an obvious example: imagine doing a “SELECT count(*)” on a very large table with no “WHERE” clause or without an index. In this case the database system has to read every row, and unless you have a large amount of memory relative to the table size and have the table substantially preloaded into memory (e.g., you’re running the query for a second time) you are going to be disk I/O bound. Your query will run at disk/SSD speed, no matter what database you are using. The WRITE/UPDATE case is analogous.
By splitting the database into multiple databases you can, in certain cases, accelerate this. If your multiple databases are carefully arranged relative to your queries then there are circumstances in which you can obtain some I/O parallelism. Put rows with keys that start with A-H in database 1, I-P in database 2, and Q-Z in database 3. Then write your application so that it queries the right database, and query all three in parallel when appropriate (eg the “SELECT count(*)” example above). Here’s a picture of that:
The second reason for sharding relates to caching. For the purposes of discussion imagine a shared-disk MySQL, PostgreSQL, or SQL Server database system with multiple database server machines. If queries are randomly distributed to the machines then each machine will be caching a mix of data from the different queries. A given query might not go to the machine that has the most relevant data in cache. In fact all of the machines may have ejected the relevant data whilst maintaining multiple distributed copies of some other data. This is obviously very inefficient.
If, in our example, you could arrange to group similar queries/updates and direct each group at a particular machine then the machines will cache data for those specific interactions and you will have a much higher rate of cache hits. Sharding achieves this naturally because you have multiple databases managed by different machines and queries/updates get directed to the right machines.
Sharding is used widely, but that doesn’t make it a great idea. There are lots of reasons that you should avoid sharding if you possibly can:
- It doesn’t work in the general case. It’s easy to contrive simple cases in which a database can be readily partitioned into several databases and “scale-out” can be achieved. A good example is single-record Primary Key lookups/updates. If all you want to do is GET/PUT a record in the database system then that would work. But the moment you want to run a QUERY/UPDATE that touches rows in multiple partitions things start getting very difficult. It is probably very slow and the results are non-transactional. When someone tells you they have a sharded database solution that goes very fast you should take careful note of exactly what workloads they are running against exactly what schema.
- Sharding locks you into a solution to yesterday’s problem. Today most organizations do agile development, and it is a tenet of “agile” that you simply don’t know what the system might have to do this time next year. Limiting the range of database interactions by sharding the system is a pretty harsh up-front decision. It’s ironic that at a time when application developers place a high priority on flexible schemas and other don’t-need-to-plan-ahead requirements for modern database systems, that sharding is still seen as a solution to scaling a database.
- You no longer have a single consistent database. Not only are there multiple databases in terms of storage, but the state of the database is not transactionally managed. Or if it is transactionally managed then you have to use 2-phase commit protocols which are both slow and likely to freeze out other activity in the system for the duration of the transaction. Lack of transactional consistency creates major downstream costs in relation to data errors and recovery. And of course it increases application development and maintenance costs.
- A sharded system has N-times more complexity of administration. You now have to have N-times more fail-over machines which have to handle complex partial-failover scenarios. You have N-times more backups, which again are not likely to represent any actual global state of the system at any known point in time. You have to upgrade, optimize, tune, modify schemas, etc for N machines. You also have to manage security on N machines. Some people enjoy that stuff, but that doesn’t make it optimal. Consider what needs to happen when N shards are not enough – the sharding algorithm would need to change in careful orchestration with a data migration activity.
- Cost, cost, cost! The future is surely about systems that require less and less maintenance. We want to build systems in which a single administrator can manage thousands of databases. This already exists for the Web server layer of the stack, the application server layer, the storage layer, the network layer, etc. Sharding takes the database layer in the opposite direction. For reasons outlined in (4) above, sharded database solutions are very costly to maintain. Like most costly things, you should only do this if you have to. But if you don’t have to then it makes no sense.
What is the NuoDB answer?
The capabilities are easy to describe, but the mechanisms are somewhat complex. The NuoDB capabilities are as follows:
- NuoDB delivers all the benefits of sharding and more, without sharding.
- Although the system runs on many servers there is only one logical database, and any query is allowed at any time on any data in the database.
- All interactions with the database are transactional (ACID) with all the data guarantees you would be used to in a traditional database system.
- NuoDB scale-out and scale-in is dynamic, simple and immediate.
- Depending on the connection load-balancing scheme that you adopt you may need to tweak some parameters in the NuoDB Broker.
Sounds great, so how do we achieve all of that?
The parallel I/O aspect of sharding is handled at the storage-level, not at the database-level. NuoDB separates those issues very cleanly, and in fact can run on top of Key Value stores that are capable of doing parallel I/O. In contrast to traditional database systems, NuoDB stores everything as serialized objects, and it doesn’t care how or where they are stored as long as they can be retrieved when asked for. If you design the database system well then I/O bottlenecks can be handled by systems that are explicitly designed to do this. Notably on public clouds this is exactly what is provided anyway, with blob stores like Amazon S3 taking care of all the storage layer stuff.
Note: NuoDB can also use multiple “Storage Managers (SM)” for any given database, primarily for purposes of redundancy (e.g. for resilience to SM failure, or in lieu of backups). You can have multiple entire copies of your database, maintained in a transactionally consistent state in the location of your choice. This can improve performance as there is a natural multiplexing effect. We at NuoDB don’t typically talk about this multiplexing effect, but it is part of the architecture and works quite well.
The cache efficiency benefit of sharding can be achieved in NuoDB by simply ensuring that similar queries are sent to the same set of machines by the connection load-balancer (the “Broker”). NuoDB has pluggable load balancer policies, and a number of them are available out-of-the-box. A common approach to directing similar queries to specific servers or server groups is to use a hash key as a hint to the load balancer. Check-out the samples/plugin directory of your NuoDB installation to see sample code that implements a load balancer policy.
To add capacity you add a machine either through our WebConsole, by adding a Transaction Engine process and in a single-second timeframe your new machine will be up and contributing to capacity. All of this can be automated as well through our nuodbmanager command line interface, see: NuoDB Manager Doc. The brokers immediately know about the new machine so that new connections to the database are mediated without impacting the application logic.
It’s not directly part of this topic, but it is worth noting that a side-effect of the NuoDB architecture is that it offers an intelligent distributed cache to the application. In consequence the design not only obviates the need for sharding, but it also removes the need for caches such as Memcached (see previous blog post on this topic:memcached-vs-nuodb-caching). With NuoDB you get very high performance caching behavior for free.
In pictures, here’s what sharding looks like the NuoDB way:
First, we start with a NuoDB Domain that has 3 Hosts. A NuoDB Domain can be configured on demand through a Management Client.
Then a one-machine database is started. The Storage Manager and Transaction Engine are on the same host.
Then we add a Transaction Engine and Storage Manager on a second host, increasing throughput and adding redundancy.
Finally we add a third machine, which is in essence adding a shard.
“NuoDB Sharding” is literally as simple as adding a new machine with a NuoDB Transaction Engine. Note that these pictures don’t include multiple brokers, but we recommend setting up brokers redundantly as well.
No application code was harmed in the making of this distributed database, because:
- NuoDB Brokers hide topology changes from the application
- NuoDB takes care of managing transactional consistency
Database sharding is an approach that has served the industry well over a period of years. But as we move to systems that need to be dynamic, provide on-demand capacity, enable unknown future developments, reduce admin and maintenance costs, and deliver extreme performance, we need to think about next generation database architectures like NuoDB.
If you are looking at building a sharded database solution then it is highly recommended that you take a look at NuoDB as an alternative. We would love to talk to you about the challenges that you are facing.