Recently, I presented a Tech Talk on sharding, its complications, and complexity.
There are many pros and (even more) cons to sharding, which can be summarized with these two questions:
Why do some DBAs think sharding is the only end game?
Sharding is the only well-known way to scale-out write transactions. Scaling-up has the limit of the maximum size of instance available. Read slaves provide additional read scale, but at the cost of latency, which increases as the write-master nears capacity. Master/Master (including the Galera Replication library, as leveraged by MariaDB Cluster and Percona XtraDB Cluster) would ideally double (or more) the write transaction volume. But if it’s active/active, transaction latency increases. Else if it’s active/passive, then you need to design conflict resolution to ensure the secondary master’s writes don’t conflict with the primary. And Galera only provides limited write scale due to requiring the same updates be applied to all the nodes in the cluster.
What are the long-term costs of sharding?
Cross-node transactions are problematic with sharding. Since no RDBMS manages all the separate shards together, either cross-node transactions are removed (either not tolerated, or supported by cross-node replication) or new application logic must be created and maintained. These kinds of significant application changes are required not just at build time, but as ongoing application maintenance, or else business rule flexibility suffers. Very complicated infrastructure is needed. Each shard typically has two to three members, including a primary and secondary/DR. Pinterest uses 4:2 nodes in master/master for HA (only one accepting application transactions), a slave for additional DR, and another slave to run backups. Then you need some kind of the shard-key management, typically a KVS like memcache, and its redundant systems. All of this adds up to a lot of (redundant) hardware and DBA plus DevOps OPEX.
Some of the largest companies have challenges sharding MySQL.
Facebook invested millions building an automated sharding framework and is okay with latency between shards. Google replaced their in-house sharded array(s) of MySQL with F1, and later with Spanner, which itself doesn’t support online schema changes, nor dynamic PK/FK referential integrity, nor is it MySQL native.
Even Oracle 12c’s new release, which leverages sharding, says:
Transactions that require high performance must be single-shard transactions that typically access 10s or 100s of rows. For example, lookup and update of a customer’s billing record, lookup and update of a subscriber’s documents, etc. There is no communication or coordination between shards for high performance transactions. Multishard operations and non-shard key access are also supported but with a reduced expectation for performance. — Oracle Sharding FAQ (emphasis added)
It is worth noting that even the biggest player in the room has difficulties with scaling-out sharded OLTP high-performance, low-latency transactions.
In the end, sharding will always require a separate RDBMS for each shard, which means any coordination, specifically cross-shard ACID transactions, are either unsupported or require significant application and/or middleware customization.