Many database vendors would like me to take a look at their products and consider adopting them for all sorts of purposes. Often they’re pitching something quite new and unproven as a replacement for mature, boring technology I’m using happily.
I would consider a new and unproven technology, and I often have. As I’ve written previously, though, a real evaluation takes a lot of effort, and that makes most evaluations non-starters.
Perhaps the most important thing I’m considering is whether the product is mature. There are different levels of maturity, naturally, but I want to understand whether it’s mature enough for me to take a look at it. And in that spirit, it’s worth understanding what makes a database mature.
For my purposes, maturity really means demonstrated capability and quality with a lot of thought given to all the little things. The database needs to demonstrate the ability to solve specific problems well and with high quality. Sometimes that evidence comes from customers, sometimes from a large user community (who may not be customers).
Here are some things I’ll consider when thinking about a database, in no particular order.
- What problem do I have? It’s easy to fixate on a technology and start thinking about how awesome it is. Some databases are just easy to fall in love with, to be frank. Riak is in this category. I get really excited about the features and capabilities, the elegance. I start thinking of all the things I could do with Riak. But now I’m putting the cart before the horse. I need to think about my problems first.
- Query flexibility. Does it offer sophisticated execution models to handle the nuances of real-world queries? If not, I’ll likely run into queries that run much more slowly than they should, or that have to be pulled into application code. MySQL has lots of examples of this. Queries such as `ORDER BY` with a `LIMIT` clause, which are super-common for web workloads, did way more work than they needed to in older versions of MySQL. (It’s better now, but the scars remain in my mind.)
- Query flexibility. The downside of a sophisticated execution engine with smart plans is that those plans can go very wrong. One of the things people like about NoSQL is the direct, explicit nature of queries, where an optimizer can’t be too clever for its own good and cause a catastrophe. A database needs to make up its mind: if it’s simple and direct, OK. If it’s going to be smart, the bar is very high. A lot of NoSQL databases that offer some kind of “map-reduce” query capability fall into the middle ground here: key-value works great, but the map-reduce capability is far from optimal.
- Data protection. Everything fails, even things you never think about. Does it automatically check for and guard against bit rot, bad memory, partial page writes, and the like? What happens if data gets corrupted? How does it behave?
- Backups. How do you back up your data? Can you do it online, without interrupting the running database? Does it require proprietary tools? If you can do it with standard Unix tools, there’s infinitely more flexibility. Can you do partial/selective backups? Differential backups since the last backup?
- Restores. How do you restore data? Can you do it online, without taking the database down? Can you restore data in ways you didn’t plan for when taking the backup? For example, if you took a full backup, can you efficiently restore just a specific portion of the data?
- Replication. What is the model: synchronous, async, partial, blend? Statement-based, change-based, log-based, or something else? How flexible is it? Can you do things like apply intensive jobs (schema changes, big migrations) to a replica and then swap the master and replica roles? Can you filter and delay and fidget with replication in all sorts of ways? Can you write to replicas? Can you chain replication? Replication flexibility is an absolutely killer feature. Operating a database at scale is very hard with inflexible replication. Can you do multi-source replication? If replication breaks, what happens? How do you recover it? Do you have to rebuild replicas from scratch? Lack of replication flexibility and operability is still one of the major pain points in PostgreSQL today. Of course, MySQL’s replication provides a lot of that flexibility, but historically it didn’t work reliably, and gave users a huge foot-gun. I’m not saying either is best, just that replication is hard but necessary.
- Write stalls. Almost every new database I’ve seen in my career, and a lot of old ones, has had some kind of write stalls. Databases are very hard to create, and typically it takes 5-10 years to fix these problems if they aren’t precluded from the start (which they rarely are). If you don’t talk about write stalls in your database in great detail, I’m probably going to assume you are sweeping them under the rug or haven’t gone looking for them. If you show me you’ve gone looking for them and either show that they’re contained or that you’ve solved them, that’s better.
- Independent evaluations. If you’re a solution in the MySQL space, for example, you’re not really serious about selling until you’ve hired Percona to do evaluations and write up the results. In other database communities, I’d look for some similar kind of objective benchmarking and evaluations.
- Operational documentation. How good is your documentation? How complete? When I was at Percona and we released XtraBackup, it was clearly a game-changer, except that there was no documentation for a long time, and this hurt adoption badly. Only a few people could understand how it worked. There were only a few people inside of Percona who knew how to set it up and operate it, for that matter. This is a serious problem for potential adopters. The docs need to explain important topics like daily operations, what the database is good at, what weak points it has, and how to accomplish a lot of common tasks with it. Riak’s documentation is fantastic in this regard. So are MySQL’s and PostgreSQL’s.
- Conceptual documentation. How does it work, really? One database that I think has been hurt a little bit by not really explaining how-it-works is NuoDB, which used an analogy of a flock of birds all working together. It’s a great analogy, but it needs to be used only to set up a frame of reference for a deep-dive, rather than as a pat answer. (Perhaps somewhat unfairly, I’m writing this offline, and not looking to see if NuoDB has solved this issue I remember from years ago.) Another example was TokuDB’s Fractal Tree indexes. For a long time it was difficult to understand exactly what fractal tree indexes really did. I can understand why, and I’ve been guilty of the same thing, but I wasn’t selling a database. People really want to feel sure they understand how it works before they’ll entrust it with their data, or even give it a deep look. Engineers, in particular, will need to be convinced that the database is architected to achieve its claimed benefits.
- High availability. Some databases are built for HA, and those need to have a really clear story around how they achieve it. Walk by the booth of most new database vendors at a conference and ask them how their automatically HA solution works, and they’ll tell you it’s elegantly architected for zero downtime and seamless replacement of failed nodes and so on. But as we know, these are really hard problems. Ask them about their competition, and they’ll say “sure, they claim the same stuff, but our code actually works in failure scenarios, and theirs doesn’t.” They can’t all be right.
- Monitoring. What does the database tell me about itself? What can I observe externally? Most new or emerging databases are basically black boxes. This makes them very hard to operate in real production scenarios. Most people building databases don’t seem to know what a good set of monitoring capabilities even looks like. MemSQL is a notable exception, as is DataStax Enterprise. As an aside, the astonishing variety of opensource databases that are not monitorable in a useful way is why I founded VividCortex.
- Tooling. It can take a long time for a database’s toolbox to become robust and sophisticated enough to really support most of the day-to-day development and operational duties. Good tools for supporting the trickier emergency scenarios often take much longer. (Witness the situation with MySQL HA tools after 20 years, for example.) Similarly, established databases often offer rich suites of tools for integrating with popular IDEs like Visual Studio, spreadsheets and BI tools, migration tools, bulk import and export, and the like.
- Client libraries. Connecting to a database from your language of choice, using idiomatic code in that language, is a big deal. When we adopted Kafka at VividCortex, it was tough for us because the client libraries at the time were basically only mature for Java users. Fortunately, Shopify had open-sourced their Kafka libraries for Go, but unfortunately they weren’t mature yet.
- Third-party offerings. Sometimes people seem to think that third-party providers are exclusively the realm of open-source databases, where third parties are on equal footing with the parent company, but I don’t think this is true. Both Microsoft and Oracle have enormous surrounding ecosystems of companies providing alternatives for practically everything you could wish, except for making source code changes to the database itself. If I have only one vendor to help me with consulting, support, and other professional services, it’s a dubious proposition. Especially if it’s a small team that might not have the resources to help me when I need it most.
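As a rough illustration of the `ORDER BY` with `LIMIT` item in the list above, here is a minimal Python sketch of why a naive plan does far more work than it needs to. It is not MySQL’s implementation, and the function names and the toy `rows` table are hypothetical. It only contrasts sorting every row before keeping ten of them with keeping the ten best candidates during a single scan.

```python
import heapq
import random


def top_n_full_sort(rows, n, key):
    """Sort every row, then throw away all but the first n."""
    return sorted(rows, key=key)[:n]


def top_n_bounded(rows, n, key):
    """Scan the rows and keep only the n smallest by key."""
    return heapq.nsmallest(n, rows, key=key)


def by_created(row):
    """Sort key: the hypothetical 'created' column."""
    return row["created"]


# Hypothetical table: 100,000 rows with unique "created" values.
rng = random.Random(42)
created_values = rng.sample(range(1_000_000), 100_000)
rows = [{"id": i, "created": c} for i, c in enumerate(created_values)]

# Roughly: SELECT * FROM rows ORDER BY created LIMIT 10
slow = top_n_full_sort(rows, 10, by_created)
fast = top_n_bounded(rows, 10, by_created)
assert slow == fast  # same rows in the same order
```

Both functions return the same rows. The difference is that the first one orders all 100,000 rows to answer a question about ten of them, which is the kind of wasted work I mean when I ask whether the execution engine handles common query shapes well.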
The most important thing when considering a database, though, is success stories. The world is different from a few decades ago, when the good databases were all proprietary and nobody knew how they did their magic, so proofs of concept were a key sales tactic. Now, most new databases are opensource and the users either understand how they work, or rest easy in the knowledge that they can find out if they want. And most are adopted at a ratio of hundreds of non-paying users for each paying customer. Those non-paying users are a challenge for a company in many ways, but at least they’re vouching for the solution.
Success stories and a community of users go together. If I can choose from a magical database that claims to solve all kinds of problems perfectly, versus one that has broad adoption and lots of discussions I can Google, I’m not going to take a hard look at the former. I want to read online about use cases, scaling challenges met and solved, sharp edges, scripts, tweaks, tips and tricks. I want a lot of Stack Exchange discussions and blog posts. I want to see people using the database for workloads that look similar to mine, as well as different workloads, and I want to hear what’s good and bad about it. (Honest marketing helps a lot with this, by the way. If the company’s own claims match bloggers’ claims, a smaller corpus online is more credible as a result.)
These kinds of dynamics help explain why most of the fast-growing emerging databases are opensource. Opensource has an automatic advantage because of free users vouching for the product. Why would I ever consider a proof-of-concept to do a sales team a favor, at great cost and effort to myself, when I could use an alternative database that’s opensource and has an active community discussing the database? In this environment, the proof of concept selling model is basically obsolete for the mass market. It may still work for specialized applications where you’ll sell a smaller number of very pricey deals, but it doesn’t work in the market of which I’m a part.
In fact, I’ve never responded positively to an invitation to set up a PoC for a vendor (or even to provide data for them to do it). It’s automatically above my threshold of effort. I know that no matter what, it’s going to involve a huge amount of time and effort from me or my teams.
There’s another edge-case—databases that are built in-house at a specific company and then are kicked out of the nest, so to speak. This is how Cassandra got started, and Kafka too. But the difference between a database that works internally for a company (no matter how well it works for them) and one that’s ready for mass adoption is huge, and you can see that easily in both of those examples. I suspect few people have that experience to point to, but probably a lot of readers have released some nifty code sample as open-source and seen how different it is to create an internal-use library, as opposed to one that’ll be adopted by thousands or more people.
Remarkably few people at database companies seem to understand the things I’ve written about above. The ones who do (and I’ve named some of them) might have great success as a result. Companies that aren’t run by people who have recently operated databases in their target markets will probably have a much harder time of it.
I don’t make much time to coach companies on how they should approach me. It’s not my problem, and I feel no guilt saying no without explanation. (One of my favorite phrases is “no is a complete sentence.”) But enough companies have asked me, and I have enough friends at these companies, that I thought it would be helpful to write this up. Hopefully this serves its intended purpose and doesn’t hurt any feelings. Please use the comments to let me know if I can improve this post.