If you’re a regular reader of this blog, then you’ve seen us use the word “architecture” a lot. Like, a lot. Even when a post isn’t tagged with architecture, that’s what we’re really getting at when we dive into our use of MVCC, caching, or replication.
In a way, it’s interesting that we write and talk so much about what happens within our system, because our high-level goal has always been to make NuoDB as simple to use as possible. The theme over the next few months on this blog is going to be all the things we’re adding or extending to make it easier to work with a database at scale. Part of that is handling all the complicated architectural details internally, so that the user experience is about picking a few simple settings and then getting on with the task of developing an application or maintaining some service.
So why are we always on about the architecture if we don’t want our users having to think about it? Honestly, I think one reason is that it’s really cool. As we’d say here in Boston, it’s “wicked.” When you build large-scale systems you get excited diving into a new design and trying to understand how the pieces fit together. The whole NewSQL movement has brought with it several attempts at delivering the kind of scale and resiliency that has traditionally been hard to provide, and those are fun to sort through.
That leads to the second reason we talk a lot about how our system works. Databases have been around a long time, and people have a lot of ideas about what it takes to design one. Some of the claims we make about scale and elegance seem hard to believe at first, and our users really want to understand what makes us tick. That’s part of why I just wrote a two-part article for InfoQ on our architecture.
NuoDB is built around an architecture that is radically different from that of traditional SQL databases. Often, even if people want to use our database as a self-managing service that requires little interaction, they still want to understand all the pieces. And that gets to the third reason I think people want to understand the architecture: they want to know what changes because of it, and how they can solve old problems in new ways (e.g., see my recent piece about backup strategies).
The bottom line is that designing for scale isn’t about taking random pieces, fitting them together and hoping all your technologies play nicely. Lots of great systems have been built from collections of technologies, but you have to think carefully about how they interact. I mean, you don’t usually layer two caches on top of each other unless you’re really sure that their policies don’t conflict. It’s our end-to-end architecture that lets us tackle scale-out problems one day, massive density problems the next, and then turn around and provide automation without having to worry about whether all the pieces line up.
In other news
One thing that happened this week is the acquisition of Akiban by FoundationDB. I’m not too surprised by this kind of move. Increasingly what we’re seeing is that there’s a shift back to the power and expressiveness of SQL. I’ll be writing more about that soon so I’ll pass on that topic for now.
Something that’s interesting, at least as I see it, is the argument being made: it’s all about the architecture. Akiban is built for doing SQL queries that are optimized around join patterns. It’s also built assuming a key-value service for durability, which is what Foundation provides. That’s the connection here, and so the theory is that these two technologies can be bolted together to give you a scale-out experience. I say “theory” not to be derisive, but only because I don’t know enough of the details to understand in practice how well they connect.
This is similar to the design already used in NuoDB. We’re a scale-out system that separates the transaction and durability layers, and because of how we work with data internally, we can layer on top of any KV service for durability. You can take advantage of different underlying stores, and the simplicity of the storage interface helps provide greater fault tolerance.
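To make that separation a little more concrete, here is a minimal sketch of a transaction layer sitting on top of a swappable key-value store. To be clear, this is not NuoDB’s actual interface: the names KeyValueStore, InMemoryStore and Transaction are all hypothetical, and the sketch ignores isolation, multi-key atomicity and failure handling entirely. It only illustrates why a durability interface this small is easy to swap out.

```python
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """Hypothetical durability interface: nothing more than get and put."""

    def get(self, key: str) -> Optional[bytes]: ...

    def put(self, key: str, value: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for a real KV service; any object with get/put would do."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value


class Transaction:
    """Buffers writes in memory and touches the store only on commit."""

    def __init__(self, store: KeyValueStore) -> None:
        self._store = store
        self._writes: Dict[str, bytes] = {}

    def read(self, key: str) -> Optional[bytes]:
        # A transaction sees its own uncommitted writes first.
        if key in self._writes:
            return self._writes[key]
        return self._store.get(key)

    def write(self, key: str, value: bytes) -> None:
        self._writes[key] = value

    def commit(self) -> None:
        # Push buffered writes down to the durability layer.
        for key, value in self._writes.items():
            self._store.put(key, value)
        self._writes.clear()

    def rollback(self) -> None:
        # Discard buffered writes; the store never saw them.
        self._writes.clear()
```

Because Transaction only ever calls get and put, you could replace InMemoryStore with a wrapper around some other store without changing the transaction code at all, which is the property the paragraph above is describing.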
I think we’ll see more of these kinds of architectural connections in the coming months. Many of the NoSQL solutions are trying to figure out how to layer something relational or transactional onto their interfaces, and many traditional databases are trying to figure out how to get scale-out properties or at least more cloud-like capabilities.
What’s the point? The whole NewSQL movement is really about taking all the great ideas and programming models that we’ve been using for a long time and challenging assumptions about what the architecture should be to support them. It’s a fun time to be in the distributed computing world. Over the next few months we’ll be talking about cool new stuff that we’re rolling out based on the agility of our architecture. In the meantime, I highly encourage you to go dig a little through what we’ve written, ask some questions and download the product to try it for yourself. It’s wicked awesome.
[ n.b.: yes, the title of this post is a nod to a well-known political construction. ]