RavenDB Retrospective: Explicit Indexes and Auto Indexes
RavenDB is built to help avoid queries falling over when data sets become too large. How does it accomplish this? Read on to find out more.
* That isn’t strictly true. We have Data Exploration, which does just that, but we don’t provide an explicit API for it. It is a DBA-driven feature ("I want this report with a minimum of fuss, regardless of how much it is going to cost me") rather than an API that is exposed.
What this means is that the cost of query operations in RavenDB is always going to be O(log N) instead of O(N). How does this relate to the topic of RavenDB retrospectives?
One of the things that I kept seeing over and over as a database consultant was that databases are complex, and that it is easy to write a query that works perfectly fine for a period of time, then falls over completely once the size of the data crosses a certain threshold. Queries that use table scans are particularly vulnerable to this issue.
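To make the threshold effect concrete, here is a minimal sketch that counts how many rows a scan has to examine versus how many probes a lookup in a sorted index needs. It is plain Python over an in-memory list, not RavenDB code; `scan_steps` and `index_steps` are hypothetical names used only for this illustration.

```python
def scan_steps(rows, target):
    """Table scan: examine rows one by one until the target is found."""
    steps = 0
    for value in rows:
        steps += 1
        if value == target:
            break
    return steps


def index_steps(sorted_keys, target):
    """Index lookup: binary search over sorted keys, counting probes."""
    lo, hi, steps = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps


# Worst case for the scan: the target is the last row.
for n in (1_000, 1_000_000):
    keys = list(range(n))
    print(n, scan_steps(keys, n - 1), index_steps(keys, n - 1))
```

The scan count grows in step with the data, while the probe count barely moves. That is why a scan-based query can look fine on a small data set and then fall over later.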
One of the design goals for RavenDB was to avoid that completely. We did it by simply forbidding any query that doesn’t have an index. Initially, that was a pretty annoying requirement, because every time you needed a new query, you had to go ahead and create an index. But early on we got the Auto Indexes feature.
Basically, it means that you can query RavenDB without specifying which index you want to use, at which point the query optimizer will inspect the query and decide which index can serve it. The most interesting point here is that if there isn’t an index that can serve this query, the query optimizer is going to create one on the fly. See the previous post about BASE indexes and how we can afford to do that.
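The decision described above can be sketched as a toy model. This is an assumption-laden illustration, not RavenDB's optimizer or its API: `ToyOptimizer` and `index_for` are hypothetical names, and an index is modelled as nothing more than a collection name plus the set of fields it covers.

```python
class ToyOptimizer:
    """Toy model of index selection; not RavenDB's actual implementation."""

    def __init__(self):
        # Each index is modelled as (collection, frozenset of indexed fields).
        self.indexes = []

    def index_for(self, collection, fields):
        """Return (index, created) for a query over the given fields."""
        wanted = frozenset(fields)
        # Reuse any existing index on this collection that covers the query.
        for coll, indexed in self.indexes:
            if coll == collection and wanted <= indexed:
                return (coll, indexed), False
        # Nothing can serve the query, so create an index on the fly.
        created = (collection, wanted)
        self.indexes.append(created)
        return created, True


opt = ToyOptimizer()
_, created_first = opt.index_for("Users", ["Name"])
_, created_again = opt.index_for("Users", ["Name"])
print(created_first, created_again)  # True False
```

The first query finds nothing that can serve it, so an index is created. The second identical query reuses that index, and no query ever runs without one.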
The fun part here is that the query optimizer actually learns over time, and it will shape its indexes to best fit the kind of queries you are making. It also makes RavenDB much more robust against New Version Degradation effects. NVD is what happens when you push out a new version that has slightly different queries, which makes the previously used indexes ineffective and forces all your queries to become full table scans. Here is an example of the kind of subtle issues that this can cause. With RavenDB, when you use auto indexes (in other words, when you don’t explicitly state which index to use), the query optimizer will take care of that, and it will create all the appropriate indexes (and retire the unused ones) for you.
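One way to picture the "shaping" and "retiring" behavior is the toy model below. It assumes a simplification that is mine, not the article's: a single auto index per collection that is widened whenever a query asks for a field it does not cover, with the superseded index moved to a retired list. `LearningOptimizer`, `query`, `active`, and `retired` are hypothetical names for this sketch only.

```python
class LearningOptimizer:
    """Toy model: one auto index per collection that widens as queries change."""

    def __init__(self):
        self.active = {}    # collection -> frozenset of indexed fields
        self.retired = []   # (collection, fields) of superseded indexes

    def query(self, collection, fields):
        """Return the set of fields covered by the index serving this query."""
        wanted = frozenset(fields)
        current = self.active.get(collection)
        if current is not None and wanted <= current:
            return current  # the existing index already serves the query
        # Build a wider index covering old and new fields, then retire the old one.
        merged = wanted if current is None else (current | wanted)
        if current is not None:
            self.retired.append((collection, current))
        self.active[collection] = merged
        return merged


opt = LearningOptimizer()
opt.query("Users", ["Name"])           # version 1 of the application
opt.query("Users", ["Name", "Age"])    # version 2 ships a slightly different query
print(sorted(opt.active["Users"]))     # ['Age', 'Name']
print(len(opt.retired))                # 1
```

In this model, the new version's query never degrades into a scan. The index is reshaped to fit it, and the index that only served the old query is retired.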
This, in particular, is a feature that I’m really proud of. It requires very little from the user to work with, and it gets the Right Thing Done.
Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.