I have been blogging for a long time now, and I am quite comfortable in expressing myself, but I was still blown away by this post to the RavenDB mailing list. Mostly because this thread sums up a lot of the core points that led me to design RavenDB the way it is today.
Rasmus Schultz has been able to put a lot of the thought processes behind the RavenDB design into words.
Back when I took my education in systems development, basically, I was taught to build aggregates as large, as complete and as connected as possible. But that was 14 years ago, and I'm starting to think, what they taught me back then was based on the kind of thinking that works for single-user, typically desktop applications, where the entire model was assumed to be in-memory, and therefore had to be traversible, since there was no "engine" you could go back to and ask for another piece of the model.
I can see now why that doesn't make sense for concurrent applications with large models persisted in the background. It just never occurred to me, and looked extremely wrong to me, because that's not how I was taught to think.
Yes. That is the exact problem that I see people run into over and over. They create a highly connected object model, without regards to how they are persisted, and then they run into problems using them. And the assumption that everything is equally costly to read from memory is hugely expensive.
Furthermore, I'm starting to see why NHibernate doesn't really work well for me. So here's the main thing that's starting to dawn on me, and please confirm or correct me on this:
It seems that the idea behind NH is to configure the expected data-access strategies for the model itself. You write configuration-files that define the expected data-access strategies, but potentially, you're doing this based on assumptions about how you might access the data in this or that scenario.
The problem I'm starting to see, is that you're defining these assumptions statically - and while it is possible to deviate from these defined patterns, it's easy to think that once you've defined your access strategies, you're "done", and the model "just works" and you can focus on writing business logic, which too frequently turns out to be untrue in practice.
To be fair, you can specify those things in place, with full context. And I have been recommending to do just that for years, but yeah, that is a very common issue.
This contrasts with RavenDB, where you formally define the access strategies for specific scenarios - rather than for the model itself. And of course the same access strategy may work in different scenarios, but you're not tempted to assume that a single access strategy is going to work for all scenarios.
You're encouraged to think and make choices about what you're accessing and updating in each scenario, rather than just defining one overriding strategy and charging ahead blindly on the assumption that it'll always just work, or always perform well, or always make updates that are sufficiently small to not cause concurrency problems.
Am I catching on?