So the first thing to tackle is the over the wire protocol. RavenDB is a REST based system, working purely within HTTP. A lot of that was because at the time of conception, REST was pretty much the thing, so it was natural to go ahead with that. For many reasons, that has been the right decision.
Being able to debug the over the wire protocol with a widely available tool like Fiddler makes RavenDB much easier to support in production. You can also script it using curl or wget, and in general it is easier to understand what is going on.
On the other hand, there are a bunch of things that we messed up on. In particular, RavenDB is using the HTTP headers to pass the document metadata. At the time, that seemed like an elegant decision and something that was easily in line with how REST is supposed to work. In practice, this limited the kind of metadata that we can use to stuff that can pass through HTTP headers, and forced some constraints on us (case insensitivity, for example). There have also been several cases where proxies of various kinds inserted their own headers, which ended up as metadata in RavenDB, sometimes resulting in bad things happening.
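To make those two problems concrete, here is a minimal Python sketch of what a header-based metadata scheme runs into. It is an illustration only, not RavenDB's actual code: the `headers_to_metadata` function and the `Reviewed-By` header are hypothetical names made up for the example.

```python
def headers_to_metadata(headers):
    """Treat every incoming header as document metadata (hypothetical scheme)."""
    metadata = {}
    for name, value in headers:
        # HTTP header names are case-insensitive, so the only reliable key
        # is a normalized one. The casing the user chose is lost here.
        metadata[name.lower()] = value
    return metadata


request_headers = [
    ("Content-Type", "application/json"),
    ("Reviewed-By", "ops-team"),   # hypothetical user metadata
    ("Via", "1.1 corp-proxy"),     # added by a proxy along the way
]

metadata = headers_to_metadata(request_headers)

# The user's key no longer exists under its original casing...
print("Reviewed-By" in metadata)
# ...and the proxy's header now looks like any other piece of metadata.
print(sorted(metadata))
```

Note also that every value has to be a flat string, so anything nested (a list of tags, say) would need its own encoding on top of the header.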
When looking at RavenDB 4.0, we made the following decisions:
- We are still going to use HTTP as the primary communication mechanism.
- Unless we have a good reason to avoid it, we are going to be human readable over the wire.
So now, instead of sending document metadata as HTTP headers, we send it inside the document and use headers only to control HTTP itself. That said, we have taken the time to analyze our protocol and found multiple places where we can do much better. In particular, the existence of web sockets means that there are certain scenarios that just became tremendously simpler for us. RavenDB has several needs for bidirectional communication: bulk inserts, the Change API, subscriptions, etc.
The fact that web sockets are now pretty much available across the board means that we have a much easier time dealing with those scenarios. In fact, the presence of web sockets, in general, is going to have a major impact on how we are doing replication, but that will be the topic of another post.
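As for moving the metadata into the document itself, a rough Python sketch of the idea looks like this. The `@metadata` key, the `/docs` URL and the `build_put_request` helper are assumptions made for the sake of the example, not a description of the actual wire format.

```python
import json
from urllib.parse import urlencode


def build_put_request(doc_id, body, metadata):
    """Build a (url, headers, payload) triple for storing a document."""
    payload = dict(body)
    # Assumption: metadata travels under a reserved key inside the JSON body.
    payload["@metadata"] = metadata
    # Headers now describe only the HTTP exchange itself.
    headers = {"Content-Type": "application/json"}
    url = "/docs?" + urlencode({"id": doc_id})  # hypothetical endpoint
    return url, headers, json.dumps(payload)


url, headers, wire = build_put_request(
    "users/1",
    {"Name": "Alice"},
    {"Reviewed-By": "ops-team", "Tags": ["vip", "beta"]},
)

roundtrip = json.loads(wire)
print(wire)
```

Because the metadata is plain JSON, key casing survives the round trip, nested values need no special encoding, and whatever headers a proxy adds stay out of the document.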
Beyond the raw over the wire protocol, we also looked at a common source for bugs, issues, and trouble: IIS. Hosting under IIS means that you are playing by IIS rules, which can be really hard to do if you aren’t a classic web application. In particular, IIS has certain ideas about the size of requests, their duration, the time you have to shut down or start up, etc. In certain configurations, IIS will just decide to puke on you (if you have a long URL, it will return 404, with no indication why, for example), resulting in escalated support calls. One particular support call happened because a database took too long to open (as a result of a rude shutdown by IIS), which resulted in IIS aborting the thread and hanging the entire server because of an abandoned mutex. Fun stuff like that.
In RavenDB 4.0, we are going to be using Kestrel exclusively, which simplifies our life considerably. You can still front it with IIS, of course (or nginx, etc.), for operational reasons, logging, and so on. But this means that RavenDB will be running in its own process, worrying about its own initialization, shutdown, etc. It makes our life much easier.
That is about it for the protocol bits. In the next post, I’ll talk about the most important aspect of a database: how the data actually gets stored on disk.
More posts in "The design of RavenDB 4.0" series:
- (12 May 2016) Separation of indexes and documents
- (10 May 2016) Voron has a one track mind
- (05 May 2016) Physically segregating collections
- (03 May 2016) Making Lucene reliable
- (28 Apr 2016) The implications of the blittable format
- (26 Apr 2016) Voron takes flight
- (22 Apr 2016) Over the wire protocol
- (20 Apr 2016) We already got stuff out there
- (18 Apr 2016) The general idea