This series of posts is continuing the work I have outlined in this series. While in the previous series, I focused on the overall picture, here I want to talk about the small things we did to speed up pretty much every aspect of the system. In some cases, those are an order of magnitude differences.
In this post, I want to talk about the following:
Such a simple thing, if you just look at the surface, and yet if you’ll profile the Json.NET code, you’ll see an interesting issue. I took an existing JSON file (about 67MB in size) and continuously wrote it to /dev/null under a profiler. The results are actually quite surprising.
In fact, this is a bit worse.
dotTrace has a great feature that allows you to remove “noise” from a trace. Here is what happens if we eliminate this call:
So, pretty obviously, this is a really expensive call. But how expensive is it, really? Note that this profiling run used a Null Stream, so there isn’t any I/O cost to consider. This is the pure computational cost of calling this method.
Let us look at a deeper profiling of this method:
Basically, before it can write anything, this code needs to go over the string and find out if it has any escape characters. If it does, it needs to write the escape sequence, then resume writing the string. This means that this code has to scan the string that it is writing several times, and much worse, it means that you can’t really do justice with those operations. Consider the string above, writing it out is actually several different calls to the writer. In fact, we have:
The problem here is that we lose a very powerful feature, the ability to batch multiple operations into a single I/O call. Sure, the TextWriter is going to be doing a bit of buffering for us, but that doesn’t help us really. Let us consider one of the most important calls we have in this code:
For the string above, we call this method twice, once for each portion of the string that isn’t escaped. That means that argument validation needs to run each and every time. And we also have to copy the memory in itsy bitsy pieces. Memory copy routines can get far greater speed if they can copy a large bunch of memory all at once, rather than being called on small sections of it.
In short, I hope that this convinced you that escaping strings is expensive. Stupidly so, when you come right down to it.
But what can you do, that is the cost of doing business, right?
Indeed, JSON.Net code is heavily optimized, and there isn’t much you can do about improving things at this point. You are stuck with the JSON format and the implications of it.
This is already a long post but bear with me while I explain the reasoning.
The blittable format isn’t JSON. We read JSON and output JSON, but we aren’t limited to the same constraints as someone who is just consuming JSON. There are two separate use cases for the document using the blittable format. The first is when we need to work with them on the server side. This includes such things as indexing, transformers, patching, etc. For those cases, if there is an escaped value, we need to get the unescaped version. In other words, I need the string with the line break in it, not the \r\n escape sequences. The other common use case, unfortunately, is when we need to write the document to the user, in which case we do need to write it out with all of the escape sequences.
What is left for us, apparently, is to choose our poison. We can optimize for either scenario, but not for both.
Except, that maybe we can. And that is what we actually did.
When we write a string value into the blittable format, we read escape sequences and translate them into their unescaped values. In other words, we take a “\n” and store just the byte 13. But we also remember its position. The string we keep talking about? It would be stored as:
The first byte (10), is the length of the string. Then we have the actual unescaped version. If the server side code wants to read this string, we can just serve it directly from this memory, without any additional work. But what about when we need to write it down, and maintain the escape sequences? That is where the numbers, in the end, come into play. We need to skip 10 bytes past the length, where we’ll encounter 3 bytes: 2, 4, 0.
The first byte is the number of escape sequences in the string. The second is the number of bytes that we can copy until we get to the next escape sequence. Which means that our code for writing string has narrowed down to:
And that, in turn, means that if you don’t have any escape sequences, we can just call memcpy directly to write to the output buffer. And if you do have escape sequences, we just need to scan the escaped characters, we don’t need to scan the entire string.
There is so much saving in the air, it feels like Black Friday.