Reducing the Cost of Occasionally Needed Information
None of our read scenarios use a certain value, and our write scenarios only occasionally do. What can we do to optimize our performance?
Join the DZone community and get the full member experience.
Join For FreeConsider a case in which you need to call a function, and based on the result of the function and some other information, do some work. That work requires additional information that can only be computed by calling the function.
Sound complex? Let's look at the actual code. I tried to make it simple but with the same spirit.
public void InsertToTree(string key, object value)
{
var page = FindPageFor(key);
Holder holder;
if(page.TryGetValue(key, out holder))
{
holder.Value = value;
return;
}
if(page.Count < 16)
{
page.Add(key, new Holder(value));
return;
}
// need to split the page
var parentPage = /* how do I get this ? */ ;
}
This code has three different code paths. We are inserting a value that is already in the tree, so we update it and return. We insert a new value, but the page has enough space, so we add it and return.
The interesting case for us is that we need to split the page, which requires modifying the parent page (which may require splitting that page, etc).
So, the question is, how do we get the parent page?
This is important because we already found the parent page. We had to go through it to find the actual page we returned in FindPageFor
. So, ideally, we’ll just return the list of parent pages whenever we call FindPageFor
.
The problem is that all read
scenarios — which by far outweigh the write
operations — never need this value. What's more, even write
operations only need it occasionally. On the other hand, when we do need it, we already did all the work needed to just give it to you, and it would be a waste to do it all over again.
We started the process (this particular story spans four years or so) with adding an optional parameter. If you passed a not null value, we’ll fill it with the relevant details. So far, so good, reads will send it null, but all the write
s had to send it, and only a few of them had to actually use it.
The next step was to move the code to use an out lambda. In other words, when you call us, we’ll also give you a delegate that you can call, which will give you the details you need. This way, you’ll only need to pay the price when you actually needed it.
It looks like this:
public Page FindPageFor(string key, out Func<Cursor> findParents)
{
// ...
}
However, that led to a different problem. While we only paid the price of calling the lambda when we needed it, we still paid the price of allocating the lambda itself.
The next step was to create two overloads — one used only for read
s and one for write
s. The writes one allows us to send a struct out parameter. The key here is that because it is a struct, there is no allocation, and the method is written so we put the values that were previously captured on the lambda on the struct. Then, if we need to actually generate the value, we do that — but have no additional allocations.
And this post is now about six times longer than the actual optimization itself.
Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments