As we are gearing up to do more & more stuff in Voron, it occurred to me that while we have settled on a good technological system for it, we haven’t settled on a real set of conventions for real use. We’re probably going to see a lot of use in Voron, and we want to see some consistency and best practices there.
Voron is a key/value store that expose a sorted tree abstraction. You can have as many trees as you would like. And the keys & values are both arbitrary byte strings. Given that, let us try to bring some order to the mix.
Don’t use the root tree
The root tree should reserved for handling of other trees, and not for the use of data of any kind. The only case where using the root tree is fine is if you don’t have any other trees. That tend to be a rare occasion, though. See next topic.
Have a $metadata tree
Always have a $metadata tree, that gives you information about the actual database you are using. For example, you’ll want to have things like the storage version, the database id (always a guid, to be able to tell dbs from their backups), etc.
Alpha numeric values only for tree names, please
You can use any value as a tree name, but you really want to limit yourself to the printable ASCII set. This is recommended because dumping the data to any other format (think debugging) would be greatly enhanced if you can actually read the tree names.
Use @tree for super trees
It is very common to have trees where all their data is handled via MutiAdd, MultiRead, etc. We call such trees super trees (they are trees that contains trees, also see big table). While they are usually used for indexing, there are many cases where you want to do that for things like queues, general run of the meal data (this is great for holding edges in a graph, for example), etc.
Prefix ix_ for all indexing trees
I’m not so sure about this, but it is worth mentioning. The purpose here is to distinguish between standard trees and trees that can be rebuild from scratch if needed. That can be of help in diagnostics mode, for example.
Dynamic trees should be explained
It is very easy to create trees in Voron. But like files, you don’t just create some around for no purpose. A tree cost 4KB (minimum), and more importantly, if you are looking at your storage, you want to be able to make sense of things. If you are using a tree as a queue for a particular destination, make sure that you name it appropriately.
Alternatively, use the $metadata tree to keep track of what goes where.
Avoid non printable key names
Just like in tree names, you can use any byte string in a key name, but you want to be able to read debug data or run diagnostics, it would really help if you could actually look at the data .
Remember, sequential writes are best
Voron will deal with random writes just fine, but it would be far faster to write if you’ll arrange things so they are sequential. It is fine if you have once in a while a random write. But try to keep things sequential if at all possible. On that node, sequential for us means increasing, all of our optimizations assumes that. Decreasing sequential data is currently not as optimized.
Write and end, delete at start
This is a common operation you need for queues. It is generally better to do that with writing of the data at the end and removing from start. If you can, avoid just deleting stuff all over the place. Again, that works just fine, but we’ve optimized Voron to handle this scenario very well.
Keep the data simple
Voron does a lot with memory mapping. If the data you can use can be read directly, you can literally just access it off our own buffer, and have no copy required at all.
And that is all I have for now…