
Codex KV: Properly Generating the File


For this post, I’m going to be rewriting the CodexWriter class as I would for code that is going into RavenDB.


The previous post has a code sample in it that was, figuratively speaking, physically painful for me to write. Leaving aside the number of syscalls it invokes, the code isn’t efficient by the standards I measure things against now. It uses far too much managed memory, and it is subject to failures as we increase the amount of data we push through it. For this post, I’m going to be rewriting the CodexWriter class as I would for code that is going into RavenDB.

I’m sorry, but there is going to be a big jump in the complexity of the code, because I’m going to try to handle performance, parallelism, and resource utilization all at once. The first thing to do is to go into the project’s settings and enable both unsafe code (without which it is nearly impossible to write high-performance code in C#) and the C# 7.3 language features; we’ll need both.
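If you are following along with an SDK-style project file, a minimal sketch of those settings would look something like this (the property names are standard MSBuild; the rest of the project file is omitted):

<PropertyGroup>
  <!-- allow the unsafe / pointer code used below -->
  <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
  <!-- opt in to the C# 7.3 language features -->
  <LangVersion>7.3</LangVersion>
</PropertyGroup>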

We can divide the task of gathering the inputs into several stages. First, we need to write the data to the file. This is similar to the way we did it before, here is the Add() method:

public void Add(string str)
{
    if (_alreadySeen.Add(str) == false)
        return; // already exists

    // remember where this key starts in the data file
    _currentOffsets.Add(_data.Position);

    _dataWriter.Write(str);

    // close off a segment roughly every million keys
    if (_currentOffsets.Count >= 1024 * 1024)
    {
        CreateSegment();
    }
}

As you can see, there isn’t much that has changed here, but we now have this notion of a segment, which is created every million keys. But what is this segment?

It is a way to refer to a specific section of records in the file. In particular, it has just one primary role: it exists to sort the records. Let’s take a look at the code:

private unsafe class Segment : IComparable<Segment>, IComparer<long>, IDisposable
{
    private readonly CodexWriter _parent;
    private int _current;
    public Task Running;
    private byte* _rawData;
    private readonly List<IDisposable> _toDispose = new List<IDisposable>();
    private readonly List<long> _offsets;

    public Segment(CodexWriter parent, List<long> offsets)
    {
        _parent = parent;
        _offsets = offsets;
        // Map the data file so the keys can be read directly from memory,
        // instead of paying for a syscall on every read.
        var dataMap = MemoryMappedFile.CreateFromFile(_parent._data, null, 0, MemoryMappedFileAccess.Read, HandleInheritability.None, true);
        _toDispose.Add(dataMap);
        var dataAccessor = dataMap.CreateViewAccessor(0, _parent._data.Length, MemoryMappedFileAccess.Read);
        _toDispose.Add(dataAccessor);
        dataAccessor.SafeMemoryMappedViewHandle.AcquirePointer(ref _rawData);
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int ReadVariableSizeInt(ref byte* buffer)
    {
      // redacted
    }

    public Span<byte> GetKeyAtOffset(long index)
    {
        // Each key is stored as a variable-size length prefix followed by its bytes.
        var aLoc = _rawData + index;
        var aSize = ReadVariableSizeInt(ref aLoc);

        return new Span<byte>(aLoc, aSize);
    }

    public Task EnsureSorted()
    {
        if (Running != null)
        {
            return Running;
        }
        // Make sure only a single sort task is ever started for this segment,
        // even if EnsureSorted() is called concurrently.
        var task = new Task(() =>
        {
            _offsets.Sort(this);
        });
        var result = Interlocked.CompareExchange(ref Running, task, null);
        if (result == null)
            task.Start();

        return result ?? task;
    }

    public int Compare(long x, long y)
    {
        var a = GetKeyAtOffset(x);
        var b = GetKeyAtOffset(y);
        return Compare(a, b);
    }

    [DllImport("msvcrt.dll", EntryPoint = "memcmp", CallingConvention = CallingConvention.Cdecl, SetLastError = false)]
    public static extern int Compare(byte* b1, byte* b2, long count);

    private int Compare(Span<byte> a, Span<byte> b)
    {
        fixed (byte* pb = b)
        fixed (byte* pa = a)
        {
            // memcmp the shared prefix; if it is equal, the shorter key sorts first
            var size = Math.Min(a.Length, b.Length);
            var result = Compare(pa, pb, size);
            if (result != 0)
                return result;
            return a.Length - b.Length;
        }
    }
}

There are a few key points here. Instead of using file I/O directly, we are using memory mapped files. Why is that? Because, as we have seen, the cost of syscalls is non-trivial in the extreme, and using memory mapped files means that we can access the data natively, without paying any price beyond a page fault if the data isn’t already in memory.
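One part I glossed over is ReadVariableSizeInt(), whose body is redacted above. Since the keys are written with BinaryWriter.Write(string), which prefixes each string with its byte length as a 7-bit encoded integer, my assumption is that the method decodes that prefix and advances the pointer past it. A minimal sketch, under that assumption (not the actual RavenDB code), might look like this:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int ReadVariableSizeInt(ref byte* buffer)
{
    // Assumed format: 7 bits of payload per byte, high bit set while more bytes follow
    // (the same layout BinaryWriter uses for its string length prefix).
    int value = 0;
    int shift = 0;
    byte b;
    do
    {
        b = *buffer++;                 // consume a byte and advance the caller's pointer
        value |= (b & 0x7F) << shift;  // accumulate the low 7 bits
        shift += 7;
    } while ((b & 0x80) != 0);
    return value;
}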

The EnsureSorted() method is also interesting, because it spawns a new task that sorts the entries inside the segment in parallel with inserting data into the main file. The actual comparisons are handled by the Compare() methods.

As we write the data into the codex, we sort it as we go, but what happens at the end? In this case, we inserted about 13 million items, so we end up with 13 segments, each of them individually sorted. To get the final order, we merge from all of them. Here is the relevant code:

public void Complete()
{
    // flush whatever is left in the current batch as the final segment
    if (_currentOffsets.Count > 0)
        CreateSegment();

    var heap = new SortedSet<Segment>();
    foreach (var segment in _segments)
    {
        // wait for the background sort of each segment before merging
        segment.Running.Wait();
        heap.Add(segment);
    }

    int count = 0;
    while (heap.Count > 0)
    {
        count++;
        // take the segment whose current key is the smallest, emit it,
        // then put the segment back if it still has entries left
        var min = heap.Min;
        heap.Remove(min);
        _dataWriter.Write(min.GetCurrentOffset());
        if (min.MoveNext())
            heap.Add(min);
    }
    _dataWriter.Write(count);
}

private unsafe class Segment : IComparable<Segment>, IComparer<long>, IDisposable
{
  // (continuation of the Segment class shown above)
  [DllImport("msvcrt.dll", EntryPoint = "memcmp", CallingConvention = CallingConvention.Cdecl, SetLastError = false)]
  public static extern int Compare(byte* b1, byte* b2, long count);

  private int Compare(Span<byte> a, Span<byte> b)
  {
      fixed (byte* pb = b)
      fixed (byte* pa = a)
      {
          var size = Math.Min(a.Length, b.Length);
          var result = Compare(pa, pb, size);
          if (result != 0)
              return result;
          return a.Length - b.Length;
      }
  }

  public int CompareTo(Segment other)
  {
      if (_current >= _offsets.Count)
          return -1;
      if (other._current >= other._offsets.Count)
          return 1;

      return Compare(GetCurrentKey(), other.GetCurrentKey());
  }
}

This uses the SortedSet as a heap, so we can always pull the minimum value from the segments in the set. Note that we need to wait for the parallel sorts to complete, and then we merge from all of the segments into the final result. We can write the result of the sort directly to the end of the file.
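For completeness, this is roughly how I would expect the whole thing to be driven; the CodexWriter constructor and ReadKeys() are hypothetical stand-ins, since the post doesn’t show them:

// Hypothetical usage sketch; the constructor signature, disposability, and ReadKeys() are assumptions.
using (var codex = new CodexWriter("codex.data"))
{
    foreach (var key in ReadKeys())
        codex.Add(key);   // dedup, write to the data file, cut a segment every ~million keys

    codex.Complete();     // wait for the background sorts, then merge the segments
}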

Overall, this process takes 59.01 seconds to complete. Remember that this is when we are pushing unsorted data through. If we pass the data sorted, we get a significant improvement and it only takes 35.91 seconds.

To compare, I ran the same sort of test on Voron, and I got 59.15 seconds for the unsorted case; for the sorted case, I got 13.85 seconds. And that is with Voron doing full ACID writes, which we obviously aren’t doing in Codex.

I guess that spending four to five years with a whole team doing performance optimization is a better way to get storage performance than a couple of evenings hacking before I go to bed. Who knew?


Topics: database, kv store, ravendb, c#
