DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
  1. DZone
  2. Culture and Methodologies
  3. Career Development
  4. Solving a RavenDB Technical Interview Question

Solving a RavenDB Technical Interview Question

Hibernating Rhinos goes open kimono and shows us the interview question they ask candidates applying to the firm — along with the ideal solution.

Oren Eini user avatar by
Oren Eini
·
Jun. 06, 16 · Opinion
Like (2)
Save
Tweet
Share
2.69K Views

Join the DZone community and get the full member experience.

Join For Free

For the interview question answered in this article, see this post.

So the first thing that we need to decide is what will be the data format on the trie we implemented to answer the original question. Since we have only 32KB to work with, we need to consider the actual storage carefully.

32KB is small enough to fit in an unsigned short, so all the references we’ll use will be shorts. We also need to store a bit of metadata, so we’ll use the first 4 bytes as the header for just that.

  • ushort SpaceUsed;
  • ushort LastAllocation;

Now that we have this, we need to decide how to store the actual data. To make things easy, we are going to define the following way to allocate memory:

public bool TryAllocate(sbyte size, out ushort pos)
{
    pos = 0;
    // we use one byte for the size
    if( _header->LastAllocation + size + 1 > 32*1024)
        return false;
    pos = _header->LastAllocation + 1;
    _header->LastAllocation+=size +1;
    _header->SpaceUsed+=size+1;
    _mem[pos - 1] = size;// size marker
    return true;
}

public void Free(ushort pos)
{
    _header->SpaceUsed -= _mem[pos-1]-1;
    _mem[pos-1] *= -1; // delete marker
}

This is about the simplest way that you can go about doing things. Note that we use a length prefix value, and we limit allocations to a max of 127 bytes each. We use a negative size to indicate a delete marker.

So basically, now we have a pretty trivial way to allocate memory, and we can implement the trie as we would normally. There are a few wrinkles, however.

Deleting the memory doesn’t actually make it eligible for reuse, and it is quite likely to get fragmented easily. In order to handle that, we will track the amount of space that is used, and if we get to the end of the space, we’ll check the UsedSpace value. If this is still too little, we can abort -- there is no available space here. However, if we go to the end of the buffer, but we have free space available, we can do the following:

  • Scan the buffer for available spots (find available locations that have negative size).
  • Failing that, we will copy the data to a temporary buffer, then re-add everything to the buffer from scratch. In other words, we defrag it.

Another issue we have is that the maximum size we can allocate is 127. This value is big enough that most actual strings can fit into it nicely, but a trie already has the property that a large string can be broken into pieces, so we’ll just cut each node in the trie to a max size of 127. Actually, the max size is likely to be less than that, because there is also some information that we need to keep track of per entry:

  • byte NumberOfChildren;
  • byte Flags; // node type, (internal, leaf or both)
  • ushort ChildrenPosition;

So in practice we have about 123 bytes to work with for the length. Note that we don’t store the string value of the node’s length (we can get that from the allocation information), and that we store the actual children in an array that is stored separately. This allows us to easily add items to the trie as child nodes. If the node is a leaf node, and we also need to store the actual value (which is 8 bytes), we store that information at the end of the value (giving us 115 bytes for that section of the value).

All in all, there is going to be a bit of pointer arithmetic and a bit of bit counting, but it's likely to be a pretty simple implementation.

Note that additional optimizations would be to try align everything so it will fit into a cache line, trying to place nodes near their children (which are more likely to be followed), etc.

Interview (journalism)

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Do Not Forget About Testing!
  • Best Practices for Writing Clean and Maintainable Code
  • Tech Layoffs [Comic]
  • Core Machine Learning Metrics

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: