Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Optimizing Performance of RavenDB's Indexing Process

DZone's Guide to

Optimizing Performance of RavenDB's Indexing Process

· Database Zone
Free Resource

Traditional relational databases weren’t designed for today’s customers. Learn about the world’s first NoSQL Engagement Database purpose-built for the new era of customer experience.

The actual process done by RavenDB to index documents is a fairly complex one. In order to understand what exactly happened, I decided to break it apart to pseudo code.

It looks something like this:

<span class="kwrd">while</span> database_is_running:
  stale = find_stale_indexes()
  lastIndexedEtag = find_last_indexed_etag(stale)
  docs_to_index = get_documents_since(lastIndexedEtag, batch_size)
  
  filtered_docs = execute_read_filters(docs_to_index)
  
  indexing_work = []
  
  <span class="kwrd">for</span> index <span class="kwrd">in</span> stale:
    
    index_docs = select_matching_docs(index, filtered_docs)
    
    <span class="kwrd">if</span> index_docs.empty:
      set_indexed(index, lastIndexedEtag)
    <span class="kwrd">else</span>
      indexing_work.add(index, index_docs)
      
  <span class="kwrd">for</span> work <span class="kwrd">in</span> indexing_work:
  
     work.index(work.index_docs)

And now let me show you the areas in which we did some perf work:

All of which gives us a major boost in the system performance. I’ll discuss each part of that work in detail, don’t worry 

Learn how the world’s first NoSQL Engagement Database delivers unparalleled performance at any scale for customer experience innovation that never ends.

Topics:

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}