
Optimizing Performance of RavenDB's Indexing Process


The actual process RavenDB uses to index documents is fairly complex. To understand exactly what happens, I decided to break it down into pseudocode.

It looks something like this:

while database_is_running:
  stale = find_stale_indexes()
  lastIndexedEtag = find_last_indexed_etag(stale)
  docs_to_index = get_documents_since(lastIndexedEtag, batch_size)

  filtered_docs = execute_read_filters(docs_to_index)

  indexing_work = []

  for index in stale:
    index_docs = select_matching_docs(index, filtered_docs)

    if index_docs.empty:
      # nothing in this batch is relevant to the index, so just record progress
      set_indexed(index, lastIndexedEtag)
    else:
      indexing_work.add(index, index_docs)

  # the actual indexing happens only after the work for every stale index is collected
  for work in indexing_work:
    work.index(work.index_docs)
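To make the flow easier to follow, here is a minimal runnable sketch of one pass of that loop in Python, working on plain in-memory lists. It is not RavenDB's implementation. Everything in it is an assumption made for illustration: the `Doc` and `Index` classes, the `run_indexing_pass` function, matching documents to an index by a `collection` field, and advancing an index to the etag of the last document in the batch.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Doc:
    etag: int        # increasing change marker (hypothetical integer form)
    collection: str  # hypothetical field used to match docs to indexes


@dataclass
class Index:
    name: str
    collection: str             # hypothetical matching rule
    last_indexed_etag: int = 0  # progress marker for this index
    entries: List[int] = field(default_factory=list)

    def index(self, docs: List[Doc]) -> None:
        # Stand-in for real indexing: remember which etags were indexed.
        self.entries.extend(d.etag for d in docs)


def run_indexing_pass(
    docs: List[Doc],
    indexes: List[Index],
    batch_size: int,
    read_filter: Callable[[Doc], bool] = lambda d: True,
) -> bool:
    """Run one iteration of the loop. Returns False when no index is stale."""
    latest = max((d.etag for d in docs), default=0)
    stale = [ix for ix in indexes if ix.last_indexed_etag < latest]
    if not stale:
        return False

    # Assumption: start from the stale index that is furthest behind.
    last_indexed_etag = min(ix.last_indexed_etag for ix in stale)
    newer = sorted(
        (d for d in docs if d.etag > last_indexed_etag), key=lambda d: d.etag
    )
    batch = newer[:batch_size]
    batch_end = batch[-1].etag  # batch is non-empty because something is stale

    filtered = [d for d in batch if read_filter(d)]

    indexing_work = []
    for ix in stale:
        # Skip documents this particular index has already processed.
        matching = [
            d for d in filtered
            if d.collection == ix.collection and d.etag > ix.last_indexed_etag
        ]
        if matching:
            indexing_work.append((ix, matching))
        else:
            # Nothing relevant in this batch: only record progress.
            ix.last_indexed_etag = max(ix.last_indexed_etag, batch_end)

    for ix, matching in indexing_work:
        ix.index(matching)
        ix.last_indexed_etag = max(ix.last_indexed_etag, batch_end)
    return True
```

Calling `run_indexing_pass` in a `while` loop until it returns `False` plays the role of `while database_is_running` in the pseudocode, except that this sketch stops once every index has caught up instead of waiting for new documents.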

And now let me show you the areas in which we did some perf work:

All of this gives us a major boost in system performance. Don’t worry, I’ll discuss each part of that work in detail.
