
Optimizing Performance of RavenDB's Indexing Process


The process RavenDB uses to index documents is fairly complex. To understand exactly what happens, I decided to break it down into pseudocode.

It looks something like this:

while database_is_running:
  stale = find_stale_indexes()
  lastIndexedEtag = find_last_indexed_etag(stale)
  docs_to_index = get_documents_since(lastIndexedEtag, batch_size)

  filtered_docs = execute_read_filters(docs_to_index)

  indexing_work = []

  for index in stale:

    index_docs = select_matching_docs(index, filtered_docs)

    if index_docs.empty:
      set_indexed(index, lastIndexedEtag)
    else:
      indexing_work.add(index, index_docs)

  for work in indexing_work:

    work.index(work.index_docs)
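
To make the flow above easier to follow, here is a minimal runnable sketch of one pass of that loop in Python. It is not RavenDB's actual code: the `Doc` and `Index` classes, the `run_indexing_batch` function, the idea that an index covers exactly one collection, and the choice to advance each index to the last etag of the batch are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    etag: int          # monotonically increasing change marker
    collection: str    # hypothetical: used to match documents to indexes


@dataclass
class Index:
    name: str
    collection: str    # hypothetical: each index covers one collection
    last_indexed_etag: int = 0
    entries: list = field(default_factory=list)

    def index(self, docs):
        # Stand-in for the real indexing work: just record the etags.
        self.entries.extend(d.etag for d in docs)


def run_indexing_batch(docs, indexes, batch_size=2, read_filter=lambda d: True):
    """Run one pass of the loop body. Returns False when nothing is stale."""
    newest = max((d.etag for d in docs), default=0)
    stale = [ix for ix in indexes if ix.last_indexed_etag < newest]
    if not stale:
        return False

    # Start from the stale index that is furthest behind.
    last_indexed_etag = min(ix.last_indexed_etag for ix in stale)
    pending = sorted((d for d in docs if d.etag > last_indexed_etag),
                     key=lambda d: d.etag)
    batch = pending[:batch_size]
    batch_end = batch[-1].etag

    filtered = [d for d in batch if read_filter(d)]

    indexing_work = []
    for ix in stale:
        # Only documents this index covers and has not seen yet.
        matching = [d for d in filtered
                    if d.collection == ix.collection
                    and d.etag > ix.last_indexed_etag]
        if not matching:
            # Nothing to do for this index: just mark the batch as seen.
            ix.last_indexed_etag = max(ix.last_indexed_etag, batch_end)
        else:
            indexing_work.append((ix, matching))

    for ix, matching in indexing_work:
        ix.index(matching)
        ix.last_indexed_etag = max(ix.last_indexed_etag, batch_end)
    return True


if __name__ == "__main__":
    docs = [Doc(1, "Users"), Doc(2, "Orders"), Doc(3, "Users")]
    users = Index("Users/ByName", "Users")
    orders = Index("Orders/ByTotal", "Orders")
    while run_indexing_batch(docs, [users, orders]):
        pass
    print(users.entries, orders.entries)
```

Note how an index with no matching documents in a batch still has its etag advanced. That is the `set_indexed` branch in the pseudocode, and it keeps such an index from being reported as stale forever.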

And now let me show you the areas in which we did some perf work:

All of this gives us a major boost in system performance. I’ll discuss each part of that work in detail, don’t worry.
