DZone
Database Zone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
  • Refcardz
  • Trend Reports
  • Webinars
  • Zones
  • |
    • Agile
    • AI
    • Big Data
    • Cloud
    • Database
    • DevOps
    • Integration
    • IoT
    • Java
    • Microservices
    • Open Source
    • Performance
    • Security
    • Web Dev
DZone > Database Zone > Simple Indexes and Performance

Simple Indexes and Performance

See how RavenDB keeps performance high by structuring indexes and the architecture in an optimized way.

Oren Eini user avatar by
Oren Eini
·
Oct. 07, 16 · Database Zone · Analysis
Like (2)
Save
Tweet
2.67K Views

Join the DZone community and get the full member experience.

Join For Free

I outlined the scenario in my previous post, and I wanted to focus on each scenario one at a time. We’ll start with the following simple map index:

image_thumb[3]

In RavenDB 4.0, we have done a lot of work to make this scenario as fast as possible. In fact, this has been the focus of a lot of architectural optimizations. When running a simple index, we keep very few things in managed memory, and even those are relatively transient. Most of our data is in memory mapped files. That means, no high GC cost because we have very few objects getting pushed to Gen1/Gen2. Mostly, this is telling Lucene “open wide, please” and shoving as much data inside as we can.

So we are seeing much better performance overall. But that isn’t to say that we don’t profile (you always do). So here are the results of profiling a big index run:

And this is funny, really funny. What you are seeing there is that about 50% of our runtime is going into those two functions.

What you don’t know is why they are here. Here is the actual code for this:

image

The problem is that the write.Delete() can be very expensive, especially in the common case of needing to index new documents. So instead of just calling for it all the time, we first check if we have previously indexed this document, and only then we’ll delete it.

As it turns out, those hugely costly calls are still a very big perf improvement, but we’ll probably replace them with a bloom filter that can do the same job, but with a lot less cost.

That said, notice that the runtime cost of those two functions together is 0.4 ms under profiler. So while I expect bloom filter to be much better, we’ll certainly need to double check that, and again, only the profiler can tell.

Memory (storage engine) Data (computing) Document Filter (software) career Profile (engineering) garbage collection Lucene

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Top 5 Programming Languages in AI: A Comparison
  • API Security Weekly: Issue 164
  • Deploying Java Applications to AWS Elastic Beanstalk
  • Java: Why Core-to-Core Latency Matters

Comments

Database Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • MVB Program
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends:

DZone.com is powered by 

AnswerHub logo