DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations

Trending

  • Developers Are Scaling Faster Than Ever: Here’s How Security Can Keep Up
  • The Native Way To Configure Path Aliases in Frontend Projects
  • What Is JHipster?
  • Multi-Stream Joins With SQL
  1. DZone
  2. Data Engineering
  3. Databases
  4. What is New in RavenDB 3.0: Query Diagnostics

What is New in RavenDB 3.0: Query Diagnostics

Oren Eini user avatar by
Oren Eini
·
Sep. 28, 14 · Interview
Like (0)
Save
Tweet
Share
4.96K Views

Join the DZone community and get the full member experience.

Join For Free

We talked a lot about the changes we made for indexing, now let us talk about the kind of changes we are talking about from the query side of things. More precisely, this is when we start asking questions about our queries.

Timing queries. While it is rare that we have slow queries in RavenDB, it does happen, and when it does, we treat it very seriously. However, in the last few cases that we have seen, the actual problem wasn’t with RavenDB, it was with sending the data back to the client when we had a large result set and large number of documents.

In RavenDB 3.0, we have added the ability to get detailed statistics about what is the cost of the query in every stage of the pipeline.

RavenQueryStatistics stats;
var users = session.Query<Order>("Orders/Totals")
    .Statistics(out stats)
    .Customize(x => x.ShowTimings())
    .Where(x=>x.Company == "companies/11" || x.Employee == "employees/2")
    .ToList();

foreach (var kvp in stats.TimingsInMilliseconds)
{
    Console.WriteLine(kvp.Key + ": " + kvp.Value);
}

Console.WriteLine("Total: " + stats.DurationMilliseconds);

We can now ask RavenDB to explain us its reasoning when doing so:

  • Lucene search: 10
  • Loading documents: 2
  • Transforming results: 0
  • Total: 21

As you can see, the total time for this query is 21 ms, and we have 12 ms accounted for in the actual search time. The rest is network traffic.  This can help you diagnose more easily where the problem is, and hence, how to solve it.

Query timeout and cancellation. As I mentioned, we don’t really have long queries in RavenDB very often. But that is actually is something that happens, and we need a way to deal with that. RavenDB now places a timeout on the amount of time a query gets to run (including querying Lucene, loading documents or transforming the results). A query that doesn’t complete in time will be cancelled, and an error will be returned to the user.

You can also view the currently executing queries and kill a long running query (if you have specified a high timeout, for example).

Explaining queries. Sometimes it is easy to understand why RavenDB has decided to give you documents in a certain order. You asked them sorted by date, and you get them sorted by date. But when you are talking about complex queries, that is much harder. RavenDB will sort the results by default based on relevancy, and that can sometimes be a bit puzzling to understand.

Here is how we can do this:

session.Advanced.DocumentQuery<Order>("Orders/Totals")
                    .Statistics(out stats)
                    .WhereEquals("Company", "companies/11")
                    .WhereEquals("Employee", "employees/3")
                    .ExplainScores()
                    .ToList();

var explanation = stats.ScoreExplantaions["orders/759"];

The result of this would be something that looks like this:

0.6807194 = (MATCH) product of:
  1.361439 = (MATCH) sum of:
    1.361439 = (MATCH) weight(Employee:employees/3 in 469), product of:
      0.4744689 = queryWeight(Employee:employees/3), product of:
        2.869395 = idf(docFreq=127, maxDocs=830)
        0.165355 = queryNorm
      2.869395 = (MATCH) fieldWeight(Employee:employees/3 in 469), product of:
        1 = tf(termFreq(Employee:employees/3)=1)
        2.869395 = idf(docFreq=127, maxDocs=830)
        1 = fieldNorm(field=Employee, doc=469)
  0.5 = coord(1/2)

And if we were to ask for the explanation for orders/237, we will get:

6.047595 = (MATCH) sum of:
  4.686156 = (MATCH) weight(Company:companies/11 in 236), product of:
    0.8802723 = queryWeight(Company:companies/11), product of:
      5.32353 = idf(docFreq=10, maxDocs=830)
      0.165355 = queryNorm
    5.32353 = (MATCH) fieldWeight(Company:companies/11 in 236), product of:
      1 = tf(termFreq(Company:companies/11)=1)
      5.32353 = idf(docFreq=10, maxDocs=830)
      1 = fieldNorm(field=Company, doc=236)
  1.361439 = (MATCH) weight(Employee:employees/3 in 236), product of:
    0.4744689 = queryWeight(Employee:employees/3), product of:
      2.869395 = idf(docFreq=127, maxDocs=830)
      0.165355 = queryNorm
    2.869395 = (MATCH) fieldWeight(Employee:employees/3 in 236), product of:
      1 = tf(termFreq(Employee:employees/3)=1)
      2.869395 = idf(docFreq=127, maxDocs=830)
      1 = fieldNorm(field=Employee, doc=236)

In other words, we can see that orders/237 is ranked much higher than orders/759. That is because is matched both clauses of the query. And a match on Company is a much stronger indication for relevancy because Companies/11 appears only in 10 documents out out 830, while Employees/3 appears in 127 out of 830.

For details about this format, see this presentation, it actually talks about Solr here, but this data comes from Lucene, so it applies to both.

That is it about queries diagnostics, next, we’ll deal with transformers and another important optimization, the staleness reduction system.


Database

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Trending

  • Developers Are Scaling Faster Than Ever: Here’s How Security Can Keep Up
  • The Native Way To Configure Path Aliases in Frontend Projects
  • What Is JHipster?
  • Multi-Stream Joins With SQL

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com

Let's be friends: