DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
View Events Video Library
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Mobile Database Essentials: Assess data needs, storage requirements, and more when leveraging databases for cloud and edge applications.

Full-Stack Observability Essentials: Explore the fundamentals of system-wide observability and key components of the OpenTelemetry standard.

Monitoring and Observability for LLMs: Datadog and Google Cloud discuss how to achieve optimal AI model performance.

Automated Testing: The latest on architecture, TDD, and the benefits of AI and low-code tools.

Related

  • A Better Web3 Experience: Account Abstraction From Flow (Part 2)
  • Training a Handwritten Digits Classifier in Pytorch With Apache Cassandra Database
  • Mastering Persistence: Why the Persistence Layer Is Crucial for Modern Java Applications
  • .NET Performance Optimization Techniques for Expert Developers

Trending

  • TDD With FastAPI Is Easy
  • Demystifying Enterprise Integration Patterns: Bridging the Gap Between Systems
  • Automated Testing Lifecycle
  • Selecting the Right Automated Tests
  1. DZone
  2. Data Engineering
  3. Databases
  4. What is New in RavenDB 3.0: Query Diagnostics

What is New in RavenDB 3.0: Query Diagnostics

Oren Eini user avatar by
Oren Eini
·
Sep. 28, 14 · Interview
Like (0)
Save
Tweet
Share
5.03K Views

Join the DZone community and get the full member experience.

Join For Free

We talked a lot about the changes we made for indexing, now let us talk about the kind of changes we are talking about from the query side of things. More precisely, this is when we start asking questions about our queries.

Timing queries. While it is rare that we have slow queries in RavenDB, it does happen, and when it does, we treat it very seriously. However, in the last few cases that we have seen, the actual problem wasn’t with RavenDB, it was with sending the data back to the client when we had a large result set and large number of documents.

In RavenDB 3.0, we have added the ability to get detailed statistics about what is the cost of the query in every stage of the pipeline.

RavenQueryStatistics stats;
var users = session.Query<Order>("Orders/Totals")
    .Statistics(out stats)
    .Customize(x => x.ShowTimings())
    .Where(x=>x.Company == "companies/11" || x.Employee == "employees/2")
    .ToList();

foreach (var kvp in stats.TimingsInMilliseconds)
{
    Console.WriteLine(kvp.Key + ": " + kvp.Value);
}

Console.WriteLine("Total: " + stats.DurationMilliseconds);

We can now ask RavenDB to explain us its reasoning when doing so:

  • Lucene search: 10
  • Loading documents: 2
  • Transforming results: 0
  • Total: 21

As you can see, the total time for this query is 21 ms, and we have 12 ms accounted for in the actual search time. The rest is network traffic.  This can help you diagnose more easily where the problem is, and hence, how to solve it.

Query timeout and cancellation. As I mentioned, we don’t really have long queries in RavenDB very often. But that is actually is something that happens, and we need a way to deal with that. RavenDB now places a timeout on the amount of time a query gets to run (including querying Lucene, loading documents or transforming the results). A query that doesn’t complete in time will be cancelled, and an error will be returned to the user.

You can also view the currently executing queries and kill a long running query (if you have specified a high timeout, for example).

Explaining queries. Sometimes it is easy to understand why RavenDB has decided to give you documents in a certain order. You asked them sorted by date, and you get them sorted by date. But when you are talking about complex queries, that is much harder. RavenDB will sort the results by default based on relevancy, and that can sometimes be a bit puzzling to understand.

Here is how we can do this:

session.Advanced.DocumentQuery<Order>("Orders/Totals")
                    .Statistics(out stats)
                    .WhereEquals("Company", "companies/11")
                    .WhereEquals("Employee", "employees/3")
                    .ExplainScores()
                    .ToList();

var explanation = stats.ScoreExplantaions["orders/759"];

The result of this would be something that looks like this:

0.6807194 = (MATCH) product of:
  1.361439 = (MATCH) sum of:
    1.361439 = (MATCH) weight(Employee:employees/3 in 469), product of:
      0.4744689 = queryWeight(Employee:employees/3), product of:
        2.869395 = idf(docFreq=127, maxDocs=830)
        0.165355 = queryNorm
      2.869395 = (MATCH) fieldWeight(Employee:employees/3 in 469), product of:
        1 = tf(termFreq(Employee:employees/3)=1)
        2.869395 = idf(docFreq=127, maxDocs=830)
        1 = fieldNorm(field=Employee, doc=469)
  0.5 = coord(1/2)

And if we were to ask for the explanation for orders/237, we will get:

6.047595 = (MATCH) sum of:
  4.686156 = (MATCH) weight(Company:companies/11 in 236), product of:
    0.8802723 = queryWeight(Company:companies/11), product of:
      5.32353 = idf(docFreq=10, maxDocs=830)
      0.165355 = queryNorm
    5.32353 = (MATCH) fieldWeight(Company:companies/11 in 236), product of:
      1 = tf(termFreq(Company:companies/11)=1)
      5.32353 = idf(docFreq=10, maxDocs=830)
      1 = fieldNorm(field=Company, doc=236)
  1.361439 = (MATCH) weight(Employee:employees/3 in 236), product of:
    0.4744689 = queryWeight(Employee:employees/3), product of:
      2.869395 = idf(docFreq=127, maxDocs=830)
      0.165355 = queryNorm
    2.869395 = (MATCH) fieldWeight(Employee:employees/3 in 236), product of:
      1 = tf(termFreq(Employee:employees/3)=1)
      2.869395 = idf(docFreq=127, maxDocs=830)
      1 = fieldNorm(field=Employee, doc=236)

In other words, we can see that orders/237 is ranked much higher than orders/759. That is because is matched both clauses of the query. And a match on Company is a much stronger indication for relevancy because Companies/11 appears only in 10 documents out out 830, while Employees/3 appears in 127 out of 830.

For details about this format, see this presentation, it actually talks about Solr here, but this data comes from Lucene, so it applies to both.

That is it about queries diagnostics, next, we’ll deal with transformers and another important optimization, the staleness reduction system.


Database

Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • A Better Web3 Experience: Account Abstraction From Flow (Part 2)
  • Training a Handwritten Digits Classifier in Pytorch With Apache Cassandra Database
  • Mastering Persistence: Why the Persistence Layer Is Crucial for Modern Java Applications
  • .NET Performance Optimization Techniques for Expert Developers

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: