
Auto_Explain: How to Log Slow Postgres Query Plans Automatically

Automatically log slow query plans with auto_explain and learn how to fix the problematic PostgreSQL queries that are slowing you down.

By Anton Bjorkman · Nov. 07, 21 · Tutorial


Do you want to know why a PostgreSQL query is slow? Then EXPLAIN ANALYZE is a great starting point. But query plans can depend on other server activity, can take a while to run, and can change over time, so if you want to see the actual execution plans of your slowest queries, auto_explain is the tool you need. In this post, we’ll look into what it does, how to configure it, and how to use those logs to speed up your queries.

What Is Auto_Explain?

Auto_explain is a PostgreSQL extension that allows you to log the query plans for queries slower than a (configurable) threshold. This is incredibly useful for debugging slow queries, especially those that are only sometimes problematic. It is one of the contrib modules, so it can be installed and configured easily on regular PostgreSQL.

Huge thanks to Takahiro Itagaki, the main author behind the first version of auto_explain (commit, thread), Dean Rasheed, whose initial patch and suggestion it was based on, and the many contributors and reviewers since.

Which Auto_Explain Parameters Should I Use?

We'll discuss the most important parameters here, but please do review the table further down, or the official documentation, for more information on the full array of things you can track.

The most important parameter for auto_explain is log_min_duration. By default this is set to -1, which means nothing will be logged – so if we want some logs we need to change it! The default unit is ms, so a setting of 100 will log the query plans for all queries that exceed 100ms. If for some reason you want to log the query plan for every query, you can set this to 0 – but beware, this can have severe performance implications.
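
As a minimal sketch, here is one way to set the threshold, assuming the module is already preloaded (see the installation section below) so the parameter is recognized. The 100ms value is illustrative, not a recommendation; pick one that suits your workload:

-- Log the plan of any query slower than 100ms,
-- then reload the configuration so the change takes effect.
ALTER SYSTEM SET auto_explain.log_min_duration = 100;
SELECT pg_reload_conf();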

Since the queries are already being executed on the server, you probably also want to enable auto_explain.log_analyze. This makes the output equivalent to running EXPLAIN ANALYZE. By default, it also means that per-operation timings are tracked. This comes with some additional overhead, which can be minimized by turning off auto_explain.log_timing (on by default with log_analyze). Naturally, though, per-operation timings are very useful when debugging slow queries! Our internal testing showed acceptable overheads for this, but as always, please test your workload to see if the overhead is acceptable in your case. There is currently limited publicly available information on this topic, but a recent post by the pgMustard team showed that, at least on a small transactional workload, the overhead can be as low as two percent. As they noted, this could be reduced with the auto_explain.sample_rate parameter, at the cost of only tracking a subset of your queries.
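
As an illustration, combining these in postgresql.conf might look like the following (the values are examples only, not recommendations):

auto_explain.log_analyze = true
# Per-operation timings are on by default with log_analyze;
# set this to false to reduce the overhead:
auto_explain.log_timing = true
# Only explain roughly 10% of eligible queries:
auto_explain.sample_rate = 0.1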

The final parameter we’ll discuss in a bit of detail is auto_explain.log_format. The default output is TEXT, which is likely what you’re most familiar with from using EXPLAIN.

Here is a simple example of what auto_explain output in TEXT format can look like:

 
2021-09-10 15:32:04.606 BST [22770] LOG:  duration: 3184.383 ms  plan:
    Query Text: select * from table1 order by column1;
    Sort  (cost=12875.92..13125.92 rows=100000 width=37) (actual time=2703.799..3055.401 rows=100000 loops=1)
      Sort Key: column1
      Sort Method: external merge  Disk: 4696kB
      Buffers: shared hit=837, temp read=587 written=589
      ->  Seq Scan on table1  (cost=0.00..1834.01 rows=100000 width=37) (actual time=0.033..27.795 rows=100000 loops=1)
            Buffers: shared hit=834

You can see here that you get the query duration at the beginning, which you may be used to seeing at the end of query plans. You’ll also see the query text, including any parameters.

The popular visualization tools explain.depesz and explain.dalibo both accept query plans in TEXT format, but they also both support JSON format. If some of your team prefer to use tools like PEV and pgMustard, which only support the JSON format, you might wish to set that as the format.

Here is a full list of the auto_explain parameters, and their defaults:

Parameter                             PostgreSQL default
auto_explain.log_min_duration         -1
auto_explain.log_analyze              Off
auto_explain.log_timing               On (with log_analyze)
auto_explain.log_buffers              Off
auto_explain.log_verbose              Off
auto_explain.log_triggers             Off
auto_explain.log_nested_statements    Off
auto_explain.log_settings (v12+)      Off
auto_explain.log_wal (v13+)           Off
auto_explain.log_format               TEXT
auto_explain.log_level                LOG
auto_explain.sample_rate              1

Installing Auto_Explain

On vanilla PostgreSQL, you can install auto_explain simply by adding it to one of session_preload_libraries or shared_preload_libraries. The former has the advantages of a) not requiring a restart (but it’ll only be loaded in new sessions) and b) making it possible to enable it only for some users (by setting this parameter with ALTER ROLE SET).
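
For example, to enable auto_explain for just one role (the role name here is hypothetical), something like this should work:

LOAD 'auto_explain';  -- make the auto_explain parameters recognized in this session
ALTER ROLE reporting_user SET session_preload_libraries = 'auto_explain';
ALTER ROLE reporting_user SET auto_explain.log_min_duration = 100;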

As such, a basic config set-up for auto_explain could look like this:

session_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 100
auto_explain.log_analyze = true
auto_explain.log_buffers = true
auto_explain.log_format = JSON

If you're using a managed hosting provider, it's worth checking whether they support auto_explain.

Loading Auto_Explain Into a Single Session

If you don’t want auto_explain running in sessions automatically, as a superuser you also have the option of loading it into a single session:

LOAD 'auto_explain'; 

This can be incredibly useful for one-off debugging sessions but naturally is unnecessary if you are able to have it running already.
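
A sketch of such a one-off debugging session, reusing the example query from earlier, might look like this:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;  -- log every plan in this session
SET auto_explain.log_analyze = true;
select * from table1 order by column1;  -- its plan now appears in the server log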

Auto_Explain Limitations and Gotchas

We have mentioned some of these in passing already, but it seems a sensible time to remind ourselves of some of the downsides and limitations of auto_explain.

Firstly, especially when tracking per-operation timings, there can be measurable overhead to using auto_explain. It can be low, even with timings being measured, but as ever it’s worth doing your own testing.

Secondly, auto_explain timings are exclusive of query planning time. Planning time is often tiny on slow queries, but in exceptional cases, it can be responsible for the majority of the issue. As such, bear in mind that these cases may not show in your logs, or if they do, a discrepancy with what you’re seeing in total latency might be to do with the planning time. A manual EXPLAIN ANALYZE will quickly help you spot this.
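
For example, running the query manually will report planning and execution time separately:

EXPLAIN (ANALYZE, BUFFERS) select * from table1 order by column1;
-- The output ends with separate "Planning Time" and "Execution Time"
-- lines, making it easy to spot when planning dominates.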

How to Use the Explain Output To Speed up Queries

Once you have the explain output for your slowest queries, you can now start looking into speeding them up!

First, you'll need to get the query plans out of the logs; pgBadger can help with this if you're not using a managed service that does it for you.
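
A typical pgBadger invocation might look like the following (the log path is an assumption; point it at wherever your server actually writes its logs):

# Generate an HTML report from the PostgreSQL log files
pgbadger /var/log/postgresql/postgresql-*.log -o report.html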

Reviewing EXPLAIN plans is a huge topic in its own right. The PostgreSQL documentation includes a good but brief introduction to using EXPLAIN, and Thoughtbot's article on reading EXPLAIN ANALYZE is a good next step. If you'd prefer an hour-long talk, EXPLAIN Explained by Josh Berkus is excellent. For more information, Depesz has a series of posts called Explaining the unexplainable, and the pgMustard team has a fairly comprehensive EXPLAIN Glossary.

We hope this gives you everything you need to get started with auto_explain and start speeding up any slow queries you have. If there’s anything else you’d like to know, do get in touch.

Published at DZone with permission of Anton Bjorkman. See the original article here.
