Configuring Memory for Postgres

If you're working with Postgres, ensuring it's configured correctly is important, especially when it comes to memory. Read on for some advice and a gotcha.

By Craig Kerstiens · Jun. 14, 18

work_mem is perhaps the most confusing setting within Postgres: it determines how much memory can be used during certain operations. On the surface, the setting seems simple: after all, work_mem just specifies the amount of memory available to internal sort operations and hash tables before Postgres writes data to disk. And yet, leaving work_mem unconfigured can bring on a host of issues. What is perhaps more troubling, though, is when you receive an out-of-memory error on your database and you jump in to tune work_mem, only for it to behave in an unintuitive manner.

Setting Your Default Memory

The work_mem value defaults to 4 MB in Postgres, and that's likely a bit low. This means that each Postgres activity (each join, some sorts, etc.) can consume 4 MB before it starts spilling to disk. When Postgres starts writing temp files to disk, things will obviously be much slower than operating in memory. You can find out whether you're spilling to disk by searching for "temporary file" within your PostgreSQL logs when you have log_temp_files enabled. If you see "temporary file," it can be worth increasing your work_mem.
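
If you want to see every spill, log_temp_files can be set so that all temporary files are logged. Below is a minimal sketch of doing that from a superuser session; the log line shown is only an illustration of the shape to search for.

  -- Log every temporary file Postgres creates (0 logs all; a positive value
  -- is a size threshold in kB; -1, the default, disables this logging).
  ALTER SYSTEM SET log_temp_files = 0;
  SELECT pg_reload_conf();
  -- Spills then appear in the logs looking roughly like:
  --   LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp12345.0", size 140353536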

On Citus Cloud (our fully-managed database as a service that scales out Postgres horizontally), we automatically tune work_mem based on the overall memory available to the box. Our tuning is based on years of experience seeing what works for a variety of production Postgres workloads, coupled with statistics to compute variations based on cluster sizing.

It’s tough to get the work_mem value exactly right, but often a sane default is something like 64 MB if you’re looking for a one-size-fits-all answer.
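
As a rough sketch, that kind of cluster-wide default could be applied like this (adjust the value to your own box; a reload is enough, since work_mem doesn't require a restart):

  ALTER SYSTEM SET work_mem = '64MB';
  SELECT pg_reload_conf();
  SHOW work_mem;   -- confirm the new default; individual sessions can still override it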

It’s Not Just About the Memory for Queries

Let’s use an example to explore how to think about optimizing your work_mem setting.

Say you have a certain amount of memory, say 10 GB. If you have 100 running Postgres queries, and each of those queries has a 10 MB connection overhead, then 100 * 10 MB (1 GB) of memory is taken up by the 100 connections, which leaves you with 9 GB of memory.

With 9 GB of memory remaining, say you give 90 MB to work_mem for the 100 running queries. But wait, it’s not that simple. Why? Well, work_mem isn’t allocated on a per-query basis; rather, each sort and hash operation within a query can use up to work_mem. But how many sorts/hashes and joins happen per query? Now that is a complicated question, and one made more complicated if you have other processes that also consume memory, such as autovacuum.
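
To make that arithmetic concrete, here is a hedged back-of-envelope version of the budget, plus a query to see how many backends are actually active. The 10 GB and 10 MB figures are just the assumptions from the example above.

  -- Assumed budget from the example (illustrative numbers only):
  --   connection overhead:    100 connections * 10 MB   = 1 GB
  --   left for query memory:  10 GB - 1 GB              = 9 GB
  --   naive per-query slice:  9 GB / 100 queries        = ~90 MB
  -- ...but each query may contain several sort/hash nodes, each of which
  -- can use up to work_mem, so the safe setting is well below 90 MB.
  SELECT count(*) AS active_backends
  FROM pg_stat_activity
  WHERE state = 'active';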

Let’s reserve a little for maintenance tasks and for vacuum, and we’ll be okay as long as we limit our connections, right? Not so fast, my friend.

Postgres now has parallel queries. If you’re using Citus for parallelism you’ve had this for a while, but now you have it on single node Postgres as well. What this means is on a single query, you can have multiple processes running and performing work. This can result in some significant improvements in speed of queries, but each of those running processes can consume the specified amount of work_mem. With our 64 MB default and 100 connections, we could now have each of those running a query per each core consuming far more memory than we anticipated.
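
Here is a hedged worst-case estimate of that exposure, using the example numbers above, along with the settings you would check to see how many parallel workers a query can use:

  -- Rough worst case (illustrative): each parallel worker gets its own work_mem
  -- per sort/hash node, so
  --   memory ~ connections * (leader + workers) * nodes per query * work_mem
  --   e.g.     100         * (1 + 2)            * 2               * 64 MB  ~ 37.5 GB
  SHOW max_parallel_workers_per_gather;   -- per-query worker limit (default 2)
  SHOW max_parallel_workers;              -- cluster-wide cap on parallel workers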

More work_mem, More Problems

So, we can see that getting it perfect is a little more work than ideal. Let’s step back and try this more simply. We can start work_mem small, at say 16 MB, and gradually increase it when we see "temporary file" in the logs. But why not give each query as much memory as it would like? If we were to just say each process could consume up to 1 GB of memory, what’s the harm? Well, the other extreme is that queries begin consuming more memory than you have available on your box. Picture 100 queries, each with 5 different sort operations and a few hash joins in them: it’s in fact very possible to exhaust all the memory available to your database.
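
Beyond the logs, EXPLAIN ANALYZE on a suspect query shows whether a sort stayed in memory or spilled. The table and query below are hypothetical, but the "Sort Method" line is the part worth looking for:

  EXPLAIN (ANALYZE, BUFFERS)
  SELECT * FROM events ORDER BY created_at;   -- hypothetical table and column
  -- A spilled sort reports something like:
  --   Sort Method: external merge  Disk: 102400kB
  -- whereas with enough work_mem it stays in memory:
  --   Sort Method: quicksort  Memory: 2048kB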

When you consume more memory than is available on your machine, you can start to see out-of-memory errors within your Postgres logs, or in worse cases, the OOM killer can start to randomly kill running processes to free up memory. An out-of-memory error in Postgres simply fails the query you’re running, whereas the OOM killer in Linux begins killing running processes, which in some cases might even include Postgres itself.

When you see an out-of-memory error, you either want to increase the overall RAM on the machine by upgrading to a larger instance, or you want to decrease the amount of memory that work_mem uses. Yes, you read that right: when you’re out of memory, it’s better to decrease work_mem rather than increase it, since work_mem is the amount of memory each sort/hash operation can consume, and too many operations are each using up to that much memory.

General Guidance for work_mem

While you can continually tune and tweak work_mem, a couple of broad guidelines for matching it to your workload can generally get you into a good spot:

  • If you have a number of short-running queries that run very frequently and perform simple lookups and joins, then maintaining a lower work_mem is ideal. In this case, you get diminishing returns by allowing it to be significantly higher because it’s simply unused.
  • If your workload consists of relatively few active queries at a time that are doing very complex sorts and joins, then granting more memory to prevent things from spilling can give you great returns (a sketch of scoping work_mem narrowly follows below).
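
One practical way to serve both cases at once is to keep the cluster-wide default low and raise work_mem only where the heavy sorts live; the role name below is hypothetical:

  -- Raise the default for a role that runs large analytical sorts/joins:
  ALTER ROLE reporting_user SET work_mem = '256MB';

  -- Or raise it only for a single heavy query, scoped to the transaction:
  BEGIN;
  SET LOCAL work_mem = '256MB';
  -- ... run the expensive sort/join here ...
  COMMIT;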

Happy Database Tuning

Postgres’ powerful feature set and flexibility mean you have a lot of knobs you can turn and levers you can pull when tuning it. Postgres is used for embedded systems, time-series data, OLTP, and OLAP alike. This flexibility can often mean an overwhelming set of options when tuning. On Citus Cloud, we’ve configured this to be suitable for most workloads we see. Think of it as one size fits most, and then, when you need to, you’re able to customize. If you’re not running on Citus Cloud, consider leveraging pgtune to help you get to a good starting point.


Published at DZone with permission of Craig Kerstiens, DZone MVB. See the original article here.

