Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Tuning PostgreSQL Autovacuum to Prevent Table Bloat

DZone's Guide to

Tuning PostgreSQL Autovacuum to Prevent Table Bloat

Table bloat is fairly common in PostgreSQL, but with just some careful analysis and tweaking, you can keep your tables bloat-free.

· Database Zone ·
Free Resource

Slow SQL Server? These SentryOne resources share tips and tricks for not only troubleshooting SQL Server performance issues, but also preventing them before they hit your production environment.

If you're a PostgreSQL database administrator, you know that "table bloat" in your database can be a real issue. But what causes bloat? In PostgreSQL, the culprit is often Multi-Version Concurrency Control or (MVCC).

PostgreSQL's implementation of MVCC ensures that a transaction against a database will return only data that's been committed, in a snapshot, even if other processes are trying to modify that data. Imagine a database with millions of large tuples (commonly called "rows") in a table. If the table has many indices and constant read/writes happening, this could lead to huge amounts of bloat and ultimately performance issues. One crucial process to help prevent bloat is autovacuum.

At a high level, vacuuming is used to free up dead tuples in a table so they can be reused. It also helps you avoid transaction ID wraparound and improves statistics updates and query optimizations.

Let me walk you through a full use case and show a real example of how we use autovacuum settings to control bloat in our databases at New Relic.

A Quick Note About Transactions

In order for Postgres to know which transaction data should be in the result set for your query, the snapshot makes note of transaction information.

Every transaction (for example, SELECT, UPDATE, INSERT, and BEGIN) are assigned an incremental transaction ID (XID). These transaction IDs — along with some other processes and logic — are used to determine which tuples are visible to your specific transaction.

Essentially, if your transaction ID is x, you will see data with all transaction IDs leading up to x.

Creating the Table for Our Example

Let's start by creating a simple table for our example, called newrelic:

test_database=# CREATE TABLE newrelic ( col1 int );
CREATE TABLE

Now, let's insert some tuples. Remember, each insert is a transaction:

test_database=# INSERT INTO newrelic VALUES (1);
INSERT 0 1
test_database=# INSERT INTO newrelic VALUES (2);
INSERT 0 1
test_database=# INSERT INTO newrelic VALUES (3);
INSERT 0 1
test_database=# INSERT INTO newrelic VALUES (4);
INSERT 0 1
test_database=# INSERT INTO newrelic VALUES (5);
INSERT 0 1

You can wrap multiple inserts into a single transaction with BEGIN and COMMIT:

test_database=# BEGIN;
BEGIN
test_database=# INSERT INTO newrelic SELECT generate_series(6,10);
INSERT 0 5
test_database=# commit;
COMMIT

Here we can see the ten tuples we inserted into the table, along with some hidden system columns:

test_database=# SELECT xmin, xmax, * FROM newrelic;

xmin | xmax | col1
-----+------+------
1261 |    0 |    1
1262 |    0 |    2
1263 |    0 |    3
1264 |    0 |    4
1265 |    0 |    5
1266 |    0 |    6
1266 |    0 |    7
1266 |    0 |    8
1266 |    0 |    9
1266 |    0 |   10

As you can see, values 1 through 5 (in the col1 column) have unique transaction IDs (represented in the xmin column)-they were the result of individual INSERT statements, made one after the other. The tuples with values of 6 through 10 share the same transaction ID of 1266-they were all part of the one transaction we created with the BEGIN and COMMIT statements.

Ok. But, what the heck does this have to do with autovacuum? Just hang with me.

How the Table Bloats

In PostgreSQL database, the heap is a file containing a list of variable sized records, in no particular order, that points to the location of a row within a page. The pointer to the location is called the CTID.

To view the heap without having to read the raw data from the file, we need to create the following extension inside of our database:

CREATE extension pageinspect;

Now, we can inspect the heap for our newly created table and tuples:

SELECT * FROM heap_page_items(get_raw_page('newrelic', 0));

 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 |   1261 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 |   1262 |      0 |        0 | (0,2)  |           1 |       2304 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 |   1263 |      0 |        0 | (0,3)  |           1 |       2304 |     24 |        |       | \x03000000
  4 |   8064 |        1 |     28 |   1264 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |   8032 |        1 |     28 |   1265 |      0 |        0 | (0,5)  |           1 |       2304 |     24 |        |       | \x05000000
  6 |   8000 |        1 |     28 |   1266 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
  7 |   7968 |        1 |     28 |   1266 |      0 |        0 | (0,7)  |           1 |       2304 |     24 |        |       | \x07000000
  8 |   7936 |        1 |     28 |   1266 |      0 |        0 | (0,8)  |           1 |       2304 |     24 |        |       | \x08000000
  9 |   7904 |        1 |     28 |   1266 |      0 |        0 | (0,9)  |           1 |       2304 |     24 |        |       | \x09000000
 10 |   7872 |        1 |     28 |   1266 |      0 |        0 | (0,10) |           1 |       2304 |     24 |        |       | \x0a000000

The table above shows 10 entries with a few columns:

  • lp is the ID of the row/tuple
  • t_xmin is the transaction ID
  • t_ctid is the pointer
  • t_data is the actual data

Clearly, our pointer for each row is pointing to itself as determined by the form (page,tupleid). Pretty straightforward.

Now, let's perform an update transaction on the table; we'll change the row with the value of 5 to 20, then 30, then back to 5:

test_database=# UPDATE newrelic SET col1 = 20 WHERE col1 = 5;
UPDATE 1

test_database=# UPDATE newrelic SET col1 = 30 WHERE col1 = 20;
UPDATE 1

test_database=# UPDATE newrelic SET col1 = 5 WHERE col1 = 30;
UPDATE 1

These three changes took place under three different transactions. What has this done to our table? We changed the values for a column 3 times but never added or deleted any tuples. So we should still have 10 tuples, right? Let's count:

Count
-------
10
(1 row)

Ok, but what did this do to the heap?

test_database=# SELECT * FROM heap_page_items(get_raw_page('newrelic', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 |   1261 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 |   1262 |      0 |        0 | (0,2)  |           1 |       2304 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 |   1263 |      0 |        0 | (0,3)  |           1 |       2304 |     24 |        |       | \x03000000
  4 |   8064 |        1 |     28 |   1264 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |   8032 |        1 |     28 |   1265 |   1267 |        0 | (0,11) |       16385 |       1280 |     24 |        |       | \x05000000
  6 |   8000 |        1 |     28 |   1266 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
  7 |   7968 |        1 |     28 |   1266 |      0 |        0 | (0,7)  |           1 |       2304 |     24 |        |       | \x07000000
  8 |   7936 |        1 |     28 |   1266 |      0 |        0 | (0,8)  |           1 |       2304 |     24 |        |       | \x08000000
  9 |   7904 |        1 |     28 |   1266 |      0 |        0 | (0,9)  |           1 |       2304 |     24 |        |       | \x09000000
 10 |   7872 |        1 |     28 |   1266 |      0 |        0 | (0,10) |           1 |       2304 |     24 |        |       | \x0a000000
 11 |   7840 |        1 |     28 |   1267 |   1268 |        0 | (0,12) |       49153 |       9472 |     24 |        |       | \x14000000
 12 |   7808 |        1 |     28 |   1268 |   1269 |        0 | (0,13) |       49153 |       9472 |     24 |        |       | \x1e000000
 13 |   7776 |        1 |     28 |   1269 |      0 |        0 | (0,13) |       32769 |      10496 |     24 |        |       | \x05000000

Where I thought I had ten tuples, I now have 13?

Let's examine our three separate update transactions (1267, 1268, and 1269) to see what's happening with the heap:

t_xmin (1267


UPDATE newrelic SET col1 = 20 WHERE col1 = 5;
  • Logically DELETE tuple ID 5
  • Physically INSERT tuple ID 11
  • UPDATE tuple ID 5 pointer (t_tcid) to point to tuple ID 11

Tuple ID 5 becomes a dead row when its t_xmax gets set to the new transaction ID initiated by 1267.

t_xmin (1268)

  • UPDATE newrelicSET col1 = 30 WHERE col1 = 20;
  • Logically DELETE tuple ID 11
  • Physically INSERT tuple ID 12
  • UPDATE tuple ID 11 pointer (t_tcid) to point to tuple ID 12

Once again, Tuple ID 11 becomes a dead row when its t_xmax gets set to the new transaction ID initiated by 1268.

t_xmin (1269)

  • UPDATE newrelic SET col1 = 5 WHERE col1 = 30;
  • Logically DELETE tuple ID 12
  • Physically INSERT tuple ID 13
  • UPDATE tuple ID 12 pointer (t_tcid) to point to tuple ID 13

Tuple ID 13 is live and visible to other transactions. It has no t_xmax and the t_ctid(0,13) points to itself.

We have not really added or deleted tuples in our table. We still see 10 in the count, but our heap has increased by three.

At a very high level, this is how PostgreSQL implements MVCC and why we have table bloat in our heap. In essence, changes to data result in a new row reflecting the latest state of the data. The old tuples need to be cleaned or reused for efficiency.

Vacuuming the Table

The best way to solve table bloat is to use PostgreSQL's vaccuum function. So, lets manually vacuum our test table and see what happens:

test_database=# VACUUM newrelic;
VACUUM

Now, let's look at our heap again:

test_database=# SELECT * FROM heap_page_items(get_raw_page('newrelic', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 |   1261 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 |   1262 |      0 |        0 | (0,2)  |           1 |       2304 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 |   1263 |      0 |        0 | (0,3)  |           1 |       2304 |     24 |        |       | \x03000000
  4 |   8064 |        1 |     28 |   1264 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |   8032 |        1 |     28 |   1265 |   1267 |        0 | (0,11) |       16385 |       1280 |     24 |        |       | \x05000000
  6 |   8000 |        1 |     28 |   1266 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
  7 |   7968 |        1 |     28 |   1266 |      0 |        0 | (0,7)  |           1 |       2304 |     24 |        |       | \x07000000
  8 |   7936 |        1 |     28 |   1266 |      0 |        0 | (0,8)  |           1 |       2304 |     24 |        |       | \x08000000
  9 |   7904 |        1 |     28 |   1266 |      0 |        0 | (0,9)  |           1 |       2304 |     24 |        |       | \x09000000
 10 |   7872 |        1 |     28 |   1266 |      0 |        0 | (0,10) |           1 |       2304 |     24 |        |       | \x0a000000
 11 |   7840 |        1 |     28 |   1267 |   1268 |        0 | (0,12) |       49153 |       9472 |     24 |        |       | \x14000000
 12 |   7808 |        1 |     28 |   1268 |   1269 |        0 | (0,13) |       49153 |       9472 |     24 |        |       | \x1e000000
 13 |   7776 |        1 |     28 |   1269 |      0 |        0 | (0,13) |       32769 |      10496 |     24 |        |       | \x05000000

After vacuuming, tuples 5, 11, and 12 are now freed up for reuse. So, let's insert another tuple, with the value of 11 and see what happens:

test_database=# INSERT INTO newrelic VALUES(11);
INSERT 0 1

Now let's look at the heap again:

test_database=# SELECT * FROM heap_page_items(get_raw_page('newrelic', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |   t_data   
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+------------
  1 |   8160 |        1 |     28 |   1261 |      0 |        0 | (0,1)  |           1 |       2304 |     24 |        |       | \x01000000
  2 |   8128 |        1 |     28 |   1262 |      0 |        0 | (0,2)  |           1 |       2304 |     24 |        |       | \x02000000
  3 |   8096 |        1 |     28 |   1263 |      0 |        0 | (0,3)  |           1 |       2304 |     24 |        |       | \x03000000
  4 |   8064 |        1 |     28 |   1264 |      0 |        0 | (0,4)  |           1 |       2304 |     24 |        |       | \x04000000
  5 |     13 |        2 |      0 |        |        |          |        |             |            |        |        |       | 
  6 |   8032 |        1 |     28 |   1266 |      0 |        0 | (0,6)  |           1 |       2304 |     24 |        |       | \x06000000
  7 |   8000 |        1 |     28 |   1266 |      0 |        0 | (0,7)  |           1 |       2304 |     24 |        |       | \x07000000
  8 |   7968 |        1 |     28 |   1266 |      0 |        0 | (0,8)  |           1 |       2304 |     24 |        |       | \x08000000
  9 |   7936 |        1 |     28 |   1266 |      0 |        0 | (0,9)  |           1 |       2304 |     24 |        |       | \x09000000
 10 |   7904 |        1 |     28 |   1266 |      0 |        0 | (0,10) |           1 |       2304 |     24 |        |       | \x0a000000
 11 |   7840 |        1 |     28 |   1270 |      0 |        0 | (0,11) |           1 |       2048 |     24 |        |       | \x0b000000
 12 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |       | 
 13 |   7872 |        1 |     28 |   1269 |      0 |        0 | (0,13) |       32769 |      10496 |     24 |        |       | \x05000000

Our new tuple (with transaction ID 1270) reused tuple 11, and now the tuple 11 pointer (0,11) is pointing to itself. This explains why vacuum or autovacuum is so important. Instead of growing the heap, it reused space previously occupied by a dead tuple.

How to Fine-Tune Autovacuum

Autovacuum is PostgreSQL deciding when to vacuum your tables for you so you don't need to worry about it. It's a setting you can tune however you want.

For example, the New Relic database team had a legacy database with pretty slow performance, and we weren't vacuuming the database as often as we should have been.

In fact, autovacuum and analyze functions weren't even happening as often we wanted. There were two parameters in our postgresql.conf that were set to the default values:

  • autovacuum_analyze_scale_factor = 0.1;
  • autovacuum_vacuum_scale_factor = 0.2;

The default values were far too small for the number of transactions our database was processing. For example, a key table in this database would not get autovacuumed until 11,773,933 tuples were updated or deleted:

relid  |        relname         | n_tup_ins | n_tup_del | n_tup_upd | n_dead_tup | reltuples 

--------+------------------------+-----------+-----------+-----------+------------+-----------


  17548 | events                 |  18322861 |         0 |  42930018 |     356986 |  58869416javascript:void(0)

This is how we reach the above number:

  • When ( 50 + ( .2 * n tuples ) ) were updated or deleted, PostgreSQL would trigger a vacuum.
  • When ( 50 + ( .1 * n tuples ) ) were inserted, updated, or deleted, PostgreSQL would trigger an analyze.

We wanted to be more aggressive here, so we made the following changes in postgresql.conf:

  • autovacuum_analyze_scale_factor = 0.002;
  • autovacuum_vacuum_scale_factor = 0.001;

These quick parameter changes made sure PostgreSQL ran its autovacuum and analyze functions much more often.

  • When ( 50 + ( .2 * n tuples ) ) were updated or deleted, PostgreSQL would trigger a vacuum.
  • When ( 50 + ( .1 * n tuples ) ) were inserted, updated, or deleted, PostgreSQL would trigger an analyze.

We wanted to be more aggressive here, so we made the following changes in postgresql.conf:

  • autovacuum_analyze_scale_factor = 0.002;
  • autovacuum_vacuum_scale_factor = 0.001;

These quick parameter changes made sure PostgreSQL ran its autovacuum and analyze functions much more often.

Conclusion

Table bloat is fairly common in PostgreSQL, but with just some careful analysis and tweaking, you can keep your tables bloat-free. If you have a database that seems to be missing its performance marks, take a look at how often you're running the autovacuum and analyze functions — those settings may be all you need to tweak.

Database monitoring tools letting you down? See how SentryOne empowers Enterprises to go faster.

Topics:
database ,tutorial ,autovacuum ,postgresql ,transactions

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}