DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • How to Build a Full-Stack App With Next.js, Prisma, Postgres, and Fastify
  • Apache Spark for the Impatient
  • Schema Change Management Tools: A Practical Overview
  • Providing Enum Consistency Between Application and Data

Trending

  • FIPS 140-3: The Security Standard That Protects Our Federal Data
  • Developers Beware: Slopsquatting and Vibe Coding Can Increase Risk of AI-Powered Attacks
  • How To Develop a Truly Performant Mobile Application in 2025: A Case for Android
  • How Large Tech Companies Architect Resilient Systems for Millions of Users
  1. DZone
  2. Data Engineering
  3. Databases
  4. What Developers Need to Know About Table Partition Maintenance

What Developers Need to Know About Table Partition Maintenance

Learn more on the importance of table partition maintenance.

By 
Denis Magda user avatar
Denis Magda
DZone Core CORE ·
Jul. 06, 22 · Analysis
Likes (2)
Comment
Save
Tweet
Share
6.0K Views

Join the DZone community and get the full member experience.

Join For Free

Table partitioning is a very convenient technique supported by several databases including MySQL, Oracle, PostgreSQL, and YugabyteDB. In the first article of this series, we discussed an application that automates the operations of a large pizza chain. We reviewed how PostgreSQL improves the application’s performance with the partition pruning feature by eliminating unnecessary partitions from the query execution plan. 

In this article, we’ll examine how PostgreSQL’s partition maintenance capabilities can further influence and simplify the architecture of your apps. We’ll again take the pizza chain app as an example, whose database schema comes with the PizzaOrders table. To remind you, the table tracks the order’s progress (table data is explained in the first article):

Pizza orders table

Now, pretend you need to separate the orders for the current, previous, and all other earlier months. So, you go ahead and partition the PizzaOrders by the OrderTime column:

Partitioned tablesAs a result, the original table gets split into three partitioned tables or partitions:

  • Orders_2022_06 - the table keeps all the orders for the current month (June 2022). Suppose that the customer-facing microservices heavily use the data from this partition.
  • Order_2022_05 - the table stores orders for the previous month (May 2022). Assume internal microservices that facilitate short-term planning regularly query this data in combination with the current month’s data.
  • OrdersOthers - the remaining historical data used by the BI tools for strategic planning.

Nice, you can partition data by time, and the database will ensure each microservice queries only the data it needs. But, wait, the current and past months are not static notions. Once the calendar page flips to July 1st, July 2020 will become the current month. But how do you reflect this change at the database level? Let’s talk about partition maintenance techniques.

Partition Maintenance

The structure of your partitions might be dynamic. Quite frequently, you might want to remove partitions holding old data and add new partitions with the new data. That’s the case with our pizza chain. And this maintenance task can be easily fulfilled at the database level with no code changes on the application side.

In PostgreSQL, partitions are regular tables that you can query or alter directly. So, whenever necessary you can use standard DDL commands to CREATE, ATTACH, DETACH, and DROP partitions. 

Creating Original Partitions

First, let’s create the original partitions that we’ve discussed above:

SQL
 
CREATE TYPE status_t AS ENUM('ordered', 'baking', 'delivering', 'yummy-in-my-tummy');

CREATE TABLE PizzaOrders
 (
   id   int,
   status   status_t,
   ordertime   timestamp
 ) PARTITION BY RANGE (ordertime);
 
CREATE TABLE orders_2022_06 PARTITION OF PizzaOrders 
  FOR VALUES FROM('2022-06-01') TO ('2022-07-01');
  
CREATE TABLE orders_2022_05 PARTITION OF PizzaOrders 
  FOR VALUES FROM('2022-05-01') TO ('2022-06-01');
  
CREATE TABLE orders_others PARTITION OF PizzaOrders DEFAULT;

The PARTITION BY RANGE (order_time) clause requests to split the PizzaOrders table using the Range Partitioning method. The resulting partitions will keep the orders based on the value of the ordertime column. For instance, if the ordertime is between '2022-06-01' (inclusive) and '2022-07-01' (exclusive), then a pizza order goes into the current month’s partition (which is orders_2022_06). The orders_others partition is the DEFAULT one as it will keep all the orders that ordertime value doesn’t fit into the range of any other partition.

Second, all the created partitions are regular tables that you can work with using DDL and DML commands. For instance, let’s load sample data and query the current month’s partitioned table directly:

SQL
 
INSERT INTO PizzaOrders VALUES 
(1, 'yummy-in-my-tummy', '2021-12-27 22:00:00'),
(2, 'yummy-in-my-tummy', '2022-05-15 13:00:00'),
(3, 'yummy-in-my-tummy', '2022-05-23 10:00:00'),
(4, 'yummy-in-my-tummy', '2022-06-23 19:00:00'),
(5, 'delivering', '2022-06-24 8:30:00'),
(6, 'baking', '2022-06-24 8:45:00'),
(7, 'baking', '2022-06-24 9:00:00'),
(8, 'ordered', '2022-06-24 10:00:00'); 

SELECT * FROM orders_2022_06 WHERE ordertime BETWEEN '2022_06_20' AND '2022_06_30';

 id |      status       |      ordertime      
----+-------------------+---------------------
  4 | yummy-in-my-tummy | 2022-06-23 19:00:00
  5 | delivering        | 2022-06-24 08:30:00
  6 | baking            | 2022-06-24 08:45:00
  7 | baking            | 2022-06-24 09:00:00
  8 | ordered           | 2022-06-24 10:00:00

It’s certainly handy that we can query partitioned tables directly. However, you don’t want your customer-facing microservices to remember the actual current month and what partition to query. Instead, the microservices will be querying the top-level PizzaOrders table and PostgreSQL will apply the partitioning pruning optimization the following way:

SQL
 
EXPLAIN ANALYZE SELECT * FROM PizzaOrders 
	WHERE ordertime BETWEEN '2022_06_20' AND '2022_06_30';
                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on orders_2022_06 pizzaorders  (cost=0.00..37.75 rows=9 width=16) (actual time=0.010..0.012 rows=5 loops=1)
   Filter: ((ordertime >= '2022-06-20 00:00:00'::timestamp without time zone) AND (ordertime <= '2022-06-30 00:00:00'::timestamp without time zone))
 Planning Time: 0.122 ms
 Execution Time: 0.040 ms

This is the same query but PostgreSQL (and not your application layer) decides which partition keeps the data. The execution plan shows that the query ran against the orders_2022_06 partition, bypassing the others.

However, this ability to work with partitioned tables directly is extremely useful when you need to change the structure of your partitions. Now, assume that tomorrow is July 1st, 2022. You need to add a new partition that will keep the orders for that new current month (July), as well as introduce a few other changes.

Detaching Old Partitions

Let’s first deal with partition orders_2022_05 that presently holds data for the “previous month” (May 2022). You do this because once July becomes the “current month”, June will become the “previous month”, according to the application logic. 

First, let’s remove the May partition from the partitions structure using the DETACH command:

SQL
 
ALTER TABLE PizzaOrders DETACH PARTITION orders_2022_05; 

Once you do this, attempt to read all the records from the PizzaOrders table to confirm there are no records left for May:

SQL
 
SELECT * FROM PizzaOrders;
 id |      status       |      ordertime      
----+-------------------+---------------------
  4 | yummy-in-my-tummy | 2022-06-23 19:00:00
  5 | delivering        | 2022-06-24 08:30:00
  6 | baking            | 2022-06-24 08:45:00
  7 | baking            | 2022-06-24 09:00:00
  8 | ordered           | 2022-06-24 10:00:00
  1 | yummy-in-my-tummy | 2021-12-27 22:00:00

Don’t get scared, the data for May didn’t evaporate! The data is still in the same partitioned table that you can query directly:

SQL
 
SELECT * FROM orders_2022_05;

 id |      status       |      ordertime      
----+-------------------+---------------------
  2 | yummy-in-my-tummy | 2022-05-15 13:00:00
  3 | yummy-in-my-tummy | 2022-05-23 10:00:00

Next, do you remember that we have the orders_others partition that keeps all the orders that the ordertime column doesn’t fit into the ranges of other partitions? Now go ahead and put the records for May there. You can do this by inserting the data into the top-level PizzaOrders table and letting PostgreSQL arrange records across partitions:

SQL
 
 INSERT INTO PizzaOrders (id,status,ordertime) 
 	SELECT detached.id, detached.status, detached.ordertime 
  	FROM orders_2022_05 as detached;

Lastly, you can safely drop the orders_2022_05 partition because you already have a copy of the orders for May in the orders_others partition:

SQL
 
DROP TABLE orders_2022_05;

SELECT tableoid::regclass,* from PizzaOrders 
  ORDER BY id;

    tableoid    | id |      status       |      ordertime      
----------------+----+-------------------+---------------------
 orders_others  |  1 | yummy-in-my-tummy | 2021-12-27 22:00:00
 orders_others  |  2 | yummy-in-my-tummy | 2022-05-15 13:00:00
 orders_others  |  3 | yummy-in-my-tummy | 2022-05-23 10:00:00
 orders_2022_06 |  4 | yummy-in-my-tummy | 2022-06-23 19:00:00
 orders_2022_06 |  5 | delivering        | 2022-06-24 08:30:00
 orders_2022_06 |  6 | baking            | 2022-06-24 08:45:00
 orders_2022_06 |  7 | baking            | 2022-06-24 09:00:00
 orders_2022_06 |  8 | ordered           | 2022-06-24 10:00:00

Attaching New Partitions

Finally, let’s create a partition for July that’s about to become the “current month”, in accordance with the application logic. 

The most straightforward way to do this is by attaching a new partition to the PizzaOrders table:

SQL
 
CREATE TABLE orders_2022_07 PARTITION OF PizzaOrders 
  FOR VALUES FROM('2022-07-01') TO ('2022-08-01');

The name of the new partition is orders_2022_07 and it’s added to the partitions structure:

SQL
 
\d+ PizzaOrders;
                                            Partitioned table "public.pizzaorders"
  Column   |            Type             | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
-----------+-----------------------------+-----------+----------+---------+---------+-------------+--------------+-------------
 id        | integer                     |           |          |         | plain   |             |              | 
 status    | status_t                    |           |          |         | plain   |             |              | 
 ordertime | timestamp without time zone |           |          |         | plain   |             |              | 
Partition key: RANGE (ordertime)
Partitions: orders_2022_06 FOR VALUES FROM ('2022-06-01 00:00:00') TO ('2022-07-01 00:00:00'),
            orders_2022_07 FOR VALUES FROM ('2022-07-01 00:00:00') TO ('2022-08-01 00:00:00'),
            orders_others DEFAULT

Easy, isn’t it? Let’s test the changes by inserting dummy data for July 2022 and checking what partition those records belong to:

SQL
 
INSERT INTO PizzaOrders VALUES 
(9, 'ordered', '2022-07-02 10:00:00'),
(10, 'baking', '2022-07-02 9:50:00'),
(11, 'yummy-in-my-tummy', '2022-07-01 18:10:00');

SELECT tableoid::regclass,* from PizzaOrders 
  ORDER BY id;
    tableoid    | id |      status       |      ordertime      
----------------+----+-------------------+---------------------
 orders_others  |  1 | yummy-in-my-tummy | 2021-12-27 22:00:00
 orders_others  |  2 | yummy-in-my-tummy | 2022-05-15 13:00:00
 orders_others  |  3 | yummy-in-my-tummy | 2022-05-23 10:00:00
 orders_2022_06 |  4 | yummy-in-my-tummy | 2022-06-23 19:00:00
 orders_2022_06 |  5 | delivering        | 2022-06-24 08:30:00
 orders_2022_06 |  6 | baking            | 2022-06-24 08:45:00
 orders_2022_06 |  7 | baking            | 2022-06-24 09:00:00
 orders_2022_06 |  8 | ordered           | 2022-06-24 10:00:00
 orders_2022_07 |  9 | ordered           | 2022-07-02 10:00:00
 orders_2022_07 | 10 | baking            | 2022-07-02 09:50:00
 orders_2022_07 | 11 | yummy-in-my-tummy | 2022-07-01 18:10:00

Done! You could easily change the structure of the partitions by detaching the partition for May and attaching a new one for July. And no changes were necessary on the application side. Our microservices continued to query the PizzaOrders table directly without bothering underlying partitions.

To Be Continued…

Alright, with this article we finished the review of partition pruning and maintenance capabilities that can improve performance and facilitate the design of your application. Check out this PostgreSQL resource to learn more.

In a follow-up article, you’ll learn how to use geo-partitioning to pin pizza orders to a specific geographic location. After all, we’ve been working on the application for a large pizza chain that feeds and delights customers across countries and continents. Stay tuned!

Database application Data (computing) Partition (database) PostgreSQL

Opinions expressed by DZone contributors are their own.

Related

  • How to Build a Full-Stack App With Next.js, Prisma, Postgres, and Fastify
  • Apache Spark for the Impatient
  • Schema Change Management Tools: A Practical Overview
  • Providing Enum Consistency Between Application and Data

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!