DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Using PostgreSQL pgoutput Plugin for Change Data Capture With Debezium
  • Database Query Service With OpenAI and PostgreSQL in .NET
  • PostgreSQL 12 End of Life: What to Know and How to Prepare
  • Controlling Access to Google BigQuery Data

Trending

  • Performance Optimization Techniques for Snowflake on AWS
  • AI’s Role in Everyday Development
  • Fixing Common Oracle Database Problems
  • Scaling Mobile App Performance: How We Cut Screen Load Time From 8s to 2s
  1. DZone
  2. Data Engineering
  3. Databases
  4. What Is Incremental View Maintenance (IVM)?

What Is Incremental View Maintenance (IVM)?

Incremental View Maintenance (IVM) provides a method for keeping materialized views current by calculating and applying only incremental changes.

By 
hubert dulay user avatar
hubert dulay
·
Mar. 04, 24 · Tutorial
Likes (1)
Comment
Save
Tweet
Share
2.8K Views

Join the DZone community and get the full member experience.

Join For Free

Incremental View Maintenance (IVM) provides a method for keeping materialized views current by calculating and applying only the incremental changes, as opposed to the complete recomputation of contents performed by the REFRESH MATERIALIZED VIEW command.

Materialized View in PostgreSQL

A materialized view is a database object that stores the result set of a query as a physical table, persisting the computed data for improved query performance. In contrast to regular views, materialized views contain actual data rather than merely defining a query.

Materialized View in PostgreSQL

These views are advantageous for complex queries or aggregations involving large datasets, as they reduce computational overhead by storing precomputed results. Materialized views contribute to faster data retrieval, optimize specific queries, and support offline access, making them valuable in scenarios such as data warehousing, business intelligence, and decision support systems.

In PostgreSQL, you can create a materialized view using this syntax. For example:

SQL
 
create materialized view rental_customer as
select
    r.*,
    c.first_name,
    c.last_name
from
    customer c
    join rental r on c.customer_id = r.customer_id;


What you get is a table that is enriched with a customer’s first and last names.

SQL
 
dvdrental=# \d rental_customer;

Materialized view "public.rental_customer"
    Column    |            Type             |
--------------+-----------------------------+
 rental_id    | integer                     |
 rental_date  | timestamp without time zone |
 inventory_id | integer                     |
 customer_id  | smallint                    |
 return_date  | timestamp without time zone |
 staff_id     | smallint                    |
 last_update  | timestamp without time zone |
 first_name   | character varying(45)       |
 last_name    | character varying(45)       |


If we modify the customer or rental table, the materialized view is NOT updated with the latest changes. You must execute a refresh command to force the materialized view to recompute the entire dataset.

SQL
 
REFRESH MATERIALIZED VIEW rental_customer;


Two obvious disadvantages with materialized views in Postgres:

  1. It needs to be manually updated. Queries against it may not be fresh.
  2. When the REFRESH is invoked, the entire dataset needs to be reprocessed.

These disadvantages limit materialized views from real-time use cases. IVM can help by providing a method for keeping materialized views current.

PG_IVM Postgres Extension

IVM isn’t a feature that comes with Postgres. It’s instead available as a Postgres extension called pg_ivm. This extension can be cloned from here. Build the project and install it in Postgres with this command.

SQL
 
CREATE EXTENSION pg_ivm;

-- Create the IVM and select from the materialized view at the same time
select * from create_immv(
  'customer_count', 
  'select count(*) from customer'
); 

ALTER TABLE public.customer_count REPLICA IDENTITY DEFAULT;

insert into customer (store_id, first_name, last_name, address_id, active) values( 1, 'foo', 'bar', 5, 1);

-- see updated materialized view
select * from customer_count;


(You can read more about replica identity in Postgres here.)

Automatic maintenance features ensure that materialized views stay up-to-date with changes in the underlying data. pg_ivm does this incrementally.

Incremental updates refer to a method of modifying or refreshing data in a system by making only the necessary changes or additions rather than recomputing and updating the entire dataset. This approach is particularly valuable in scenarios where the overall dataset is large and frequent updates occur. Instead of processing and applying changes to the entire dataset, incremental updates identify and apply only the modifications that have occurred since the last update. This targeted updating process minimizes computational resources and reduces the time required to maintain data consistency.

Despite IVM’s benefits, considerations such as storage space and update latency must be weighed when using IVM materialized views in a database application. You may need a second Postgres that can be scaled separately from the primary instance. Alternatively, you can enable the view and consume the results externally.

Change Data Capture (CDC)

We can take our original materialized view that enriches rentals with customer information and instead create an IVM using pg_ivm and make it available via CDC.

SQL
 
select * create_immv('rental_customer','select
    r.*,
    c.first_name,
    c.last_name
from
    rental r 
    join customer c on c.customer_id = r.customer_id');

ALTER TABLE public.rental_customer REPLICA IDENTITY DEFAULT;

-- Needed if you're using Airbyte
CREATE PUBLICATION airbyte_publication FOR ALL TABLES;


Don’t forget to set the REPLICA IDENTITY.

This will allow CDC solutions like the Debezium server, Striim, or Airbyte to capture changes to the IVM materialized view and deliver it to an analytical system like Apache Pinot with UPSERT capabilities.

Flink Postgres CDC Connector

You can now capture the enriched Rental data via CDC using Ververica’s Postgres CDC connector (download here).

SQL
 
CREATE TABLE pgrental_customer (
    rental_id int,
    rental_date timestamp(3),
    inventory_id int,
    customer_id int,
    return_date timestamp(3),
    staff_id int,
    last_update timestamp(3),
    first_name string,
    last_name string
) WITH (
    'connector' = 'postgres-cdc', -- postgres cdc connector
    'hostname' = 'localhost',
    'port' = '5432',
    'username' = 'postgres',
    'password' = 'postgres',
    'database-name' = 'dvdrental',
    'schema-name' = 'public',
    'table-name' = 'rental_customer',
    'slot.name' = 'pgrental_customer',
    'decoding.plugin.name'='pgoutput'
);


Does IVM Make Postgres A Streaming Database?

Not completely.

One main difference is streaming databases have the ability to consume streams from a streaming platform like Kafka and represent them as tables. Another difference is the streaming database’s ability to process events and align them by time before performing a join or aggregation, all while maintaining consistency. In IVM, time is implied, which is much easier to reason with.

Change data capture Database Materialized view PostgreSQL

Published at DZone with permission of hubert dulay. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Using PostgreSQL pgoutput Plugin for Change Data Capture With Debezium
  • Database Query Service With OpenAI and PostgreSQL in .NET
  • PostgreSQL 12 End of Life: What to Know and How to Prepare
  • Controlling Access to Google BigQuery Data

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!