DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Because the DevOps movement has redefined engineering responsibilities, SREs now have to become stewards of observability strategy.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Related

  • Setting Up Failover Slots in PostgreSQL-17
  • PostgreSQL Performance Metrics
  • Building a Real-Time Change Data Capture Pipeline With Debezium, Kafka, and PostgreSQL
  • Comparing Managed Postgres Options on The Azure Marketplace

Trending

  • Using Python Libraries in Java
  • Bridging UI, DevOps, and AI: A Full-Stack Engineer’s Approach to Resilient Systems
  • Developers Beware: Slopsquatting and Vibe Coding Can Increase Risk of AI-Powered Attacks
  • Proactive Security in Distributed Systems: A Developer’s Approach
  1. DZone
  2. Data Engineering
  3. Databases
  4. PostgreSQL BiDirectional Replication

PostgreSQL BiDirectional Replication

Previously we ran Debezium in Embedded mode. In this blog, we shall focus on replication and, more specifically, bidirectional replication.

By 
Emmanouil Gkatziouras user avatar
Emmanouil Gkatziouras
DZone Core CORE ·
Jun. 28, 24 · Tutorial
Likes (3)
Comment
Save
Tweet
Share
10.8K Views

Join the DZone community and get the full member experience.

Join For Free

As you can understand from my previous blogs I am really into PostgreSQL.

PostgreSQL

Previously we ran Debezium in Embedded mode. Behind the scenes, Debezium consumes the changes that were committed to the transaction log. This happens by utilizing the logical decoding feature of PostgreSQL.

In this blog, we shall focus on replication and more specifically bidirectional replication. To achieve bidirectional replication in PostgreSQL we need the module pglogical. You might wonder about the difference between logical decoding and pglogical. Essentially, logical decoding has its origins in PgLocigal. View PgLocial as a more featureful module while logical decoding is embedded in a PostgreSQL distribution.

We will create a custom PostgreSQL Docker image and install PgLogical.

Shell
 
# Use the official PostgreSQL image as base
FROM postgres:15
USER root
RUN apt-get update; apt-get install postgresql-15-pglogical -y
USER postgres


Also, we need to have a PostgreSQL configuration that will enable PgLogical replication and conflict resolution.

Shell
 
listen_addresses = '*'
port = 5432
max_connections = 20
shared_buffers = 128MB
temp_buffers = 8MB
work_mem = 4MB
wal_level = logical
max_wal_senders = 3
track_commit_timestamp = on
shared_preload_libraries = 'pglogical'
pglogical.conflict_resolution = 'first_update_wins'


Let’s break this down. We added pglogical and we enabled track_commit_timestamp. By enabling this parameter PostgreSQL tracks the commit time of transactions. This will be crucial for the conflict resolution strategy.

Now let’s see the conflict resolution. We selected ‘first_update_wins’; therefore, in case of two transactions operating on the same row, the transaction that finished first will be the one to be considered.

Bidirectional replication is set up upon a table. Since we use Docker we shall provide an initialization script to PostgreSQL.

The script will:

  • Enable pglogical
  • Create the table
  • Add a target node
  • Insert the row we shall run tests upon
PLSQL
 
#!/bin/bash
set -e
 
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
  ALTER SYSTEM RESET shared_preload_libraries;
  CREATE EXTENSION pglogical;
 
  create schema test_schema;
  create table test_schema.employee(
          id  SERIAL PRIMARY KEY,
          firstname   TEXT    NOT NULL,
          lastname    TEXT    NOT NULL,
          email       TEXT    not null,
          age         INT     NOT NULL,
          salary         real,
          unique(email)
      );
 
  SELECT pglogical.create_node(
      node_name := '$TARGET',
      dsn := 'host=$TARGET port=5432 dbname=$POSTGRES_DB user=$POSTGRES_USER password=$POSTGRES_PASSWORD');
 
  SELECT pglogical.replication_set_add_table('default', 'test_schema.employee', true);
 
  insert into test_schema.employee (id,firstname,lastname,email,age,salary) values (1,'John','Doe 1','john1@doe.com',18,1234.23);
 
 
EOSQL


Let’s create the instances now using Docker Compose.

YAML
 
version: '3.1'
 
services:
  postgres-a:
    build: ./pglogicalimage
    restart: always
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      TARGET: postgres-b
    volumes:
      - ./config/postgresql.conf:/etc/postgresql/postgresql.conf
      - ./init:/docker-entrypoint-initdb.d
    command:
      - "-c"
      - "config_file=/etc/postgresql/postgresql.conf"
    ports:
      - 5431:5432
  postgres-b:
    build: ./pglogicalimage
    restart: always
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      TARGET: postgres-a
    volumes:
      - ./config/postgresql.conf:/etc/postgresql/postgresql.conf
      - ./init:/docker-entrypoint-initdb.d
    command:
      - "-c"
      - "config_file=/etc/postgresql/postgresql.conf"
    ports:
      - 5432:5432


We can get our instances up and running by issuing:

docker compose up

Docker Compose V2 is out there with many good features, you can find more about it in the book I authored: A Developer’s Essential Guide to Docker Compose.

Since both instances are up and running we need to enable the replication. Therefore we shall subscribe the nodes to each other.

Execute on the first node:

PLSQL
 
SELECT pglogical.create_subscription(
  subscription_name := 'postgres_b',
  provider_dsn := 'host=postgres-b port=5432 dbname=postgres user=postgres password=postgres',
  synchronize_data := false,
  forward_origins := '{}' );


Execute at the second node:

PLSQL
 
SELECT pglogical.create_subscription(
  subscription_name := 'postgres_a',
  provider_dsn := 'host=postgres-a port=5432 dbname=postgres user=postgres password=postgres',
  synchronize_data := false,
  forward_origins := '{}' );


You can use any PostgreSQL client that suits you. Alternatively, you can just use the psql client that comes packaged with the Docker Images.

For example:

Login to the first node:

 docker compose exec postgres-a psql  --username postgres --dbname postgres

Login to the second node:

docker compose exec postgres-b psql  --username postgres --dbname postgres

Let’s see how conflict resolution will work now.

On the first node, we shall run the following snippet:

PLSQL
 
BEGIN;
UPDATE test_schema.employee SET lastname='first wins';
 
#before committing start transaction on postgres-b
 
COMMIT;


Don’t press commit immediately, instead take the time and before you commit the transaction start the following transaction on the second node.

PLSQL
 
BEGIN;
UPDATE test_schema.employee SET lastname='second looses';
 
#make sure transaction on node postgres-a is committed first.
 
COMMIT;


This transaction will be committed after the transaction that takes place in postgres-a.

Let’s check the logs on postgres-a-1: 

PLSQL
 
postgres-a-1  | 2024-05-01 07:10:45.128 GMT [70] LOG:  CONFLICT: remote UPDATE on relation test_schema.employee (local index employee_pkey). Resolution: keep_local.
postgres-a-1  | 2024-05-01 07:10:45.128 GMT [70] DETAIL:  existing local tuple {id[int4]:1 firstname[text]:John lastname[text]:first wins email[text]:john1@doe.com age[int4]:18 salary[float4]:1234.23} xid=748,origin=0,timestamp=2024-05-01 07:10:42.269227+00; remote tuple {id[int4]:1 firstname[text]:John lastname[text]:second looses email[text]:john1@doe.com age[int4]:18 salary[float4]:1234.23} in xact origin=1,timestamp=2024-05-01 07:10:45.125791+00,commit_lsn=0/16181C0
postgres-a-1  | 2024-05-01 07:10:45.128 GMT [70] CONTEXT:  apply UPDATE from remote relation test_schema.employee in commit before 0/16181C0, xid 747 committed at 2024-05-01 07:10:45.125791+00 (action #2) from node replorigin 1


The transaction that took place on postgres-a finished first. Postgres-a received the replication data from the transaction of node postgres-b. A comparison was issued on the commit timestamp because the commit timestamp of the transaction on postgres-a was earlier the resolution was to keep the local changes.

We can see the reverse on postgres-b:

PLSQL
 
postgres-b-1 | 2024-05-01 07:10:45.127 GMT [81] LOG: CONFLICT: remote UPDATE on relation test_schema.employee (local index employee_pkey). Resolution: apply_remote.
postgres-b-1 | 2024-05-01 07:10:45.127 GMT [81] DETAIL: existing local tuple {id[int4]:1 firstname[text]:John lastname[text]:second looses email[text]:john1@doe.com age[int4]:18 salary[float4]:1234.23} xid=747,origin=0,timestamp=2024-05-01 07:10:45.125791+00; remote tuple {id[int4]:1 firstname[text]:John lastname[text]:first wins email[text]:john1@doe.com age[int4]:18 salary[float4]:1234.23} in xact origin=1,timestamp=2024-05-01 07:10:42.269227+00,commit_lsn=0/1618488
postgres-b-1 | 2024-05-01 07:10:45.127 GMT [81] CONTEXT: apply UPDATE from remote relation test_schema.employee in commit before 0/1618488, xid 748 committed at 2024-05-01 07:10:42.269227+00 (action #2) from node replorigin 1


Let’s check the result in the database.

PLSQL
 
postgres=# SELECT*FROM test_schema.employee;
 id | firstname |  lastname  |     email     | age | salary  
----+-----------+------------+---------------+-----+---------
  1 | John      | first wins | john1@doe.com |  18 | 1234.23


As expected the first transaction is the one that stayed.

To wrap it up:

  • We started two transactions in parallel
  • We changed the same row
  • We accepted the changes of the transaction that finished first

That’s it. Hope you had some fun and now you have another tool for your needs. In the next blog, we shall examine PostgreSQL’s driver capabilities and how we can configure an automated failover to another instance.

Replication (computing) PostgreSQL

Published at DZone with permission of Emmanouil Gkatziouras, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Setting Up Failover Slots in PostgreSQL-17
  • PostgreSQL Performance Metrics
  • Building a Real-Time Change Data Capture Pipeline With Debezium, Kafka, and PostgreSQL
  • Comparing Managed Postgres Options on The Azure Marketplace

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!