DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • How To Use KubeDB and Postgres Sidecar for Database Integrations in Kubernetes
  • Kubernetes Evolution: Transitioning from etcd to Distributed SQL
  • Advanced Kubernetes Setup for Spring Boot App With PostgreSQL DB
  • Using Envoy Proxy’s PostgreSQL and TCP Filters to Collect Yugabyte SQL Statistics

Trending

  • SaaS in an Enterprise - An Implementation Roadmap
  • Next Evolution in Integration: Architecting With Intent Using Model Context Protocol
  • Building Reliable LLM-Powered Microservices With Kubernetes on AWS
  • AI-Driven Root Cause Analysis in SRE: Enhancing Incident Resolution
  1. DZone
  2. Software Design and Architecture
  3. Cloud Architecture
  4. PostgreSQL HA and Kubernetes

PostgreSQL HA and Kubernetes

I share my thoughts about how to set up a PostgreSQL Database in Kubernetes with some level of high availability, introducing 3 different architectural styles to do so.

By 
Ralph Soika user avatar
Ralph Soika
·
Jul. 30, 21 · Analysis
Likes (3)
Comment
Save
Tweet
Share
18.1K Views

Join the DZone community and get the full member experience.

Join For Free

In the following, I will share my thoughts about how to set up a PostgreSQL Database in Kubernetes with some level of high availability. For that, I will introduce 3 different architectural styles. I do not make a recommendation here because, as always every solution has its pros and cons.

1. Running PostgreSQL Outside of Kubernetes

In the first scenario, you run PostgreSQL outside form Kubernetes. This means Kubernetes does not know anything about the database. This situation is often a result of a historical architecture where PostgreSQL was long before Kubernetes in an evolving architecture.

Running PostgreSQL Outside of Kubernetes

The challenge here is that you need a deep understanding of how to install and set up PostgreSQL. The PostgreSQL cluster consists of a Master Node and a Standby Node. In case the Master Node failed for some reason (e.g., hardware or network defect) the standby server can overtake the role of the master. The PG-Bouncer in this picture is a component from Postgres and acts as a kind of reverse proxy server. In case of a failure, the switch from the Master to the Standby node can be done by an administrator or can be automated by scripts. From the view of a client, this switch is transparent.

From the perspective of Kubernetes, the PostgreSQL database is just an external resource that is not under control from Kubernetes. A business application running on a POD in Kubernetes connects directly via a public or shared network to the PG-Bouncer. If the PostgreSQL database is not reachable for some reason, Kubernetes can do nothing about this. So in this case you need some kind of additional monitoring.

2. Running PostgreSQL Inside of Kubernetes

In the second scenario, you run the PostgreSQL Cluster inside of Kubernetes. The Postgre- master- and standby-nodes as also the PG Bounder are now running on PODs inside the Kubernetes Cluster. There are a lot of tutorials on how to set up and deploy PostgreSQL in Kubernetes. The crunchy data project provides an open-source deployment configuration for Kubernetes.

The failover control of the Master and Standby PODs can now be handled by Kubernetes – e.g. with Kubernetes Operators. Your business application still connects to the PG-Bouncer which is now running on a POD inside Kubernetes. All the communication is now handled by the internal network of Kubernetes. 

The difficult part here is the storage. Kubernetes did not provide storage by itself, but it provides a lot of solutions based on the container storage interface (CSI). You need to set up a storage solution by yourself – either running inside your Kubernetes cluster or as an external solution. And of course, you still need a good understanding of PostgreSQL and the Kubernetes Operator concept. 

3. Running PostgreSQL on a Distributed Block Storage

So far both solutions were based on the HA concept of PostgreSQL using a PG-Bouncer and  Master and Standby nodes. In both cases, you need a good understanding of PostgreSQL and you need to either set up PostgreSQL on a separate infrastructure or you need to provide a Kubernetes storage solution. 

My last architectural concept is a more lightweight approach which first focuses on the question about storage. The storage problem in Kubernetes needs to be solved anyway, independent of the question if you run a PostgreSQL database or not. Most applications need some kind of persistence volume. For example, WordPress expects not only a  database but also a filesystem to store the media content. So storage should always be solved first if you start with Kubernetes. As explained before, Kubernetes provides a lot of concepts on how to deal with storage and the so-called persistence volumes.  

A common solution is a distributed block storage like provided by the open-source projects Longhorn or Ceph. Both systems integrate well into Kubernetes and can be set up easily. If you have already connected such a distributed block storage with your Kubernetes cluster, running a PostgreSQL database can become quite simple:

In this scenario each business application has its own PostgreSQL Database running in a POD. The data volume is provided by a Ceph storage system using the Ceph CSI plugin. Kubernetes controls the PostgreSQL POD and the storage volume. If an OSD or block device in Ceph has a failure, this is handled by the Ceph cluster and Kubernetes will react to failures thanks to the container storage interface (CSI). If something in Kubernetes goes wrong (e.g. a failure of a worker node) Kubernetes will reschedule the PostgreSQL POD internally.  With a Liveness and Readiness Check you can monitor your PostgreSQL database:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  labels: 
    app: postgres
spec:
  replicas: 1
  selector: 
    matchLabels:
      app: postgres
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - env:
        - name: POSTGRES_DB
          value: mydb
        - name: POSTGRES_PASSWORD
          value: xxxx
        - name: POSTGRES_USER
          value: myuser
        image: postgres:9.6.1
        name: postgres
        # Readiness and Liveness Probe
        readinessProbe:
          exec:
            command: ["psql", "-Umyuser", "-dmydb", "-c", "SELECT 1"]
          initialDelaySeconds: 10
          timeoutSeconds: 10
        livenessProbe:
          exec:
            command: ["psql", "-Umyuser", "-dmydb", "-c", "SELECT 1"]
          initialDelaySeconds: 30
          timeoutSeconds: 10
        ports:
          - containerPort: 5432        
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: dbdata
          subPath: postgres
      restartPolicy: Always
      volumes:
      - name: dbdata
        persistentVolumeClaim:
          claimName: dbdata


Conclusion

As I said in the beginning, I don’t want to make any recommendations, as every architecture has its pros and cons.  You have to think about what High Availability (HA) means to you and your project. PostgreSQL itself is very robust and stable and running it as a service in a single Kubernetes POD is a valid solution. In combination with a distributed block storage, this can be a lightweight and stable solution.  If you already have a PostgreSQL HA Cluster up and running, use it, and don’t try to squeeze everything into Kubernetes.

Kubernetes PostgreSQL

Published at DZone with permission of Ralph Soika. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • How To Use KubeDB and Postgres Sidecar for Database Integrations in Kubernetes
  • Kubernetes Evolution: Transitioning from etcd to Distributed SQL
  • Advanced Kubernetes Setup for Spring Boot App With PostgreSQL DB
  • Using Envoy Proxy’s PostgreSQL and TCP Filters to Collect Yugabyte SQL Statistics

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!