Red Hat Integration CDC Debezium Connectors Are GA Now
Red Hat is announcing the general availability (GA) of the Debezium Apache Kafka connectors for change data capture (CDC).
Red Hat is announcing the general availability (GA) of the Debezium Apache Kafka connectors for change data capture (CDC) as part of the 2020-Q1 release of Red Hat Integration. With Red Hat Integration CDC connectors, you get access to the benefits of open source for the enterprise—like community-driven upstream innovation—delivered with enterprise-level support to help your organization safely use open source technology.
Red Hat Integration is a comprehensive set of integration and messaging technologies to connect applications and data across hybrid infrastructures. It is an agile, distributed, containerized, and API-centric solution. It provides service composition and orchestration, application connectivity and data transformation, real-time message streaming, change data capture and API management—all combined with a cloud-native platform and toolchain to support the full spectrum of modern application development.
This Red Hat Integration release provides full support for Debezium connectors for capturing changes from the following databases:
MySQL Connector
PostgreSQL Connector
Additionally, the following connectors are included and delivered as Technical Preview:
MongoDB Connector
SQL Server Connector
The Debezium connectors are based on the popular Apache Kafka Connect API and are suitable for deployment alongside Red Hat AMQ Streams Kafka clusters. See the complete list of supported configurations for more information.
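Because the connectors build on Kafka Connect, they are registered with a running Connect cluster via its REST API. As an illustrative sketch only (the connector name, hostnames, credentials, and topic names below are placeholders, not values from this release), a Debezium MySQL connector registration might look like:

```json
{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "184054",
    "database.server.name": "dbserver1",
    "database.history.kafka.bootstrap.servers": "my-cluster-kafka-bootstrap:9092",
    "database.history.kafka.topic": "schema-changes.inventory"
  }
}
```

POSTing a JSON document along these lines to the Kafka Connect REST endpoint (the `/connectors` path, port 8083 by default) starts the connector, which then begins streaming change events into Kafka topics.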
What is Change Data Capture (CDC)?
Change Data Capture, or CDC, is a well-established software design pattern for a system that monitors and captures the changes in data so that other software can respond to those changes. CDC captures row-level changes to database tables and passes corresponding change events to a data streaming bus. Applications can read these change event streams and access the change events in the order in which they occurred.
Meanwhile, Debezium is a set of distributed services that captures row-level changes in databases so that applications can see and respond to those changes. Debezium connectors record all events to a Red Hat AMQ Streams Kafka cluster, and applications consume those events through AMQ Streams.
Using CDC and Event-Driven Microservices
Debezium's log-based change data capture has the advantage of capturing every data change registered in the database, providing a reliable source of events. Because changes are read from the transaction log, there is no polling delay or query overhead, and no risk of missing events between polls. It is also transparent to applications and data models, avoiding the need to pollute the current system design.
Additionally, it allows the capture of delete events, as well as the old record state and further metadata, which can be shared as part of the event for downstream processing. The Debezium change event structure includes the table's primary key, the previous and current state of the row, and metadata about the source of the change. These events can be serialized in the familiar JSON or Avro formats, while support for CloudEvents is coming in a future release.
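As a rough sketch of that envelope (the field values here are invented for illustration; the `before`/`after`/`source`/`op` layout follows Debezium's documented event structure), a consumer might diff the old and new row state like this:

```python
import json

# A simplified Debezium change event payload for an UPDATE on a
# hypothetical `customers` table. The envelope layout (before/after row
# state, source metadata, an op code, and a timestamp) follows Debezium;
# the values are invented for illustration.
event = json.loads("""
{
  "payload": {
    "before": {"id": 1001, "email": "old@example.com"},
    "after":  {"id": 1001, "email": "new@example.com"},
    "source": {"connector": "postgresql", "table": "customers"},
    "op": "u",
    "ts_ms": 1580000000000
  }
}
""")

payload = event["payload"]
# Debezium op codes: "c" = create, "u" = update, "d" = delete, "r" = snapshot read
if payload["op"] == "u":
    changed = {k: v for k, v in payload["after"].items()
               if payload["before"].get(k) != v}
    print(changed)  # → {'email': 'new@example.com'}
```

Having both the old and new state in every event is what makes downstream uses like auditing and incremental view maintenance straightforward.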
Some of the use cases where change data capture becomes pretty useful include:
Data replication to other databases, to feed data to other teams, or as streams for analytics, data lakes, or data warehouses.
Microservices data exchange, to propagate data between different services without coupling, to keep optimized local views, or to support monolith-to-microservices migrations.
General usage, including auditing, cache invalidation, full-text search indexing, and updating CQRS read models.
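Taking cache invalidation as one concrete case, a minimal sketch (the cache layout and `handle_event` helper are hypothetical, not part of Debezium; in practice the payloads would be consumed from Kafka topics) could apply change events like this:

```python
# Hypothetical sketch of cache invalidation driven by Debezium change
# events. The cache layout and `handle_event` helper are illustrative;
# in practice the payloads would arrive via a Kafka consumer.
cache = {"customers:1001": {"id": 1001, "email": "old@example.com"}}

def handle_event(payload):
    """Apply one Debezium change event payload to the local cache."""
    table = payload["source"]["table"]
    if payload["op"] == "d":
        # Delete events carry the removed row in "before"; drop the entry.
        cache.pop(f"{table}:{payload['before']['id']}", None)
    else:
        # Create ("c"), update ("u"), and snapshot read ("r") events
        # carry the new row state in "after"; upsert it.
        row = payload["after"]
        cache[f"{table}:{row['id']}"] = row

handle_event({"source": {"table": "customers"}, "op": "u",
              "before": {"id": 1001, "email": "old@example.com"},
              "after": {"id": 1001, "email": "new@example.com"}})
print(cache["customers:1001"]["email"])  # → new@example.com
```

Because delete events are captured too, the cache can drop entries for removed rows instead of serving stale data, which is exactly what polling-based approaches struggle with.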
Many organizations already use the upstream Debezium community project in production. One example is Convoy, a freight company with more than 700 employees, which built a low-latency Extract Load Transform (ELT) pipeline using Apache Kafka and Debezium after some prototyping and development. By monitoring row changes in Postgres with the Debezium Postgres source connector and flattening events with single message transformations, Convoy reduced data imports that previously took anywhere from 1-2 hours up to 6+ hours to just minutes.
As we can see, change data capture is one of the tools used to bridge traditional data stores and new cloud-native event-driven architectures. Red Hat Integration is a valuable 100% open source solution providing the components to use as the foundation for your systems.
Get started by downloading the Red Hat Integration Debezium CDC connectors now from the Red Hat Developers site. Also, don’t forget to check out Gunnar Morling’s webinar on Debezium and Kafka from the DevNation series or his talk at QCon.
Opinions expressed by DZone contributors are their own.