Streaming Data From MariaDB Server Into MariaDB ColumnStore via MariaDB MaxScale
Streaming Data From MariaDB Server Into MariaDB ColumnStore via MariaDB MaxScale
Read this article in order to learn how to stream data from MariaDB server into MariaDB Columnstore via MariaDB MaxScale.
Join the DZone community and get the full member experience.Join For Free
MariaDB TX, proven in production and driven by the community, is a complete database solution for any and every enterprise — a modern database for modern applications.
In this blog post, we look at how to configure Change Data Capture (CDC) from the MariaDB Server to MariaDB ColumnStore via MariaDB MaxScale. Our goal in this blog post is to have our analytical ColumnStore instance reflect the changes that happen on our operational MariaDB Server.
MariaDB MaxScale Configuration
We start by creating a MaxScale configuration with a binlogrouter and avrorouter instances. The former acts as a replication slave and fetches binary logs and the latter processes the binary logs into CDC records.
[replication-router] type=service router=binlogrouter user=maxuser passwd=maxpwd server_id=2 master_id=1 binlogdir=/var/lib/maxscale mariadb10-compatibility=1 filestem=mariadb-bin [replication-listener] type=listener service=replication-router protocol=MySQLClient port=3306 [avro-router] type=service router=avrorouter source=replication-router avrodir=/var/lib/maxscale [avro-listener] type=listener service=avro-router protocol=cdc port=4001
Copy the contents of this file into the `maxscale.cnf` file.
The docker-compose.yml File
The next step is to clone the MaxScale repository and to create the docker-compose file.
To clone the MaxScale repository, execute the following command.
git clone https://github.com/mariadb-corporation/MaxScale.git --branch=2.2 --depth=1
After the command completes, create the "docker-compose.yml" file with the following contents in the same directory where you cloned MaxScale.
version: '2' services: master: image: mariadb:10.2 container_name: master environment: MYSQL_ALLOW_EMPTY_PASSWORD: Y command: mysqld --log-bin=mariadb-bin --binlog-format=ROW --server-id=1 ports: - "3306:3306" maxscale: build: ./MaxScale/docker/ container_name: maxscale volumes: - ./maxscale.cnf:/etc/maxscale.cnf.d/maxscale.cnf ports: - "3307:3306" - "4001:4001" mcs: image: mariadb/columnstore_singlenode:latest container_name: mcs ports: - "3308:3306" adapter: image: centos:7 container_name: adapter command: /bin/sleep 0xffffffff
This file contains a MariaDB Server that acts as the master server, a MaxScale instance in a CDC configuration and a single-node ColumnStore container. We also use a plain CentOS 7 container where we install the adapter.
To start the cluster, run the following commands.
docker-compose build docker-compose up -d
The next step is to copy the ColumnStore configuration file from the "mcs" container and modify it to use the container hostname instead of the loopback address. To do this, execute the following commands.
docker cp mcs:/usr/local/mariadb/columnstore/etc/Columnstore.xml . sed -i 's/127.0.0.1/mcs/' Columnstore.xml docker cp Columnstore.xml adapter:/etc/Columnstore.xml
After we have copied the configuration file into the "adapter" container, we are ready to install the adapter.
To access the container, execute "docker-compose exec adapter bash." This will launch a new shell where the following commands will be executed.
yum -y install epel-release yum -y install https://downloads.mariadb.com/Data-Adapters/mariadb-columnstore-api/1.1.3/centos/x86_64/7/mariadb-columnstore-api-1.1.3-1-x86_64-centos7.rpm curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | bash yum -y install https://downloads.mariadb.com/Data-Adapters/mariadb-streaming-data-adapters/cdc-data-adapter/1.1.3/centos-7/mariadb-columnstore-maxscale-cdc-adapters-1.1.3-1-x86_64-centos7.rpm
After the adapter is installed, exit the shell.
Next, we can start preparing the data on the master server and configure the replication between it and MaxScale.
Preparing Data and Configuring Replication
We connect to the MariaDB Server running on the "master" container with the following command.
mysql -uroot -h 127.0.0.1 -P 3306
Once connected, executing the following SQL. This will prepare the server, create a table and insert some dummy data into the table. It also modified the data to emulate changes in the database.
RESET MASTER; CREATE USER 'maxuser'@'%' IDENTIFIED BY 'maxpwd'; GRANT ALL ON *.* TO 'maxuser'@'%'; CREATE DATABASE test; USE test; CREATE TABLE t1(id INT); INSERT INTO t1 VALUES (1), (2), (3); UPDATE t1 SET id = 4 WHERE id = 2; DELETE FROM t1 WHERE id = 3;
Once we have created some data, we configure the replication between MaxScale and the master server. To do this, execute the following command.
mysql -umaxuser -pmaxpwd -h 127.0.0.1 -P 3307 -e "CHANGE MASTER TO MASTER_HOST='master', MASTER_PORT=3306, MASTER_USER='maxuser', MASTER_PASSWORD='maxpwd', MASTER_LOG_FILE='mariadb-bin.000001', MASTER_LOG_POS=4; START SLAVE"
MaxScale will start to replicate events from the master server and process them into CDC records.
Create CDC User
To use the CDC system in MaxScale, we have to create a user for it. Execute the following command to create a user.
docker-compose exec maxscale maxctrl call command cdc add_user avro-router cdcuser cdcpassword
Starting the Adapter
We again execute the commands inside the adapter container. To access the container, execute "docker-compose exec adapter bash."
Once inside the container, we can try to start the adapter. Given that the table "test.t1" does not exist on ColumnStore, the adapter will give us an error when we try to start it:
[root@d444d5c5b820 /]$ mxs_adapter -c /etc/Columnstore.xml -u cdcuser -p cdcpassword -h maxscale -P 4001 test t1 Table not found, create with: CREATE TABLE test.t1 (domain int, event_number int, event_type varchar(50), id int, sequence int, server_id int, timestamp int) ENGINE=ColumnStore;
To create the table on ColumnStore, we have to exit the container. Once out of the container, we connect to the ColumnStore container and create the table described in the error message with the following command.
mysql -uroot -h 127.0.0.1 -P 3308 -e "CREATE TABLE test.t1 (domain int, event_number int, event_type varchar(50), id int, sequence int, server_id int, timestamp int) ENGINE=ColumnStore;"
Once the table is created, we go back into the adapter container with "docker-compose exec adapter bash" and try to start it again.
[root@d444d5c5b820 /]$ mxs_adapter -c /etc/Columnstore.xml -u cdcuser -p cdcpassword -h maxscale -P 4001 test t1 4 rows and 1 transactions inserted in 0.210798 seconds. GTID = 0-1-6 2 rows and 1 transactions inserted in 0.164197 seconds. GTID = 0-1-7
This time we see that it processed a total of six rows of data. We can now connect to the
ColumnStore on another terminal and see what the table contains.
[markusjm@localhost blog]$ mysql -uroot -h 127.0.0.1 -P 3308 -e "SELECT * FROM test.t1" +--------+--------------+---------------+------+----------+-----------+------------+ | domain | event_number | event_type | id | sequence | server_id | timestamp | +--------+--------------+---------------+------+----------+-----------+------------+ | 0 | 1 | insert | 1 | 5 | 1 | 1523948280 | | 0 | 2 | insert | 2 | 5 | 1 | 1523948280 | | 0 | 3 | insert | 3 | 5 | 1 | 1523948280 | | 0 | 1 | update_before | 2 | 6 | 1 | 1523948280 | | 0 | 2 | update_after | 4 | 6 | 1 | 1523948280 | | 0 | 1 | delete | 3 | 7 | 1 | 1523948281 | +--------+--------------+---------------+------+----------+-----------+------------+
The changes we did on the master MariaDB Server have been propagated to ColumnStore. To understand what the values are, we can map the SQL statements to the rows in the table.
The first SQL statement is "INSERT INTO t1 VALUES (1), (2), (3);", which inserts three values into the table. We see that the first three rows in the resultset are of type "insert" and the values match what we inserted.
The next SQL statement is "UPDATE t1 SET id = 4 WHERE id = 2;", which only touches one row. Although it modifies only one row in the database, it generated two rows in ColumnStore. This happened because the MaxScale CDC system stores both the before and after images of the modified row. This allows easy comparisons between new and old values.
The final SQL statement was "DELETE FROM t1 WHERE id = 3;", which deleted one row. This statement was converted to a delete entry with the data that was deleted (row with `id` of 3). This allows deleted data to be retained for analytical and auditing purposes without actually storing it on the master database.
Published at DZone with permission of Markus Mäkelä , DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.