Analyze MySQL Audit Logs With ClickHouse and ClickTail

Learn how to do some advanced work with MySQL audit logs — specifically, analyzing them — with ClickHouse and ClickTail.

In this blog post, I'll look at how you can analyze MySQL audit logs with ClickHouse and ClickTail.

Audit logs are available with a free plugin for Percona Server for MySQL. Besides providing insights about activity on your server, you might need the logs for compliance purposes.

However, on an active server, the logs can grow very large. Under a sysbench-tpcc workload, for example, I was able to generate 24GB worth of logs in just one hour.
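
If the audit log plugin is not enabled yet, a minimal setup on Percona Server might look roughly like this (a sketch, not a complete guide; note that audit_log_format is read-only at runtime, so the JSON format used in this post has to be configured in my.cnf before the server starts):

INSTALL PLUGIN audit_log SONAME 'audit_log.so';
-- audit_log_format and audit_log_file are read-only at runtime, so set
-- audit_log_format = JSON in my.cnf and restart to get the JSON output
-- that ClickTail parses below.
SHOW GLOBAL VARIABLES LIKE 'audit_log%';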

So, we are going to use the ClickTail tool, which Peter Zaitsev mentioned in Analyze Your Raw MySQL Query Logs With ClickHouse, and which the Altinity team describes in its ClickTail Introduction.

ClickTail extracts all of the fields available in Percona Server for MySQL's JSON-format audit log, as you can see in its schema. I used the command:

clicktail --dataset='clicktail.mysql_audit_log' --parser=mysqlaudit --file=/mnt/nvmi/mysql/audit.log --backfill

In my setup, ClickTail imported records at a rate of 1.5 to 2 million per minute. Once ClickTail has loaded the data, we can start working on the audit logs. Below are some example queries.
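
As a quick sanity check that the backfill landed, we can look at the row count and the time range covered (a minimal sketch; the table name comes from the --dataset value passed to ClickTail above):

SELECT
    count(*),
    min(_time),
    max(_time)
FROM mysql_audit_log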

Check whether any queries were run with errors:

SELECT
    status AS c1,
    count(*)
FROM mysql_audit_log
GROUP BY c1
┌───c1─┬──count()─┐
│    0 │ 46197504 │
│ 1160 │        1 │
│ 1193 │     1274 │
│ 1064 │     5096 │
└──────┴──────────┘
4 rows in set. Elapsed: 0.018 sec. Processed 46.20 million rows, 184.82 MB (2.51 billion rows/s., 10.03 GB/s.)

First, it is very impressive to see ClickHouse analyze 2.5 billion rows per second. Second, there really are some queries with non-zero (error) statuses.
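
Before looking at individual statements, it can help to see whether the errors cluster in time; bucketing by hour is one option (a sketch using ClickHouse's toStartOfHour on the _time column that ClickTail stores):

SELECT
    toStartOfHour(_time) AS hour,
    status,
    count(*)
FROM mysql_audit_log
WHERE status != 0
GROUP BY
    hour,
    status
ORDER BY hour ASC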

We can dig in and check what exactly caused the 1193 errors (MySQL Error Code 1193: Unknown system variable):

SELECT *
FROM mysql_audit_log
WHERE status = 1193
LIMIT 1
┌───────────────_time─┬──────_date─┬─_ms─┬─command_class─┬─connection_id─┬─db─┬─host──────┬─ip─┬─name──┬─os_user─┬─os_login─┬─os_version─┬─mysql_version─┬─priv_user─┬─proxy_user─┬─record───────────────────────┬─sqltext────────────────────────────┬─status─┬─user──────────────────────┬─startup_optionsi─┐
│ 2018-03-12 20:34:49 │ 2018-03-12 │   0 │ select        │          1097 │    │ localhost │    │ Query │         │          │            │               │           │            │ 39782055_2018-03-12T20:21:21 │ SELECT @@query_response_time_stats │   1193 │ root[root] @ localhost [] │                  │
└─────────────────────┴────────────┴─────┴───────────────┴───────────────┴────┴───────────┴────┴───────┴─────────┴──────────┴────────────┴───────────────┴───────────┴────────────┴──────────────────────────────┴────────────────────────────────────┴────────┴───────────────────────────┴──────────────────┘

So, this was SELECT @@query_response_time_stats, which I believe comes from the Percona Monitoring and Management (PMM) MySQL Metrics exporter.
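
To confirm that all of the 1193 errors come from that single statement, we can group the matching rows by statement text (a quick sketch):

SELECT
    sqltext,
    count(*)
FROM mysql_audit_log
WHERE status = 1193
GROUP BY sqltext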

Similarly, we can check what query types were run on MySQL:

SELECT
    command_class,
    count(*)
FROM mysql_audit_log
GROUP BY command_class
┌─command_class────────┬──count()─┐
│                      │    15882 │
│ show_storage_engines │     1274 │
│ select               │ 26944474 │
│ error                │     5096 │
│ show_slave_status    │     1274 │
│ begin                │  1242555 │
│ update               │  9163866 │
│ show_tables          │      204 │
│ show_status          │     6366 │
│ insert_select        │      170 │
│ delete               │   539058 │
│ commit               │  1237074 │
│ create_db            │        2 │
│ show_engine_status   │     1274 │
│ show_variables       │      450 │
│ set_option           │     8102 │
│ create_table         │      180 │
│ rollback             │     5394 │
│ create_index         │      120 │
│ insert               │  7031060 │
└──────────────────────┴──────────┘
20 rows in set. Elapsed: 0.120 sec. Processed 46.20 million rows, 691.84 MB (385.17 million rows/s., 5.77 GB/s.)
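
Because every record also carries a timestamp, the same breakdown can be bucketed over time; for example, per-minute write activity (a sketch using toStartOfMinute, with the command_class filter as an assumption you can adjust):

SELECT
    toStartOfMinute(_time) AS minute,
    command_class,
    count(*)
FROM mysql_audit_log
WHERE command_class IN ('insert', 'update', 'delete')
GROUP BY
    minute,
    command_class
ORDER BY minute ASC
LIMIT 20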

There are more fields available for understanding who accessed a MySQL instance and from where, such as the following (put to use in the query below):

db String,
host String,
ip String,
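
A quick who-connected-from-where breakdown using these fields might look like this (a sketch; the column names are taken from the audit log output shown earlier):

SELECT
    user,
    host,
    ip,
    count(*)
FROM mysql_audit_log
GROUP BY
    user,
    host,
    ip
ORDER BY count(*) DESC
LIMIT 10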

If you ever need to do some advanced work with MySQL audit logs, consider doing it with ClickHouse and ClickTail!
