
Identifying Useful Info From MySQL Row-based Binary Logs

By Peter Zaitsev · Feb. 12, 2015


Originally Written by Alok Pathak

As a MySQL DBA/consultant, it is part of my job to decode the MySQL binary logs – and there are a number of reasons for doing that. In this post, I’ll explain how you can get the important information about your write workload using MySQL row-based binary logs and a simple awk script.

First, it is important to understand that row-based binary logs contain the actual changes done by a query. For example, if I run a delete query against a table, the binary log will contain the rows that were deleted. MySQL provides the mysqlbinlog utility to decode the events stored in MySQL binary logs. You can read more about mysqlbinlog in the MySQL reference manual.

The following example illustrates how mysqlbinlog displays row events that specify data modifications. These correspond to events with the WRITE_ROWS_EVENT, UPDATE_ROWS_EVENT, and DELETE_ROWS_EVENT type codes.

We will use the following mysqlbinlog options:
--base64-output=decode-rows
--verbose, -v
--start-datetime="datetime"
--stop-datetime="datetime"

We have a server running with row-based binary logging.

mysql> show variables like '%binlog_format%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.00 sec)

We created a test table and executed the following sequence of statements within a transaction.

use test;
CREATE TABLE t
(
  id   INT NOT NULL,
  name VARCHAR(20) NOT NULL,
  date DATE NULL
) ENGINE = InnoDB;
START TRANSACTION;
INSERT INTO t VALUES(1, 'apple', NULL);
UPDATE t SET name = 'pear', date = '2009-01-01' WHERE id = 1;
DELETE FROM t WHERE id = 1;
COMMIT;

Now let’s see how it is represented in binary logs.

# mysqlbinlog --base64-output=decode-rows -vv  --start-datetime="2015-01-12 21:40:00"  --stop-datetime="2015-01-12 21:45:00"  mysqld-bin.000023
/*!*/;
# at 295
#150112 21:40:14 server id 1  end_log_pos 367 CRC32 0x19ab4f0f 	Query	thread_id=108	exec_time=0	error_code=0
SET TIMESTAMP=1421079014/*!*/;
BEGIN
/*!*/;
# at 367
#150112 21:40:14 server id 1  end_log_pos 415 CRC32 0x6b1f2240 	Table_map: `test`.`t` mapped to number 251
# at 415
#150112 21:40:14 server id 1  end_log_pos 461 CRC32 0x7725d174 	Write_rows: table id 251 flags: STMT_END_F
### INSERT INTO `test`.`t`
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3=NULL /* VARSTRING(20) meta=0 nullable=1 is_null=1 */
# at 461
#150112 21:40:14 server id 1  end_log_pos 509 CRC32 0x7e44d741 	Table_map: `test`.`t` mapped to number 251
# at 509
#150112 21:40:14 server id 1  end_log_pos 569 CRC32 0x0cd1363a 	Update_rows: table id 251 flags: STMT_END_F
### UPDATE `test`.`t`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3=NULL /* VARSTRING(20) meta=0 nullable=1 is_null=1 */
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='pear' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3='2009:01:01' /* DATE meta=0 nullable=1 is_null=0 */
# at 569
#150112 21:40:14 server id 1  end_log_pos 617 CRC32 0xf134ad89 	Table_map: `test`.`t` mapped to number 251
# at 617
#150112 21:40:14 server id 1  end_log_pos 665 CRC32 0x87047106 	Delete_rows: table id 251 flags: STMT_END_F
### DELETE FROM `test`.`t`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='pear' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */
###   @3='2009:01:01' /* DATE meta=0 nullable=1 is_null=0 */
# at 665
#150112 21:40:15 server id 1  end_log_pos 696 CRC32 0x85ffc9ff 	Xid = 465
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
  • The row with "Table_map: `test`.`t`" identifies the table the query changed.
  • The "Write_rows"/"Update_rows"/"Delete_rows" rows identify the event type.
  • Lines that start with "###" show the actual row data that changed.
  • Columns are represented as @1, @2, and so on.
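Those "###" lines are easy to parse mechanically. As a quick sketch (my addition, not part of the original script), the following awk one-liner pulls the column number and value out of canned sample lines in the decoded-row format:

```shell
# Parse mysqlbinlog "###   @N=value" lines into column-number/value pairs.
# The two input lines below are canned samples of decoded row-event output.
printf '%s\n' \
"###   @1=1 /* INT meta=0 nullable=0 is_null=0 */" \
"###   @2='apple' /* VARSTRING(20) meta=20 nullable=0 is_null=0 */" |
awk '/^### +@/ {
  eq = index($2, "=")    # $2 looks like @1=1 or @2=(quoted value)
  printf "column %s -> %s\n", substr($2, 2, eq - 2), substr($2, eq + 1)
}'
```

This prints "column 1 -> 1" and "column 2 -> 'apple'"; the trailing /* ... */ type annotations are ignored because only field $2 is used.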

Now have a look at our simple awk script, which takes the mysqlbinlog output and prints a summary of INSERT/UPDATE/DELETE events from row-based binary logs. Just replace "mysqld-bin.000023" with your binary log file. The string "#15" in the third line matches the year 2015; if you are decoding a binary log file from 2014, change it to "#14". It is also recommended to use the "--start-datetime" and "--stop-datetime" options to decode binary logs of a specific time range instead of decoding a large binary log file in full.

Script :

mysqlbinlog --base64-output=decode-rows -vv --start-datetime="2015-01-12 21:40:00"  --stop-datetime="2015-01-12 21:45:00" mysqld-bin.000023 | awk \
'BEGIN {s_type=""; s_count=0;count=0;insert_count=0;update_count=0;delete_count=0;flag=0;} \
{if(match($0, /#15.*Table_map:.*mapped to number/)) {printf "Timestamp : " $1 " " $2 " Table : " $(NF-4); flag=1} \
else if (match($0, /(### INSERT INTO .*..*)/)) {count=count+1;insert_count=insert_count+1;s_type="INSERT"; s_count=s_count+1;}  \
else if (match($0, /(### UPDATE .*..*)/)) {count=count+1;update_count=update_count+1;s_type="UPDATE"; s_count=s_count+1;} \
else if (match($0, /(### DELETE FROM .*..*)/)) {count=count+1;delete_count=delete_count+1;s_type="DELETE"; s_count=s_count+1;}  \
else if (match($0, /^(# at) /) && flag==1 && s_count>0) {print " Query Type : "s_type " " s_count " row(s) affected" ;s_type=""; s_count=0; }  \
else if (match($0, /^(COMMIT)/)) {print "[Transaction total : " count " Insert(s) : " insert_count " Update(s) : " update_count " Delete(s) : " \
delete_count "] \n+----------------------+----------------------+----------------------+----------------------+"; \
count=0;insert_count=0;update_count=0; delete_count=0;s_type=""; s_count=0; flag=0} } '

Output :

Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : INSERT 1 row(s) affected
Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : UPDATE 1 row(s) affected
Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : DELETE 1 row(s) affected
[Transaction total : 3 Insert(s) : 1 Update(s) : 1 Delete(s) : 1]
+----------------------+----------------------+----------------------+----------------------+

This awk script prints the following columns for every transaction in the binary logs.

Timestamp : timestamp of the event as logged in the binary log.
Table : database.tablename
Query Type : type of query executed on the table and the number of rows it affected.

[Transaction total : 3 Insert(s) : 1 Update(s) : 1 Delete(s) : 1]
The line above summarizes the transaction: the total number of rows affected, broken down into the rows affected by each type of query.
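A handy property of the script is that the awk program only needs text in the decoded format, so you can sanity-check it without a server by piping a canned, mysqlbinlog-style snippet through it. The fragment below (a minimal sketch; the input lines are fabricated to imitate the decoded output above, and the "+----+" separator line is dropped for brevity) exercises the INSERT and COMMIT branches:

```shell
# Feed a minimal fake decoded-binlog transaction through the same awk program
# and confirm the per-statement line and the transaction summary appear.
printf '%s\n' \
'#150112 21:40:14 server id 1 end_log_pos 415 CRC32 0x6b1f2240 Table_map: `test`.`t` mapped to number 251' \
'### INSERT INTO `test`.`t`' \
'# at 461' \
'COMMIT/*!*/;' |
awk 'BEGIN {s_type=""; s_count=0;count=0;insert_count=0;update_count=0;delete_count=0;flag=0;}
{if(match($0, /#15.*Table_map:.*mapped to number/)) {printf "Timestamp : " $1 " " $2 " Table : " $(NF-4); flag=1}
else if (match($0, /(### INSERT INTO .*..*)/)) {count=count+1;insert_count=insert_count+1;s_type="INSERT"; s_count=s_count+1;}
else if (match($0, /(### UPDATE .*..*)/)) {count=count+1;update_count=update_count+1;s_type="UPDATE"; s_count=s_count+1;}
else if (match($0, /(### DELETE FROM .*..*)/)) {count=count+1;delete_count=delete_count+1;s_type="DELETE"; s_count=s_count+1;}
else if (match($0, /^(# at) /) && flag==1 && s_count>0) {print " Query Type : "s_type " " s_count " row(s) affected" ;s_type=""; s_count=0; }
else if (match($0, /^(COMMIT)/)) {print "[Transaction total : " count " Insert(s) : " insert_count " Update(s) : " update_count " Delete(s) : " delete_count "]"; count=0;insert_count=0;update_count=0; delete_count=0;s_type=""; s_count=0; flag=0} }'
```

The output should read "Timestamp : #150112 21:40:14 Table : `test`.`t` Query Type : INSERT 1 row(s) affected" followed by "[Transaction total : 1 Insert(s) : 1 Update(s) : 0 Delete(s) : 0]".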

Let's execute some more queries in the sakila database and then summarize them using our script. We assume these queries were executed between "2015-01-16 13:30:00" and "2015-01-16 14:00:00".

use sakila;
mysql> update city set city="Acua++" where city_id=4;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql> delete from country where country_id=4;
Query OK, 1 row affected (0.04 sec)
mysql>  insert into country (country_id,country,last_update) values ('4','Angola',now());
Query OK, 1 row affected (0.03 sec)
mysql> create table payment_tmp select * from payment;
Query OK, 16049 rows affected (0.80 sec)
Records: 16049  Duplicates: 0  Warnings: 0
mysql> delete from payment_tmp where payment_id between 10000 and 15000;
Query OK, 5001 rows affected (0.08 sec)
mysql> update payment_tmp set amount=9.99 where payment_id between 1 and 7000;
Query OK, 6890 rows affected (0.08 sec)
Rows matched: 7000  Changed: 6890  Warnings: 0
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql> delete from country where country_id=5;
Query OK, 1 row affected (0.00 sec)
mysql> delete from city where country_id=5;
Query OK, 1 row affected (0.00 sec)
mysql> delete from address where city_id=300;
Query OK, 2 rows affected (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.01 sec)

We will save our script in a file named "summarize_binlogs.sh" and summarize the queries we just executed.

# vim summarize_binlogs.sh
#!/bin/bash
BINLOG_FILE="mysqld-bin.000035"
START_TIME="2015-01-16 13:30:00"
STOP_TIME="2015-01-16 14:00:00"
mysqlbinlog --base64-output=decode-rows -vv --start-datetime="${START_TIME}"  --stop-datetime="${STOP_TIME}" ${BINLOG_FILE} | awk \
'BEGIN {s_type=""; s_count=0;count=0;insert_count=0;update_count=0;delete_count=0;flag=0;} \
{if(match($0, /#15.*Table_map:.*mapped to number/)) {printf "Timestamp : " $1 " " $2 " Table : " $(NF-4); flag=1} \
else if (match($0, /(### INSERT INTO .*..*)/)) {count=count+1;insert_count=insert_count+1;s_type="INSERT"; s_count=s_count+1;}  \
else if (match($0, /(### UPDATE .*..*)/)) {count=count+1;update_count=update_count+1;s_type="UPDATE"; s_count=s_count+1;} \
else if (match($0, /(### DELETE FROM .*..*)/)) {count=count+1;delete_count=delete_count+1;s_type="DELETE"; s_count=s_count+1;}  \
else if (match($0, /^(# at) /) && flag==1 && s_count>0) {print " Query Type : "s_type " " s_count " row(s) affected" ;s_type=""; s_count=0; }  \
else if (match($0, /^(COMMIT)/)) {print "[Transaction total : " count " Insert(s) : " insert_count " Update(s) : " update_count " Delete(s) : " \
delete_count "] \n+----------------------+----------------------+----------------------+----------------------+"; \
count=0;insert_count=0;update_count=0; delete_count=0;s_type=""; s_count=0; flag=0} } '
:wq
# chmod u+x summarize_binlogs.sh

Now we run our script to get the summary of information logged in binary log.

# ./summarize_binlogs.sh
Timestamp : #150116 13:41:09 Table : `sakila`.`city` Query Type : UPDATE 1 row(s) affected
[Transaction total : 1 Insert(s) : 0 Update(s) : 1 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:41:59 Table : `sakila`.`country` Query Type : DELETE 1 row(s) affected
[Transaction total : 1 Insert(s) : 0 Update(s) : 0 Delete(s) : 1]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:07 Table : `sakila`.`country` Query Type : INSERT 1 row(s) affected
[Transaction total : 1 Insert(s) : 1 Update(s) : 0 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected
[Transaction total : 16049 Insert(s) : 16049 Update(s) : 0 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected
[Transaction total : 5001 Insert(s) : 0 Update(s) : 0 Delete(s) : 5001]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:28 Table : `sakila`.`payment_tmp` Query Type : UPDATE 6890 row(s) affected
[Transaction total : 6890 Insert(s) : 0 Update(s) : 6890 Delete(s) : 0]
+----------------------+----------------------+----------------------+----------------------+
Timestamp : #150116 13:42:42 Table : `sakila`.`country` Query Type : DELETE 1 row(s) affected
Timestamp : #150116 13:42:48 Table : `sakila`.`city` Query Type : DELETE 1 row(s) affected
Timestamp : #150116 13:42:53 Table : `sakila`.`address` Query Type : DELETE 2 row(s) affected
[Transaction total : 4 Insert(s) : 0 Update(s) : 0 Delete(s) : 4]
+----------------------+----------------------+----------------------+----------------------+

See how nicely our script has summarized the information logged in the binary log. For the last transaction, it shows three rows, one per statement executed within the transaction, followed by the transaction summary: the transaction affected 4 rows in total, all of them deletes.

This is how you can summarize MySQL row-based binary logs into a more readable format with a simple awk script, which will save you time whenever you need to decode binary logs for troubleshooting. Now I'll show how this summarized information can help answer some basic but important questions.

Q1 : Which tables received the highest number of insert/update/delete statements?

./summarize_binlogs.sh | grep Table |cut -d':' -f5| cut -d' ' -f2 | sort | uniq -c | sort -nr
      3 `sakila`.`payment_tmp`
      3 `sakila`.`country`
      2 `sakila`.`city`
      1 `sakila`.`address`

Q2 : Which table received the highest number of DELETE queries?

./summarize_binlogs.sh | grep -E 'DELETE' |cut -d':' -f5| cut -d' ' -f2 | sort | uniq -c | sort -nr
      2 `sakila`.`country`
      1 `sakila`.`payment_tmp`
      1 `sakila`.`city`
      1 `sakila`.`address`

Q3: How many insert/update/delete queries were executed against the sakila.country table?

./summarize_binlogs.sh | grep -i '`sakila`.`country`' | awk '{print $7 " " $11}' | sort -k1,2 | uniq -c
      2 `sakila`.`country` DELETE
      1 `sakila`.`country` INSERT
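The grep/cut pipelines above depend on exact field positions, so a single-pass awk aggregation can be more robust. This sketch (my addition, fed with canned summary lines rather than a live run) counts statements per table and query type in one go, using the same $7/$11 fields as Q3:

```shell
# Aggregate statements per (table, query type) in one pass over the
# summarize_binlogs.sh output. Input lines are canned samples of that output.
printf '%s\n' \
'Timestamp : #150116 13:41:09 Table : `sakila`.`city` Query Type : UPDATE 1 row(s) affected' \
'Timestamp : #150116 13:42:42 Table : `sakila`.`country` Query Type : DELETE 1 row(s) affected' \
'Timestamp : #150116 13:42:48 Table : `sakila`.`country` Query Type : DELETE 1 row(s) affected' |
awk '/Table :/ {stmt[$7 " " $11]++}         # $7 = table, $11 = query type
     END {for (k in stmt) print stmt[k], k}' | sort -nr
```

On a real binary log you would replace the printf with `./summarize_binlogs.sh`; on the canned input this prints "2 `sakila`.`country` DELETE" first, then "1 `sakila`.`city` UPDATE".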

Q4: Give me the top 3 statements that affected the maximum number of rows.

./summarize_binlogs.sh | grep Table | sort -nr -k 12 | head -n 3
Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected
Timestamp : #150116 13:42:28 Table : `sakila`.`payment_tmp` Query Type : UPDATE 6890 row(s) affected
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected

See how easy it is to identify the large transactions.

Q5 : Find DELETE queries that affected more than 1000 rows.

./summarize_binlogs.sh | grep -E 'DELETE' | awk '{if($12>1000) print $0}'
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected

If we want all queries that affected more than 1000 rows:

./summarize_binlogs.sh | grep -E 'Table' | awk '{if($12>1000) print $0}'
Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected
Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected
Timestamp : #150116 13:42:28 Table : `sakila`.`payment_tmp` Query Type : UPDATE 6890 row(s) affected
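Since field $12 holds the row count, the same idea extends to totals: summing $12 per table answers "which table had the most rows changed overall". Again a sketch over canned summary lines (my addition, not from the original post):

```shell
# Sum rows affected per table across all statements in the summary output.
printf '%s\n' \
'Timestamp : #150116 13:42:13 Table : `sakila`.`payment_tmp` Query Type : INSERT 16049 row(s) affected' \
'Timestamp : #150116 13:42:20 Table : `sakila`.`payment_tmp` Query Type : DELETE 5001 row(s) affected' \
'Timestamp : #150116 13:41:09 Table : `sakila`.`city` Query Type : UPDATE 1 row(s) affected' |
awk '/Table :/ {rows[$7] += $12}            # $7 = table, $12 = rows affected
     END {for (t in rows) print rows[t], t}' | sort -nr
```

On the canned input this prints "21050 `sakila`.`payment_tmp`" (16049 + 5001) ahead of "1 `sakila`.`city`".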

Conclusion

The information logged in binary logs is really helpful for understanding your write workload. We used mysqlbinlog + awk to present the binary log contents in a more readable format, and then used the summarized information to answer some important questions.



Published at DZone with permission of Peter Zaitsev, DZone MVB. See the original article here.
