Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

MySQL Auto Incremented Tables and Upserts

DZone's Guide to

MySQL Auto Incremented Tables and Upserts

Sometimes, performing an upsert on MySQL tables will waste auto increment values. Take a look at how to prevent this issue by avoiding two types of queries.

· Performance Zone
Free Resource

According to best practices, when modeling a MySQL relation that has a UUID as its id, you should still create a hidden numeric auto_increment field for the primary key. Doing so has all sorts of performance advantages (see this post from the founder of Percona or these interesting benchmarks).

However, if you're used to writing upsert queries with MySQL's ON DUPLICATE KEY UPDATE... syntax, you'll soon find that doing so is a recipe for wasting auto increment values.

For example, consider the following create table statement:

CREATE TABLE contrived_examples 
  ( 
     hidden_id  BIGINT auto_increment NOT NULL, 
     id         BINARY(16) NOT NULL UNIQUE, 
     updateable INT NOT NULL, 
     PRIMARY KEY (hidden_id) 
  ) 
engine=innodb 
DEFAULT charset=utf8mb4; 

Performing an upsert on the table will consume an auto increment value each time, regardless of whether the query only updates a row.

INSERT INTO contrived_examples 
            ( 
                        id, 
                        updateable 
            ) 
            VALUES 
            ( 
                        0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, 
                        42 
            ) 
on duplicate KEY 
UPDATE updateable=VALUES 
       ( 
              updateable 
       );

In the insert case, a record with the specific UUID doesn't exist, and the query auto increments hidden_id as you would expect. However, when you trigger the duplicate key part of the query during an update, an auto increment value is also consumed because MySQL first tries to perform the update (consuming an id), detects the duplicate, and then performs the update.

This is a simple oversight that is easy to make, but left unchecked, can waste much of your id space. It should also be noted that the same problem can be experienced when using INSERT IGNORE.... To avoid either type of query, you should do the following:

  1. Try to insert the record (INSERT INTO contrived_examples (id, updateable) VALUES (0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, 42)).
  2. Check/catch the case that no rows were inserted.
  3. If no rows were inserted, update the existing row (UPDATE contrived_examples SET updateable=42 WHERE id=0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa).

This issue is why I spent time changing all our fancy upserts into separate update and insert queries shortly after our DevOps Monitoring Platform, Blue Matador, was released. We monitor servers and applications all day every day, and it didn't take long for us to notice a very large gap in the ids that were being generated.

Hopefully, this post helps you avoid the same level of tedium.

Topics:
performance ,mysql

Published at DZone with permission of Mark Siebert, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}