How to Accelerate Application Performance With Smart SQL Queries
Thanks to this clause, you can instruct MySQL to run an Update statement in case the Insert statement goes wrong due to a possible duplicate key in the table.
Join the DZone community and get the full member experience.Join For Free
Why So Many Performance Issues Are Caused by the Database
We often forget that each request is not independent of other requests. If one request is slow, it’s unlikely to affect the others…right?
The database is a shared resource used by all processes that run in your application. Even just one poorly designed access can hurt the performance of the whole system.
In this article, I’ll talk about some smart approaches to SQL queries that have completely changed some parts of our system, improving application performance for a better customer experience.
If you deal with large datasets, import/export processes, data aggregation algorithms, and similar problems these solutions can help you drastically reduce the consumption of resources by your application and achieve significant economic savings.
INSERT on Duplicate Key UPDATE
insert on duplicate key update is one of the lesser-known MySQL clauses, but that guarantees incredible performance improvements in some specific cases, that can literally save your customer's experience.
Thanks to this clause, you can instruct MySQL to run a
UPDATE statement in case the
INSERT statement goes wrong due to a possible duplicate key in the table.
Let me show a real-world example.
Let’s imagine a process of importing a list of users from a CSV file, where each row needs to have a unique email address. Our script should insert new users and update a specific user if the email address already exists.
The first version of this script could be:
For each row, we verify if the user with the given email already exists in the database. If the user exists the script updates its name and then saves if the user does not exist the script creates a new instance of User and then proceeds with the insert.
In this example, we are using the Eloquent ORM to interact with the database and the “save()” method is smart enough to update a record if it does not exists, create it otherwise. In the end, we run a select to grab the user from the database, and another query to INSERT or UPDATE the record, so two queries for each line in the CSV file.
This means that for a CSV with 500.000 rows we need to run 1 million queries (500K select, 500K insert, or update).
Simplify the Code
Eloquent, as well as every other decent ORM, provides some shortcut to accomplish this kind of operation, so we can use
updateOrCreate method to reduce the number of lines for better readability:
The method has a really clear name and provides useful functionality, but this is not enough, because it contains the same issue: it runs two queries per CSV row.
Too many queries mean too much time, CPU, and memory usage. Instead, we aim to reduce the number of database statements to optimize performance and resource consumption by the script.
How to Use the “On Duplicate Key”
This clause is like a “try/catch” statement but for SQL. Here is a raw example:
It has a really simple behavior:
- Try to insert a record with the given information;
- If there are no errors, it performs the insert as usual;
- If the query fires a “duplicate key” error it proceeds with the second query provided;
Thanks to this clause we can move the “if” a statement from PHP to the database, halving the number of requests against the database itself.
Let’s Go Further
We can use this SQL statement also for bulk operations to obtain an exponential performance improvement. We can add multiple INSERT and use the VALUES function to reference the correct field like a variable inside a cycle.
We could import the entire CSV with one query…in theory.
In a real-life scenario, a query has a length limit and it may be more prudent not to do everything in one operation to avoid out-of-memory errors. We can chunk the CSV in subarrays of 1.000 items and run queries with 1.000 INSERT inside:
1.000 is just an example, based on the server resources you could increase or decrease this number.
The most important thing is that we have reduced the number of queries from 500.000 to 500.
Eloquent UPSERT Method
Eloquent ORM provides the
upsert method that implements this strategy for you under the hood.
The method’s first argument consists of the values to insert or update, while the second argument lists the column(s) that uniquely identify records within the associated table. The method’s third and final argument is an array of the columns that should be updated if a matching record already exists in the database.
To allow the method to do its job it requires the columns in the second argument of the upsert method to have a “primary” or “unique” index.
I hope that one or more of these tips can help you to create a more solid and scalable software product.
I have used Eloquent ORM to write code examples, but you can use this strategy with all major ORMs out there in the same way. As I often say, tools should help us to implement an efficient strategy. Strategic thinking is the key to give a long-term perspective to our products.
New to Inspector?
Create a monitoring environment specifically designed for software developers avoiding any server or infrastructure configuration that many developers hate to deal with.
Thanks to Inspector, you will never have the need to install things at the server level or make complex configurations in your cloud infrastructure.
Inspector works with a lightweight software library that you can install in your application like any other dependencies. In the case of Laravel, you have our official Laravel package at your disposal. Developers are not always comfortable installing and configuring software at the server level, because these installations are out of the software development lifecycle, or are even managed by external teams.
Visit our website for more details.
Published at DZone with permission of Valerio Barbera. See the original article here.
Opinions expressed by DZone contributors are their own.