SQL Prompt Code Analysis: INSERT INTO a Permanent Table With ORDER BY (PE020)
Let's look at a tutorial on how to use the Row_Number() window function in a SELECT query within the INSERTINTO statement instead of using the ORDERBY statement.
Join the DZone community and get the full member experience.
Join For FreeThe SQL query that is used to produce the result that is inserted into the permanent table has its order specified by an ORDERBY
statement. Relational tables are not ordered, so the ORDERBY
is meaningless. Use a Row_Number()
window clause instead if you need to impose a particular order on rows in the table via a query, especially if you need to allow subsequent insertions into the table. For temporary tables or table variables that will have no subsequent insertions, the current order can be maintained via an IDENTITY
field in the target table, but it is an anachronism now that there are better and more versatile ways of doing it.
SQL Prompt will raise a violation of a performance rule (PE020) when it detects use of INSERT INTO...ORDERBY
, when inserting into a permanent table.
It does so to catch a mistake that is easy to make, i.e. leaving in an ORDERBY
clause that is no longer needed. Before SQL Server 2012, these ORDERBY
clauses could be quite costly in terms of elapsed time and resources, but now they are generally ignored by the optimizer.
Meaningless ORDER BYs
Imagine that we have the query in Listing 1.
SELECT Coalesce(Person.Title + ' ', '') + Person.FirstName
+ Coalesce(' ' + Person.MiddleName + ' ', ' ') + Person.LastName
+ Coalesce(' ' + Person.Suffix, '') AS FullName
FROM AdventureWorks2016.Person.Person
ORDER BY Person.LastName, Person.FirstName;
Listing 1
The results are as follows:
Okay, we like this and the order it gives, so we attempt to insert the data into our work table, preserving the same order.
IF Object_Id('dbo.OurPeople1', 'U') IS NOT NULL DROP TABLE dbo.OurPeople1;
CREATE TABLE OurPeople1 (FullName VARCHAR(40) NOT NULL);
INSERT INTO OurPeople1 (FullName)
SELECT Coalesce(Person.Title + ' ', '') + Person.FirstName
+ Coalesce(' ' + Person.MiddleName + ' ', ' ') + Person.LastName
+ Coalesce(' ' + Person.Suffix, '') AS FullName
FROM AdventureWorks2016.Person.Person
ORDER BY Person.LastName, Person.FirstName;
Listing 2
Now we select from the table using:
SELECT OurPeople1.FullName FROM dbo.OurPeople1;
Listing 3
Of course, the contents of a table have no natural order, and so the use of ORDERBY
in Listing 2 is meaningless. If we had created a clustered index, on the FullName
column, to turn OurPeople1
into a table rather than a heap, then Listing 3, then we'd get a different order, which is the order of the clustered index.
In this example, the ORDERBY
clause in Listing 2 has been entirely ignored by the SQL Server. The only way to guarantee to return data in a specific order from our dbo.OurPeople1
table is to specify that order. If you want it ordered by Person.LastName
followed by Person.FirstName
, then first include the sort terms (LastName
and FirstName
) as columns in the work table, and then do:
SELECT OurPeople1.FullName FROM dbo.OurPeople1
ORDER BY OurPeople1.LastName, OurPeople1.FirstName;
Listing 4
You could add the PRIMARYKEY
field of the original data, i.e. add the BusinessEntity_ID
of the Person.Person
table, do a join, and then order by those missing fields as demonstrated in Listing 5.
IF Object_Id('dbo.OurPeople4', 'U') IS NOT NULL DROP TABLE dbo.OurPeople4;
CREATE TABLE dbo.OurPeople4 (BusinessEntityid INT NOT NULL, FullName VARCHAR(40) NOT NULL);
INSERT INTO dbo.OurPeople4 (BusinessEntityid, FullName)
SELECT Person.BusinessEntityID,
Coalesce(Person.Title + ' ', '') + Person.FirstName
+ Coalesce(' ' + Person.MiddleName + ' ', ' ') + Person.LastName
+ Coalesce(' ' + Person.Suffix, '')
FROM AdventureWorks2016.Person.Person;
/* and when you want an ordered list you do this ... */
SELECT OurPeople4.FullName
FROM dbo.OurPeople4
INNER JOIN AdventureWorks2016.Person.Person AS po
ON OurPeople4.BusinessEntityid = po.BusinessEntityID
ORDER BY po.LastName, po.FirstName;
Listing 5
Both these solutions preserve the order, whatever subsequent changes you make to the data.
The IDENTITY Trick
However, there is sometimes a reason why you can't reference the original data by which you did the ordering, and you find that you need to specify an order. It is possible to specify the order with a surrogate field of incrementing numbers, but you'll experience the downside to this solution if you add or amend the data so as to affect the sort order.
There was a time before the introduction of the Row_Number()
window function, that the only way that you could provide the order of the data in a work table was via the "IDENTITY
trick." You insert into a temporary table, to increment the IDENTITY
field in the order provided by the ORDERBY
clause.
IF Object_Id('dbo.OurPeople2', 'U') IS NOT NULL DROP TABLE dbo.OurPeople2;
CREATE TABLE dbo.OurPeople2 (FullName VARCHAR(40) NOT NULL, TheOrder INT NOT null);
DECLARE @People TABLE (TheIdentityField INT IDENTITY NOT NULL, FullName VARCHAR(40) NOT NULL, TheOrder AS TheIdentityField)
INSERT INTO @People (FullName)
SELECT Coalesce(Person.Title + ' ', '') + Person.FirstName
+ Coalesce(' ' + Person.MiddleName + ' ', ' ') + Person.LastName
+ Coalesce(' ' + Person.Suffix, '') AS FullName
FROM AdventureWorks2016.Person.Person
ORDER BY Person.LastName, Person.FirstName;
INSERT INTO dbo.OurPeople2 (FullName, TheOrder)
SELECT fullName, Theorder FROM @People
SELECT OurPeople2.FullName FROM dbo.OurPeople2 ORDER BY TheOrder
Listing 6
The results are:
That's a lot better, but why put the IDENTITY
field in a table variable? Why not just create it in the destination table? The problem is in doing insertions. The IDENTITY
field is immutable, so without this intermediate stage, you wouldn't be able to change the order, or to easily do any subsequent insertions other than to the beginning or end of your order.
Using the Row_Number() Window Function
Nowadays, we don't need any of that extra work: we don't use an ORDERBY
clause at all, and we can subsequently change the order if we need to. We just use the Row_Number()
window function.
IF Object_Id('dbo.OurPeople3', 'U') IS NOT NULL DROP TABLE dbo.OurPeople3;
CREATE TABLE dbo.OurPeople3 (FullName VARCHAR(40) NOT NULL, TheOrder INT NOT NULL);
INSERT INTO dbo.OurPeople3 (FullName, TheOrder)
SELECT Coalesce(Person.Title + ' ', '') + Person.FirstName
+ Coalesce(' ' + Person.MiddleName + ' ', ' ') + Person.LastName
+ Coalesce(' ' + Person.Suffix, ''),
Row_Number() OVER (ORDER BY Person.LastName, Person.FirstName)
FROM AdventureWorks2016.Person.Person;
SELECT OurPeople3.FullName FROM dbo.OurPeople3 ORDER BY OurPeople3.TheOrder;
Listing 7
Summary
Of all the code I've shown here, only Listing 2 triggers the PE020 warning. It only happens if you insert into a permanent table using a SELECT
statement with an ORDERBY
clause. It is simply advising you of an unnecessary ORDERBY
clause. If you are deliberately seeking to preserve a certain order on insertion, then use the Row_Number()
window function in that SELECT
query, within the INSERTINTO
statement instead of using the ORDERBY
statement.
Published at DZone with permission of Phil Factor. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments