Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Automating Automatic Indexing in Azure SQL Database

DZone's Guide to

Automating Automatic Indexing in Azure SQL Database

The goal here is actually really simple. I just want automatic indexing to fire. Here, I do *just enough* work to make that happen.

· Database Zone ·
Free Resource

Discover Tarantool's unique features which include powerful stored procedures, SQL support, smart cache, and the speed of 1 million ACID transactions on a single CPU core!

I’ve been in love with the concept of a database-as-a-service ever since I first laid eyes on Azure SQL Database. It just makes sense to me. Take away the mechanics of server management and database management. Focus on the guts of your database. Backups, consistency checks — these easily automated aspects can just be taken care of. The same thing goes with some, not all, but some, index management. Azure SQL Database can manage your indexes for you. I call it weaponizing Query Store.

Anyway, I needed a way to automate this for the book I’m writing. I couldn’t find any good examples online, so I built my own.

Queries in Need of Automatic Indexing

Because I want this to be as simple and repeatable as possible, I’m using the sample database that you can create when you create an Azure SQL Database. It’s basically a very tiny subset of AdventureWorks, simplified and reduced in size. With that as a starting point, you need some queries:

CREATE OR ALTER PROCEDURE dbo.CustomerInfo
(@Firstname NVARCHAR(50))
AS
SELECT c.FirstName,
       c.LastName,
       c.Title,
       a.City
FROM SalesLT.Customer AS c
    JOIN SalesLT.CustomerAddress AS ca
        ON ca.CustomerID = c.CustomerID
    JOIN SalesLT.Address AS a
        ON a.AddressID = ca.AddressID
WHERE c.FirstName = @Firstname;
GO
 
 
CREATE INDEX TestDuplicate ON SalesLT.Customer (EmailAddress)
GO
 
CREATE OR ALTER PROCEDURE dbo.EmailInfo (@EmailAddress nvarchar(50))
AS
SELECT c.EmailAddress,
       c.Title,
       soh.OrderDate
FROM SalesLT.Customer AS c
    JOIN SalesLT.SalesOrderHeader AS soh
        ON soh.CustomerID = c.CustomerID
WHERE c.EmailAddress = @EmailAddress;
GO
 
CREATE OR ALTER PROCEDURE dbo.SalesInfo (@firstName NVARCHAR(50))
AS
SELECT c.FirstName,
       c.LastName,
       c.Title,
       soh.OrderDate
FROM SalesLT.Customer AS c
    JOIN SalesLT.SalesOrderHeader AS soh
        ON soh.CustomerID = c.CustomerID
WHERE c.FirstName = @firstName
GO
 
 
CREATE OR ALTER PROCEDURE dbo.OddName (@FirstName NVARCHAR(50))
AS
SELECT c.FirstName
FROM SalesLT.Customer AS c
WHERE c.FirstName BETWEEN 'Brian'
                  AND     @FirstName
GO

What you’ll notice is that several of the queries are filtering on the FirstName column. There’s no good index there. If you look at the execution plans for those queries, you’ll also note the missing index suggestion. That suggestion is a necessary part of the automatic indexing. Yeah, missing indexes. I know. They’re not always accurate. It’s just a suggestion. Blah, blah, blah. I hear you.

The magic is not supplied by missing indexes. The magic is supplied by lots of data. Microsoft can take advantage of three things. Yes, missing index suggestions is first. Then, they can use the query metrics gathered in Query Store to see the behavior of your queries over time. Finally, they can use machine learning algorithms to determine if indexes will be helpful and measure how helpful they’ve been if one gets added. It’s great stuff. Go and read on it.

Automating Queries for Automatic Indexing

The key to getting your indexes automatically added though is time. You can’t run one query once, get a missing index, and Microsoft will add it for you. They’re not crazy or stupid. You need a lot of executions (exact number unknown to me). There has to be a lot of executions over time (approximate time based on testing, 12-18 hours). Finally, you also need more than one query (again, how many, I don’t know, but having only one or two did not cross the threshold, so I went for four, which did).

So, how did I run these procs for hours? Easy. Powershell:

$SqlConnection = New-Object System.Data.SqlClient.SqlConnection
$SqlConnection.ConnectionString = 'Server=qpf.database.windows.net;Database=QueryPerformanceTuning;trusted_connection=false;user=Grant;password=icanttellyou'
 
## load customer names
$DatCmd = New-Object System.Data.SqlClient.SqlCommand
$DatCmd.CommandText = "SELECT c.FirstName, c.EmailAddress
FROM SalesLT.Customer AS c;"
$DatCmd.Connection = $SqlConnection
$DatDataSet = New-Object System.Data.DataSet
$SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter
$SqlAdapter.SelectCommand = $DatCmd
$SqlAdapter.Fill($DatDataSet)
 
 
$Proccmd = New-Object System.Data.SqlClient.SqlCommand
$Proccmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$Proccmd.CommandText = "dbo.CustomerInfo"
$Proccmd.Parameters.Add("@FirstName",[System.Data.SqlDbType]"varchar")
$Proccmd.Connection = $SqlConnection
 
$EmailCmd = New-Object System.Data.SqlClient.SqlCommand
$EmailCmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$EmailCmd.CommandText = "dbo.EmailInfo"
$EmailCmd.Parameters.Add("@EmailAddress",[System.Data.SqlDbType]"varchar")
$EmailCmd.Connection = $SqlConnection
 
$SalesCmd = New-Object System.Data.SqlClient.SqlCommand
$SalesCmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$SalesCmd.CommandText = "dbo.SalesInfo"
$SalesCmd.Parameters.Add("@FirstName",[System.Data.SqlDbType]"varchar")
$SalesCmd.Connection = $SqlConnection
 
$OddCmd = New-Object System.Data.SqlClient.SqlCommand
$OddCmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$OddCmd.CommandText = "dbo.OddName"
$OddCmd.Parameters.Add("@FirstName",[System.Data.SqlDbType]"varchar")
$OddCmd.Connection = $SqlConnection
 
 
 
while(1 -ne 0)
{
    foreach($row in $DatDataSet.Tables[0])
        {
        
        $name = $row[0]
        $email = $row[1]
        $SqlConnection.Open()
        $Proccmd.Parameters["@FirstName"].Value = $name
        $Proccmd.ExecuteNonQuery() | Out-Null
        $EmailCmd.Parameters["@EmailAddress"].Value = $email
        $EmailCmd.ExecuteNonQuery() | Out-Null
        $SalesCmd.Parameters["@FirstName"].Value = $name
        $SalesCmd.ExecuteNonQuery() | Out-Null
        $OddCmd.Parameters["@FirstName"].Value = $name
        $OddCmd.ExecuteNonQuery() | Out-Null
        $SqlConnection.Close()
 
 }
 }

Create the procedures. Run this script for 12-18 hours, and you should get something that looks like this in your Recommendations blade:

Conclusion

The goal here is really simple. I want automatic indexing to fire. I’ve done just enough work to make that happen. Automatic indexing also has a drop option for duplicates. I haven’t yet figured out how to make that happen, but you get the start of that in these scripts, too.

Discover Tarantool's unique features such as powerful stored procedures, SQL support, smart cache, and the speed of 1 million ACID transactions on a single CPU.

Topics:
database ,sql ,azure sql ,tutorial ,automation ,indexing

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}