Automating Automatic Indexing in Azure SQL Database
The goal here is actually really simple. I just want automatic indexing to fire. Here, I do *just enough* work to make that happen.
Join the DZone community and get the full member experience.
Join For FreeI’ve been in love with the concept of a database-as-a-service ever since I first laid eyes on Azure SQL Database. It just makes sense to me. Take away the mechanics of server management and database management. Focus on the guts of your database. Backups, consistency checks — these easily automated aspects can just be taken care of. The same thing goes with some, not all, but some, index management. Azure SQL Database can manage your indexes for you. I call it weaponizing Query Store.
Anyway, I needed a way to automate this for the book I’m writing. I couldn’t find any good examples online, so I built my own.
Queries in Need of Automatic Indexing
Because I want this to be as simple and repeatable as possible, I’m using the sample database that you can create when you create an Azure SQL Database. It’s basically a very tiny subset of AdventureWorks, simplified and reduced in size. With that as a starting point, you need some queries:
CREATE OR ALTER PROCEDURE dbo.CustomerInfo
(@Firstname NVARCHAR(50))
AS
SELECT c.FirstName,
c.LastName,
c.Title,
a.City
FROM SalesLT.Customer AS c
JOIN SalesLT.CustomerAddress AS ca
ON ca.CustomerID = c.CustomerID
JOIN SalesLT.Address AS a
ON a.AddressID = ca.AddressID
WHERE c.FirstName = @Firstname;
GO
CREATE INDEX TestDuplicate ON SalesLT.Customer (EmailAddress)
GO
CREATE OR ALTER PROCEDURE dbo.EmailInfo (@EmailAddress nvarchar(50))
AS
SELECT c.EmailAddress,
c.Title,
soh.OrderDate
FROM SalesLT.Customer AS c
JOIN SalesLT.SalesOrderHeader AS soh
ON soh.CustomerID = c.CustomerID
WHERE c.EmailAddress = @EmailAddress;
GO
CREATE OR ALTER PROCEDURE dbo.SalesInfo (@firstName NVARCHAR(50))
AS
SELECT c.FirstName,
c.LastName,
c.Title,
soh.OrderDate
FROM SalesLT.Customer AS c
JOIN SalesLT.SalesOrderHeader AS soh
ON soh.CustomerID = c.CustomerID
WHERE c.FirstName = @firstName
GO
CREATE OR ALTER PROCEDURE dbo.OddName (@FirstName NVARCHAR(50))
AS
SELECT c.FirstName
FROM SalesLT.Customer AS c
WHERE c.FirstName BETWEEN 'Brian'
AND @FirstName
GO
What you’ll notice is that several of the queries are filtering on the FirstName
column. There’s no good index there. If you look at the execution plans for those queries, you’ll also note the missing index suggestion. That suggestion is a necessary part of the automatic indexing. Yeah, missing indexes. I know. They’re not always accurate. It’s just a suggestion. Blah, blah, blah. I hear you.
The magic is not supplied by missing indexes. The magic is supplied by lots of data. Microsoft can take advantage of three things. Yes, missing index suggestions is first. Then, they can use the query metrics gathered in Query Store to see the behavior of your queries over time. Finally, they can use machine learning algorithms to determine if indexes will be helpful and measure how helpful they’ve been if one gets added. It’s great stuff. Go and read on it.
Automating Queries for Automatic Indexing
The key to getting your indexes automatically added though is time. You can’t run one query once, get a missing index, and Microsoft will add it for you. They’re not crazy or stupid. You need a lot of executions (exact number unknown to me). There has to be a lot of executions over time (approximate time based on testing, 12-18 hours). Finally, you also need more than one query (again, how many, I don’t know, but having only one or two did not cross the threshold, so I went for four, which did).
So, how did I run these procs for hours? Easy. Powershell:
$SqlConnection = New-Object System.Data.SqlClient.SqlConnection
$SqlConnection.ConnectionString = 'Server=qpf.database.windows.net;Database=QueryPerformanceTuning;trusted_connection=false;user=Grant;password=icanttellyou'
## load customer names
$DatCmd = New-Object System.Data.SqlClient.SqlCommand
$DatCmd.CommandText = "SELECT c.FirstName, c.EmailAddress
FROM SalesLT.Customer AS c;"
$DatCmd.Connection = $SqlConnection
$DatDataSet = New-Object System.Data.DataSet
$SqlAdapter = New-Object System.Data.SqlClient.SqlDataAdapter
$SqlAdapter.SelectCommand = $DatCmd
$SqlAdapter.Fill($DatDataSet)
$Proccmd = New-Object System.Data.SqlClient.SqlCommand
$Proccmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$Proccmd.CommandText = "dbo.CustomerInfo"
$Proccmd.Parameters.Add("@FirstName",[System.Data.SqlDbType]"varchar")
$Proccmd.Connection = $SqlConnection
$EmailCmd = New-Object System.Data.SqlClient.SqlCommand
$EmailCmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$EmailCmd.CommandText = "dbo.EmailInfo"
$EmailCmd.Parameters.Add("@EmailAddress",[System.Data.SqlDbType]"varchar")
$EmailCmd.Connection = $SqlConnection
$SalesCmd = New-Object System.Data.SqlClient.SqlCommand
$SalesCmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$SalesCmd.CommandText = "dbo.SalesInfo"
$SalesCmd.Parameters.Add("@FirstName",[System.Data.SqlDbType]"varchar")
$SalesCmd.Connection = $SqlConnection
$OddCmd = New-Object System.Data.SqlClient.SqlCommand
$OddCmd.CommandType = [System.Data.CommandType]'StoredProcedure'
$OddCmd.CommandText = "dbo.OddName"
$OddCmd.Parameters.Add("@FirstName",[System.Data.SqlDbType]"varchar")
$OddCmd.Connection = $SqlConnection
while(1 -ne 0)
{
foreach($row in $DatDataSet.Tables[0])
{
$name = $row[0]
$email = $row[1]
$SqlConnection.Open()
$Proccmd.Parameters["@FirstName"].Value = $name
$Proccmd.ExecuteNonQuery() | Out-Null
$EmailCmd.Parameters["@EmailAddress"].Value = $email
$EmailCmd.ExecuteNonQuery() | Out-Null
$SalesCmd.Parameters["@FirstName"].Value = $name
$SalesCmd.ExecuteNonQuery() | Out-Null
$OddCmd.Parameters["@FirstName"].Value = $name
$OddCmd.ExecuteNonQuery() | Out-Null
$SqlConnection.Close()
}
}
Create the procedures. Run this script for 12-18 hours, and you should get something that looks like this in your Recommendations blade:
Conclusion
The goal here is really simple. I want automatic indexing to fire. I’ve done just enough work to make that happen. Automatic indexing also has a drop option for duplicates. I haven’t yet figured out how to make that happen, but you get the start of that in these scripts, too.
Published at DZone with permission of Grant Fritchey, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments