Azure SQL Elastic Jobs
In this post, we take an in-depth look at how to make use of Azure's SQL Elastic Jobs feature. Read on for the details.
Join the DZone community and get the full member experience.Join For Free
I'm presenting a session this weekend at SQL Saturday Manchester on how to deal with lack of SQL agent when moving to Azure SQL. This session looks at options in Azure for running regular jobs against SQL databases and usually focuses on using Azure Automation and Azure functions. However, just a couple of weeks before the event Microsoft released a new service that could also be used to fill this gap, SQL Elastic Jobs, and so I needed to squeeze that into my presentation last minute. While preparing my demos for the event, I came across some gotchas, so that is what we are going to look at this week.
You may have heard of Elastic Jobs before; there was an old Azure Elastic Jobs service that has been around for a few years; however, this service wasn't terribly useful. It was mainly aimed at dealing with sharded databases and required you to run things on premises. This service is being replaced by this new service, which runs wholely in Azure.
What Azure Elastic Jobs allows you to do is to write your DB maintenance tasks in T-SQL and store them in a job agent, and then have the agent run your job across multiple Servers, Databases and elastic pools, including databases in other subscriptions. You can run jobs as a one-off or on a schedule.
Before we get into how it all works, there are some limitations to be aware of before you decide to use this service:
- It's currently in preview, having been announced a few weeks ago
- Documentation at this time is relatively limited. There are docs on how to do the necessary tasks, but if you want to do anything more advanced you have to figure it out yourself
- The preview is limited to running 100 concurrent jobs
- The administration is only through PowerShell or T-SQL. You can view jobs in the portal, but not interact with them
- Jobs run using the same credentials on all targets, so this account must exist in the target databases, with permissions to do the actions you require
Scale and Cost
Elastic jobs are based on the concept of a Job agent; this is just an Azure SQL database that you have transformed into a job agent. This means that the performance, scale, and cost of Elastic Jobs are entirely based on your choice of SQL database for your job agent. There is no additional cost for the service, just the DB cost.
The job agent DB must be at least an S0, basic SKU is not support, and S1 or above is recommended.
As mentioned, administration of this service is through PowerShell or T-SQL only. We are going to focus on using Powershell, as we can do everything with this if you use T-SQL. Some steps still have to be done in PowerShell.
Because this service is in preview the PowerShell cmdlets you need are not included in standard Azure Powershell; you need the preview version, and at the moment the preview version you need has a version number lower than the current GA version, which caused some issues with the instructions previously. To install this, you need to configure PowerShell to get pre-release version and then install the specific version we need. You need to run the following commands as an administrator.
Find-Package PackageManagement -RequiredVersion 220.127.116.11 | Install-Package -Force Find-Package PowerShellGet -RequiredVersion 1.6.5 | Install-Package -Force Install-Module -Name AzureRM.Sql -AllowPrerelease -RequiredVersion 4.8.1-preview -Force Import-Module AzureRM.Sql -RequiredVersion 4.8.1
Once that is done you should be able to run the command below and not get an error:
You also need to enable your subscription for the preview by running the following command:
Register-AzureRmProviderFeature -FeatureName sqldb-JobAccounts -ProviderNamespace Microsoft.Sql
It can take a while for the registration to complete, you can check on it by running this command:
Once the status comes back as "Registered", you are ready to go.
Elastic Job Setup
Now that we have the pre-requisites setup we can go ahead and start setting up our elastic jobs, this involves the following steps:
- Creating a new, blank database to uses as our Job Agent
- Creating the Elastic Job Agent
- Creating Credentials - The Elastic Job agent needs to have access to credentials to be able to access the databases it will act on. If you're referencing a server rather than a DB, then it needs credentials to be able to list the databases on the server as well
- Create a Target - The target groups together databases, server and elastic pools which you want to be able to run a job over.
- Create a Job - this is the container for the tasks we want to run and references the credentials that will be used
- Create tasks - these are the individual work items you will run over the database and contain your T-SQL commands
- Run or Schedule your job
The Microsoft documentation on completing these steps is pretty comprehensive, so I won't rehash it here. Follow these steps and then come back to look at some more advanced items
The jobs described in the Microsoft docs are all acting on the target to make changes. However, it is also possible to extract data from the targets. In the diagram above, this where the output database comes in. The output database is just an ordinary Azure SQL database, where the job agent can take any response from the target databases and write this to the output database.
In this example, we will look at running a query on all databases that gets the amount of fragmentation in each table and records this. The first thing we need to do is write our query that will get this data. We need to bear in mind that the results of this query will be written to the same table in output database for each target, so the query must have a unique reference to the target. In this example, we use the DB name.
SELECT DB_NAME() AS [Current Database], SYSDATETIME() as [QueryDate], dbschemas.[name] as 'Schema', dbtables.[name] as 'Table', dbindexes.[name] as 'Index', indexstats.avg_fragmentation_in_percent, indexstats.page_count FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS indexstats INNER JOIN sys.tables dbtables on dbtables.[object_id] = indexstats.[object_id] INNER JOIN sys.schemas dbschemas on dbtables.[schema_id] = dbschemas.[schema_id] INNER JOIN sys.indexes AS dbindexes ON dbindexes.[object_id] = indexstats.[object_id] AND indexstats.index_id = dbindexes.index_id WHERE indexstats.database_id = DB_ID() ORDER BY indexstats.avg_fragmentation_in_percent desc
Now we need to create our job; this is the same as the previous jobs we created
$jobAgent= Get-AzureRmSqlElasticJobAgent -ResourceGroupName "SQLAgentDemos" -ServerName "SQLAgentDemo" -Name "DemoJobAgent" $serverGroupName="ServerGroup1" $credentialName="jobuser" Write-Output "Creating a new job" $JobName = "GetFragmentation" $Job = $JobAgent | New-AzureRmSqlElasticJob -Name $JobName -RunOnce
Where this differs is when creating the job task. Here we need to specify some extra parameters to connect to the output database:
OutputDatabaseObject- This is the Azure SQL database that stores the output data. This command expects an actual Azure SQL Object, so you need to use get-AzureRMSQLDatabase command to get this
OutputCredentialName- The credentials to use to write to the output DB. This command expects an elastic job credential
OutputTableName- The name of the table to write the data to
OutputSchemaName- The DB schema the table is in
$SQLCommandString = @" SELECT DB_NAME() AS [Current Database], SYSDATETIME() as [QueryDate], dbschemas.[name] as 'Schema', dbtables.[name] as 'Table', dbindexes.[name] as 'Index', indexstats.avg_fragmentation_in_percent, indexstats.page_count FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS indexstats INNER JOIN sys.tables dbtables on dbtables.[object_id] = indexstats.[object_id] INNER JOIN sys.schemas dbschemas on dbtables.[schema_id] = dbschemas.[schema_id] INNER JOIN sys.indexes AS dbindexes ON dbindexes.[object_id] = indexstats.[object_id] AND indexstats.index_id = dbindexes.index_id WHERE indexstats.database_id = DB_ID() ORDER BY indexstats.avg_fragmentation_in_percent desc "@ $outputDatabase=Get-AzureRmSqlDatabase -DatabaseName "OutputDatabase" -ResourceGroupName "SQLAgentDemos" -ServerName "SQLAgentDemo" $Job | Add-AzureRmSqlElasticJobStep -Name "step1" -TargetGroupName $serverGroupName -CredentialName $credentialName -CommandText $SQLCommandString -OutputDatabaseObject $outputDatabase -OutputCredentialName $credentialName -OutputTableName "Fragmentation" -OutputSchemaName "dbo"
When we run this job, it queries all the targets and writes the data to the output DB. If the account you are running the job as has rights it can create the output table for you if it does not exist, this is the easiest option. If that is not possible, you can manually create the table to match the results of your query, but it must also include an additional column called
internal_execution_id of type
If your jobs fail, you can get some debug information from the job database. Using SQL management studio, you can connect to the database and open the
job_task_executions table. This table shows each task execution, and if there are errors, they will be listed in this table. You can also look at the job_executions table to get the overall status of a job.
When a job fails it will retry a number of times before it fails completely.
Elastic jobs is a useful addition to the toolset that can be used to replace the SQL agent in Azure. It's still in preview and has some issues and limitations at the moment. In some respects being limited to T-SQL does have some limitations over something like Azure automation, especially if you want to manipulate other resources alongside Azure SQL, but as a pure replacement for SQL agent, it does show promise.
Published at DZone with permission of Sam Cogan, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Introduction To Git
Scaling Site Reliability Engineering (SRE) Teams the Right Way
Replacing Apache Hive, Elasticsearch, and PostgreSQL With Apache Doris
Never Use Credentials in a CI/CD Pipeline Again