Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

system_health Extended Events in Azure SQL Database

DZone's Guide to

system_health Extended Events in Azure SQL Database

Explore system_health extended events in Azure SQL database.

· Database Zone ·
Free Resource

Read the 2019 State of Database DevOps Report for the very latest insights

The system_health Extended Events session is incredibly useful. Further, it's running, by default, in every server you have under management that is 2008 or greater. Things are not the same in Azure though.Image title

system_health in Azure SQL Database

If you look at the documentation for system_health, it shows that it's applicable to Azure SQL Database. However, if you try to run the example query, it won't work. This is because the implementation of Extended Events inside Azure SQL Database is a little different. Instead, you need to use the Azure SQL Database equivalent system views to create the same query like this:

SELECT Cast(dxdst.target_data AS XML) 
FROM   sys.dm_xe_database_session_targets AS dxdst 
       JOIN sys.dm_xe_database_sessions AS dxds 
         ON dxds.address = dxdst.event_session_address 
WHERE  dxds.NAME = 'system_health'; 

Now, running this in Azure, prepare to be disappointed. While the system_health documentation says it applies to Azure SQL Database, there is not a system_health session there.

Can we make one?

Recreating system_health in Azure SQL Database

Referring back to the documentation, the u_tables.sql file in the Install directory of your on-premises SQL Server instance has the script needed for recreating the system_health session.

Of course, we can't simply run this as is on Azure SQL Database because one of the two targets is to your LOG folder. No such thing in Azure. First thing I did was remove that target. Then, you can't use the 'SERVER' keyword in the commands, so I replaced that with 'DATABASE'. Running the script resulted in an error:

Msg 25744, Level 16, State 1, Line 20
The action 'package0.callstack' is not available for Azure SQL Database.

OK. Time for some quick research. So, the call stack is of a lot more interest to Microsoft when debugging than it is to me, so I'm going to strip this action and see what we get next. After a bunch of trials and errors, I ended up with this script:

IF EXISTS (   SELECT *               FROM sys.dm_xe_database_sessions AS dxds      
WHERE dxds.name = 'system_health')    
  DROP EVENT SESSION system_health ON SERVER; 
GO   
CREATE EVENT SESSION system_health ON DATABASE     
ADD EVENT sqlserver.error_reported     
(ACTION (sqlserver.session_id,              
         sqlserver.database_id,              
         sqlserver.sql_text,              
         sqlserver.tsql_stack)      
 -- Get callstack, SPID, and query for all high severity errors ( above sev 20 )      
 WHERE severity >= 20            -- Get callstack, SPID, and query for OOM errors 
 ( 17803 , 701 , 802 , 8645 , 8651 , 8657 , 8902 ),
 Hekaton checkpoint/merge errors (41354, 41355, 41367, 41384),
 Hekaton compilation related errors (41336, 41309, 41312, 41313)            
 OR (   ERROR_NUMBER = 17803  OR ERROR_NUMBER = 701        
     OR ERROR_NUMBER = 802   OR ERROR_NUMBER = 8645  
     OR ERROR_NUMBER = 8651  OR ERROR_NUMBER = 8657 
     OR ERROR_NUMBER = 8902   OR ERROR_NUMBER = 41354  
     OR ERROR_NUMBER = 41355   OR ERROR_NUMBER = 41367  
     OR ERROR_NUMBER = 41384   OR ERROR_NUMBER = 41336  
     OR ERROR_NUMBER = 41309   OR ERROR_NUMBER = 41312 OR ERROR_NUMBER = 41313)), 
--    ADD EVENT sqlserver.xml_deadlock_report,    
ADD EVENT sqlos.wait_info     (ACTION (sqlserver.session_id,             
                                       sqlserver.sql_text)      
                               WHERE (   duration > 15000               
                                      AND (   (   wait_type >= N'LATCH_NL' 
                                               -- Waits for latches and important wait resources (not locks ) that have exceeded 15 seconds.                            AND (   (   wait_type >= N'PAGELATCH_NL'                                        
                                               AND wait_type <= N'PAGELATCH_DT') --PAGELATCH_NL;PAGELATCH_KP;PAGELATCH_SH;PAGELATCH_UP;PAGELATCH_EX;PAGELATCH_DT                                    
                                               OR (wait_type <= N'LATCH_DT') --LATCH_NL;LATCH_KP;LATCH_SH;LATCH_UP;LATCH_EX;LATCH_DT                           
                                               OR (   wait_type >= N'PAGEIOLATCH_NL'                                           
                                                   AND wait_type <= N'PAGEIOLATCH_DT') --PAGEIOLATCH_NL;PAGEIOLATCH_KP;PAGEIOLATCH_SH;PAGEIOLATCH_UP;PAGEIOLATCH_EX;PAGEIOLATCH_DT                                    OR (   wait_type >= N'IO_COMPLETION'                                           AND wait_type <= N'NETWORK_IO') --IO_COMPLETION;ASYNC_IO_COMPLETION;NETWORK_IO                                   
                                                   OR (wait_type = N'RESOURCE_SEMAPHORE')                                    
                                                   OR (wait_type = N'SOS_WORKER')                                    
                                                   OR (   wait_type >= N'FCB_REPLICA_WRITE'                                          
                                                       AND wait_type <= N'WRITELOG') --FCB_REPLICA_WRITE;FCB_REPLICA_READ;WRITELOG                                    
                                                       OR (wait_type = N'CMEMTHREAD')                                    
                                                       OR (wait_type = N'TRACEWRITE')                                    
                                                       OR (wait_type = N'RESOURCE_SEMAPHORE_MUTEX')))                       
                                           OR (   duration > 30000 -- Waits for locks that have exceeded 30 secs.                               AND wait_type <= N'LCK_M_RX_X' -- all lock waits                        )))),     ADD EVENT sqlos.wait_info_external     (ACTION (sqlserver.session_id,              sqlserver.sql_text)      WHERE (   duration > 5000                AND (   ( -- Login related preemptive waits that have exceeded 5 seconds.                            (   wait_type >= N'PREEMPTIVE_OS_GENERICOPS'                                AND wait_type <= N'PREEMPTIVE_OS_ENCRYPTMESSAGE') --PREEMPTIVE_OS_GENERICOPS;PREEMPTIVE_OS_AUTHENTICATIONOPS;PREEMPTIVE_OS_ACCEPTSECURITYCONTEXT;PREEMPTIVE_OS_ACQUIRECREDENTIALSHANDLE;PREEMPTIVE_OS_COMPLETEAUTHTOKEN;PREEMPTIVE_OS_DECRYPTMESSAGE;PREEMPTIVE_OS_DELETESECURITYCONTEXT;PREEMPTIVE_OS_ENCRYPTMESSAGE                            OR (   wait_type >= N'PREEMPTIVE_OS_INITIALIZESECURITYCONTEXT'                                   AND wait_type <= N'PREEMPTIVE_OS_QUERYSECURITYCONTEXTTOKEN') --PREEMPTIVE_OS_INITIALIZESECURITYCONTEXT;PREEMPTIVE_OS_LOGONUSER;PREEMPTIVE_OS_QUERYSECURITYCONTEXTTOKEN                            OR (   wait_type >= N'PREEMPTIVE_OS_AUTHZGETINFORMATIONFROMCONTEXT'                                   AND wait_type <= N'PREEMPTIVE_OS_REVERTTOSELF') --PREEMPTIVE_OS_AUTHZGETINFORMATIONFROMCONTEXT;PREEMPTIVE_OS_AUTHZINITIALIZECONTEXTFROMSID;PREEMPTIVE_OS_AUTHZINITIALIZERESOURCEMANAGER;PREEMPTIVE_OS_LOOKUPACCOUNTSID;PREEMPTIVE_OS_REVERTTOSELF                            OR (   wait_type >= N'PREEMPTIVE_OS_CRYPTACQUIRECONTEXT'                                   AND wait_type <= N'PREEMPTIVE_OS_DEVICEOPS') --PREEMPTIVE_OS_CRYPTACQUIRECONTEXT;PREEMPTIVE_OS_CRYPTIMPORTKEY;PREEMPTIVE_OS_DEVICEOPS                            OR (   wait_type >= N'PREEMPTIVE_OS_NETGROUPGETUSERS'                                   AND wait_type <= N'PREEMPTIVE_OS_NETUSERMODALSGET') --PREEMPTIVE_OS_NETGROUPGETUSERS;PREEMPTIVE_OS_NETLOCALGROUPGETMEMBERS;PREEMPTIVE_OS_NETUSERGETGROUPS;PREEMPTIVE_OS_NETUSERGETLOCALGROUPS;PREEMPTIVE_OS_NETUSERMODALSGET                            OR (   wait_type >= N'PREEMPTIVE_OS_NETVALIDATEPASSWORDPOLICYFREE'                                   AND wait_type <= N'PREEMPTIVE_OS_DOMAINSERVICESOPS') --PREEMPTIVE_OS_NETVALIDATEPASSWORDPOLICYFREE;PREEMPTIVE_OS_DOMAINSERVICESOPS                            OR (wait_type = N'PREEMPTIVE_OS_VERIFYSIGNATURE'))                        OR (   duration > 45000 -- Preemptive OS waits that have exceeded 45 seconds.                               AND (   (   wait_type >= N'PREEMPTIVE_OS_SETNAMEDSECURITYINFO'                                           AND wait_type <= N'PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL') --PREEMPTIVE_OS_SETNAMEDSECURITYINFO;PREEMPTIVE_OS_CLUSTEROPS;PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL                                       OR (   wait_type >= N'PREEMPTIVE_OS_RSFXDEVICEOPS'                                              AND wait_type <= N'PREEMPTIVE_OS_DSGETDCNAME') --PREEMPTIVE_OS_RSFXDEVICEOPS;PREEMPTIVE_OS_DIRSVC_NETWORKOPS;PREEMPTIVE_OS_DSGETDCNAME                                       OR (   wait_type >= N'PREEMPTIVE_OS_DTCOPS'                                              AND wait_type <= N'PREEMPTIVE_DTC_ABORT') --PREEMPTIVE_OS_DTCOPS;PREEMPTIVE_DTC_ABORT                                       OR (   wait_type >= N'PREEMPTIVE_OS_CLOSEHANDLE'                                              AND wait_type <= N'PREEMPTIVE_OS_FINDFILE') --PREEMPTIVE_OS_CLOSEHANDLE;PREEMPTIVE_OS_COPYFILE;PREEMPTIVE_OS_CREATEDIRECTORY;PREEMPTIVE_OS_CREATEFILE;PREEMPTIVE_OS_DELETEFILE;PREEMPTIVE_OS_DEVICEIOCONTROL;PREEMPTIVE_OS_FINDFILE                                       OR (   wait_type >= N'PREEMPTIVE_OS_GETCOMPRESSEDFILESIZE'                                              AND wait_type <= N'PREEMPTIVE_ODBCOPS') --PREEMPTIVE_OS_GETCOMPRESSEDFILESIZE;PREEMPTIVE_OS_GETDISKFREESPACE;PREEMPTIVE_OS_GETFILEATTRIBUTES;PREEMPTIVE_OS_GETFILESIZE;PREEMPTIVE_OS_GETLONGPATHNAME;PREEMPTIVE_OS_GETVOLUMEPATHNAME;PREEMPTIVE_OS_GETVOLUMENAMEFORVOLUMEMOUNTPOINT;PREEMPTIVE_OS_MOVEFILE;PREEMPTIVE_OS_OPENDIRECTORY;PREEMPTIVE_OS_REMOVEDIRECTORY;PREEMPTIVE_OS_SETENDOFFILE;PREEMPTIVE_OS_SETFILEPOINTER;PREEMPTIVE_OS_SETFILEVALIDDATA;PREEMPTIVE_OS_WRITEFILE;PREEMPTIVE_OS_WRITEFILEGATHER;PREEMPTIVE_OS_LIBRARYOPS;PREEMPTIVE_OS_FREELIBRARY;PREEMPTIVE_OS_GETPROCADDRESS;PREEMPTIVE_OS_LOADLIBRARY;PREEMPTIVE_OS_MESSAGEQUEUEOPS;PREEMPTIVE_ODBCOPS                                       OR (   wait_type >= N'PREEMPTIVE_OS_DISCONNECTNAMEDPIPE'                                              AND wait_type <= N'PREEMPTIVE_CLOSEBACKUPMEDIA') --PREEMPTIVE_OS_DISCONNECTNAMEDPIPE;PREEMPTIVE_OS_PROCESSOPS;PREEMPTIVE_OS_SECURITYOPS;PREEMPTIVE_OS_SERVICEOPS;PREEMPTIVE_OS_SQLCLROPS;PREEMPTIVE_OS_WINSOCKOPS;PREEMPTIVE_OS_GETADDRINFO;PREEMPTIVE_OS_WSASETLASTERROR;PREEMPTIVE_OS_FORMATMESSAGE;PREEMPTIVE_OS_REPORTEVENT;PREEMPTIVE_OS_BACKUPREAD;PREEMPTIVE_OS_WAITFORSINGLEOBJECT;PREEMPTIVE_OS_QUERYREGISTRY;PREEMPTIVE_CLOSEBACKUPMEDIA                                       OR wait_type = N'PREEMPTIVE_OS_AUTHENTICATIONOPS'                                       OR wait_type = N'PREEMPTIVE_OS_FREECREDENTIALSHANDLE'                                       OR wait_type = N'PREEMPTIVE_OS_AUTHORIZATIONOPS'                                       OR wait_type = N'PREEMPTIVE_COM_COCREATEINSTANCE'                                       OR wait_type = N'PREEMPTIVE_OS_NETVALIDATEPASSWORDPOLICY'                                       OR wait_type = N'PREEMPTIVE_VSS_CREATESNAPSHOT')))))     /*    add target package0.event_file      -- Store events on disk (in the LOG folder of the instance)     (         set filename           = N'system_health.xel',             max_file_size      = 5, /* MB */             max_rollover_files = 4     ),*/     ADD TARGET package0.ring_buffer -- Store events in the ring buffer target     (SET MAX_MEMORY = 4096, max_events_limit = 5000) WITH (MAX_DISPATCH_LATENCY = 120 SECONDS,       STARTUP_STATE = ON); GO   IF NOT EXISTS (   SELECT *                   FROM sys.dm_xe_database_sessions AS dxds                   WHERE dxds.name = 'system_health')     ALTER EVENT SESSION system_health ON DATABASE STATE = START; GO

That gets me a bunch of the interesting stuff from system_health, but it is missing one thing that I left in the script, just commented out; xml_deadlock_report. That sucks just a little. Although, there is already a way to capture deadlocks in Azure SQL Database, so it's not necessarily needed in your new, fake, system_health session.

Conclusion

The short answer to, "Is there system_health in Azure SQL Database" question is no. At this time, that does not exist. However, can we at least simulate it with a little work? Yeah, but I did not spend a lot of time investigating what all the events that I eliminated (almost all related to buffers of one sort or another) are causing me to lose. What I did get, though, are the long-running queries, waits, errors, and other stuff on display within the system_health extended events session. It's not everything, but it's a start.

If you want to learn more about how to use Extended Events and other tools built into SQL Server that will help you identify and tune queries easier and faster, I have some all-day seminars coming up on the topic:

SQL Day, May 13-15, Wroclaw, Poland
SQLSaturday Columbus Precon, June 7, Columbus OH

Read the 2019 State of Database DevOps Report for latest insights into DevOps adoption among SQL Server professionals, and the benefits and challenges of including the database in DevOps initiatives

Topics:
database ,performance ,azure sql database ,tutorial ,extended events

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}