Extended Events, the system_health Session, and Waits
Extended Events, the system_health Session, and Waits
Let's talk about a common situation that you're going to run into on your servers all the time and how you can put Extended Events to work to help you.
Join the DZone community and get the full member experience.Join For Free
RavenDB vs MongoDB: Which is Better? This White Paper compares the two leading NoSQL Document Databases on 9 features to find out which is the best solution for your next project.
I advocate for, use, document, teach, and just downright love, Extended Events. They are so much better than the old Trace Events (aka, Profiler) that it's sometimes difficult to keep from just gushing. Let's talk about a common situation that you're going to run into on your servers all the time and how you can put Extended Events to work to help you without actually doing any work at all.
What's that? Be lazy and get rewards? Yes.
The Extended Events system_health Session
On your servers, any of them that are SQL Server 2008 or newer, right now, unless you've performed actions to prevent this, you're running the Extended Events system_health session. It's just happening, currently, on all your servers. Nothing you need to do about it at all.
If you follow the link, you can see all the various types of information being gathered by the Extended Event system_health session. I won't detail all of it here, let me just provide a little context around how the session works. First and foremost, similar to the error log, this session consists of four files, each 5mb in size, rolling over as they get filled. For systems with a very high degree of activity, that means the information here may only be hours old. However, for most of us, this provides days, if not weeks worth of information about the behavior of your system.
You'll need to know where the files are located so that we can query them or open them up in the Live Data window. Here's a simple query that will give you the path on any system:
SELECT dosdlc.path FROM sys.dm_os_server_diagnostics_log_configurations AS dosdlc;
Now there are a ton of reasons why you should be taking advantage of the system_health session (deadlocks for example), but I'm going to focus on one, blocks.
Waits In The system_health Session
To see this information in action, we can set up a really simple query. Run this from two different query windows and let it sit for about 45 seconds or so (it only needs 30, but for our purposes, add a little padding):
BEGIN TRAN UPDATE dbo.ErrorLog SET ErrorTime = GETDATE() WHERE ErrorLogID = 1;
After a little while, roll back both scripts. Then, we can run this script to take a look at the system_health information specifically on the waits caused by these two queries blocking one another:
DECLARE @path NVARCHAR(260); --to retrieve the local path of system_health files SELECT @path = dosdlc.path FROM sys.dm_os_server_diagnostics_log_configurations AS dosdlc; SELECT @path = @path + N'system_health_*'; SELECT CAST(fx.event_data AS XML) AS Event_Data, fx.object_name FROM sys.fn_xe_file_target_read_file(@path, NULL, NULL, NULL) AS fx WHERE fx.object_name = 'wait_info';
The results on my system look like this:
<event name="wait_info" package="sqlos" timestamp="2018-06-01T12:04:18.915Z"> <data name="wait_type"> <value>5</value> <text>LCK_M_X</text> </data> <data name="opcode"> <value>1</value> <text>End</text> </data> <data name="duration"> <value>92115</value> </data> <data name="signal_duration"> <value>0</value> </data> <data name="wait_resource"> <value>0x0000002c2d3fb690</value> </data> <action name="sql_text" package="sqlserver"> <value>BEGIN TRAN UPDATE dbo.ErrorLog SET ErrorTime = GETDATE() WHERE ErrorLogID = 1; </value> </action> <action name="session_id" package="sqlserver"> <value>71</value> </action> <action name="callstack" package="package0"> <value>sqllang.dll+0x000000000053FF82 sqllang.dll+0x0000000000541241 sqllang.dll+0x0000000000540669 sqldk.dll+0x00000000001055E5 sqldk.dll+0x00000000000025E1 sqldk.dll+0x000000000000148F sqldk.dll+0x000000000000354B sqlmin.dll+0x0000000000067449 sqlmin.dll+0x000000000006720B sqlmin.dll+0x000000000000CF48 sqlmin.dll+0x000000000001391E sqlmin.dll+0x0000000000072EA6 sqlmin.dll+0x000000000002E7AF sqlmin.dll+0x000000000002E595 sqlmin.dll+0x0000000000031786 sqlTsEs.dll+0x0000000000001F17 sqlmin.dll+0x0000000000059010 sqlmin.dll+0x000000000003F6BF sqllang.dll+0x000000000001817B sqllang.dll+0x000000000002B86F sqllang.dll+0x000000000002B61F sqllang.dll+0x0000000000B8BFDB sqllang.dll+0x0000000000B7D7CB sqllang.dll+0x0000000000013A74</value> </action> </event>
You have the query, the wait_resource, the duration, and in short, everything you need to start identifying why you had excessive blocking on one of your servers, and all you had to do was reach out and grab it.
I'll leave parsing the XML to others (check out this book, just starting to read it).
Really, there was a query that ran unusually long yesterday? I wonder if it was blocked and waiting on resources? Let me take a quick look at the Extended Events system_health session. It really is that easy. This is a free, easily accessed resource that is available to you right now.
Published at DZone with permission of Grant Fritchey , DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.