CREATE EVENT SESSION [QueryTimeouts] ON SERVER ADD EVENT sqlserver.rpc_completed(SET collect_statement=(1) ACTION(sqlserver.database_name,sqlserver.username) WHERE ([result]=(2))), ADD EVENT sqlserver.sql_batch_completed( ACTION(sqlserver.database_name,sqlserver.username) WHERE ([result]=(2))) ADD TARGET package0.asynchronous_file_target (SET filename = 'G:\QueryTimeouts.xel', max_file_size = 5, max_rollover_files = 2) WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS)
The data can be extracted with following query:
DECLARE @filename NVARCHAR(max) SELECT @filename = REPLACE(CAST(sesf.value AS NVARCHAR(MAX)), '.xel', '') FROM sys.server_event_session_fields sesf JOIN sys.server_event_sessions ses ON ses.event_session_id = sesf.event_session_id WHERE sesf.name = 'filename' AND ses.name = 'QueryTimeouts' ;WITH cte AS (SELECT CONVERT(XML, event_data) ed FROM sys.fn_xe_file_target_read_file(@filename + '*.xel', NULL, NULL, NULL) ) SELECT DATEADD(mi, DATEDIFF(mi, GETUTCDATE(), GETDATE()), ed.value('(event/@timestamp)[1]', 'datetime2(3)')) AS [timestamp], ed.value('(event/action[@name="database_name"]/value)[1]', 'nvarchar(128)') AS [database_name], ed.value('(event/action[@name="username"]/value)[1]', 'nvarchar(128)') AS username, ed.value('(event/data[@name="duration"]/value)[1]', 'bigint')/1000 AS duration, ed.value('(event/data[@name="cpu_time"]/value)[1]', 'bigint')/1000 AS cpu_time, ed.value('(event/data[@name="logical_reads"]/value)[1]', 'bigint') AS logical_reads, ISNULL(ed.value('(event/data[@name="spills"]/value)[1]', 'bigint'),0) AS spills, ISNULL(ed.value('(event/data[@name="batch_text"]/value)[1]', 'nvarchar(max)'),ed.value('(event/data[@name="statement"]/value)[1]', 'nvarchar(max)')) AS text FROM cte ORDER BY timestamp;
There are few gotchas when using this event. First you want to filter out non application users like yourself. You'll be surprised how often you cancel your queries. Secondly, some applications are not written very well. They do select without TOP XX clause but cancel the query after receiving some number of rows. Better filter those out. Either in the event itself or when extracting data.
When analyzing the results I compare duration and CPU time. If they are similar then the query is probably too slow. If the times are like 30 seconds duration but 1 for CPU, then it was probably blocked by other query. You need to use other methods to identify the offending query. I also check spills. They can be slow, so fixing them may speed up the query. Note that spills are only reported for batches for some reason.
Wanted to say thanks - this set of queries was exactly what I needed to troubleshoot some Power BI Report Server issues we were seeing. We started to see timeout messages from some of our report refreshes but aren't seeing abnormal behavior in the actual queries that populate the reports, so we're using this approach and your extended event query to monitor and figure out what's going on.
ReplyDelete