AlwaysOn Availability Groups and SQL Server Jobs, Part 5: Setting Up Failover Alerts
Alerts represent a great way to take a proactive stance on being notified or alerted when something ugly, potentially ugly, or important happens on a SQL Server. On most servers that I manage, I typically deploy a whole bevy of different alerts.
September 9, 2014
In my previous post where I addressed considerations for what needs to be synchronized at the server (or instance) level (I mentioned the need to keep alerts synchronized across servers. I also intimated that, in my experience, SQL Server's ability to send alerts via the SQL Server Agent, are a very under-used feature.
Part 1: Introduction
Part 2: Putting AlwaysOn into Context
Part 3: Defining Batch Jobs
Part 4: Synchronizing Server-Level Details
In my mind, Alerts represent a great way to take a proactive stance on being notified or alerted when something ugly, potentially ugly, or important happens on a SQL Server. In fact, if I had any complaint about Alerts within the SQL Server Agent, it would be that while they're already pretty robust and powerful it's that they're a bit too heavily tied to WMI in some cases and that I'd like to see them much more heavily re-invented and re-energized in future versions of SQL Server and tied directly into Extended Events and/or be made more easily queryable or manageable via PowerShell.
But, I digress.
On most servers that I manage, I typically deploy a whole bevy of different alerts including (but not limited to) the following:
Severity 17-25 Alerts. Within the SQL Server Agent you can set up Alerts or specialized triggers/responses to specific Error Codes encountered by the SQL Server Engine (NOTE: while you CAN set up an alert for ANY SQL Server Error code (i.e., any error thrown by the SQL Server Engine), alerts will _NOT_ trigger unless the error code/message for this particular error is set to be LOGGED—see A Better Way to Enable Email Alerts for Deadlocks for more details). Another thing you can also do is set up Alerts for errors within a certain severity. Personally, I don't bother with Severity 15 or below—ever. On SOME servers I'll set up alerts for Severity 16 and above—depending upon how many false positives there are being thrown by developer code via THROW/RAISEERROR—as I don't ever want to get used to the idea of alerts showing up regularly in my inbox. On most servers, though, I highly recommend setting up alerts for Error Severities 17–25—as errors in these categories are almost always (with the occasional/semi-regular exception caused by client-connectivity hiccups due to Kerberos and SSPI contexts) something that should demand DBA attention or notification. To add these, you can simply expand the SQL Server Agent > Alerts node, right click and click on New Alert. This'll let you then specify a name you’d like to use to identify this alert, and then you can specify whether you want to alert on a particular/specific error number (I wish we could use/define ranges or multiples) OR whether you want to Alert on a specific Severity—say, 17, for example. Then, on the Response pane/tab you can define what you'd like to have happen in response when this error is encountered. Sending an email is the obvious choice (but it requires that you have correctly configured SQL Server's Database Mail and configured the SQL Server Agent to be able to USE this capability for alerting) but there's also the option to actually cause these alerts to run or execute jobs as well. The tedious part, though, is that you’ll need to specify or create a different Alert for each Severity—i.e., one for Severity 17 errors, one for Severity 18 errors, and so on—all the way up to 25. To that end, I've dropped a script into this post (down below) that you can use to set up Severity 17 – 25 alerts by simply replacing the parameters (with the name of the Operator) you wish to notify so that you can spin these up in a matter of a few seconds if you haven't already done so.
I/O and Storage Problem Alerts. Regular checks for corruption are a key concern that all DBAs should be addressing. But an additional and KEY component of dealing with corruption is early detection. As such, setting up alerts for I/O errors or problems is something that should be configured on EVERY SQL Server. (For more info, see my multi-part series on SQL Server Database Corruption.)
Replication Alerts. If you're using replication (which can be VERY tricky with AlwaysOn Availability Groups) you'll obviously want to make sure that any alerts you've defined to preemptively warn you of problems or issues are synchronized across servers.
Deadlocks. I do NOT set these up on every server—but in situations where the server is prone to deadlocks, setting these up can make a lot of sense.
Long Running Transactions. Somewhat similar to the above, I don't set these up on every server—but will set up alerts for long-running transactions in scenarios where I think there may be problems or where there are complaints of slow-downs and other locking/blocking issues that haven't already been resolved. Note too that these alerts are set up as JOBs and not actual SQL Server Agent Alerts. (I'm guessing that if I wanted to wrestle with WMI internals I could probably figure out a similar way to do this directly with SQL Server Agent Alerts. I just haven't ever had the time—or, honestly, the inclination—to really tackle that).
Disk space alerts. Also not something that's needed on every server (see Disk Space Monitoring: How To)—and also something that isn't implemented directly via SQL Server Agent Alerts (guessing there MIGHT be a way via WMI). But, as with the other alerts listed here, this is something that IF you determine you need it on your primary server, you'll WANT to make sure you synchronize to the other servers you'll be using as hosts for AlwaysOn Availability Groups. Otherwise, failover on to one of these other servers could mean that any of the alerts you've set up to pro-actively tell you about potential problems or issues would be lost.
AlwaysOn Availability Group Failover and Data Flow Alerts
Given that pro-actively being warned of important events or potential problems is a best practice, it's probably not much of a surprise that every organization that I've worked with to help setup AlwaysOn Availability Groups has wanted to be notified of WHEN there's a failover and when their workloads shift or shunt from one server to another. Likewise, being able to keep track on data flow rates and the overall health of the Servers and Availability Group is another thing that smart organizations want to keep an eye on (and which I'll address in a later post).
Happily, though, being alerted to issues with failover and/or data-flow problems with SQL Server Availability Groups is very trivial (sadly, this was NOT really the case with Database Mirroring). To pull this off all you need to do is set up 3 specific alerts—for Errors 1480, 35264, and 35265. As outlined above, you can create each one of these manually by expanding the SQL Server Agent > Alerts node, right-clicking and selecting the New Alert menu option, then dropping in details like the following:
And, then specifying which Operator to email on the Response tab:
And, then just repeating that process three times—i.e., one time for each error until you end up with something that looks similar to the following:
In my experience, though, an easier way to set up alerts is simply to execute a script. To that end, I've included the script I use to deploy these alerts on Servers when setting up AlwaysOn Availability Groups. To use this script, just replace the parameters (follow the link to the free SQL Server Video that outlines how these work if you’re not familiar) and then execute the script and you'll end up with 3 new Alerts added and bound to the Operator name that you specified:
/*INSTRUCTIONS:Replace Parameters (CTRL+SHIFT+M or Query > Specify Values for Templates)For more information on parameters (i.e., ) see the following:http://www.sqlservervideos.com/video/sql-server-2008-t-sql-enhancements/ PARAMETERS: - Name of the Operator/Alias to alert. - Bitmap of notification types/options: 1 = email, 2 = pager, 4 = netsend*/-- 1480 - AG Role Change (failover)EXEC msdb.dbo.sp_add_alert@name = N'AG Role Change',@message_id = 1480, @severity = 0, @enabled = 1, @delay_between_responses = 0, @include_event_description_in = 1;GOEXEC msdb.dbo.sp_add_notification @alert_name = N'AG Role Change', @operator_name = N'', @notification_method = ; GO-- 35264 - AG Data Movement - ResumedEXEC msdb.dbo.sp_add_alert@name = N'AG Data Movement - Suspended',@message_id = 35264, @severity = 0, @enabled = 1, @delay_between_responses = 0, @include_event_description_in = 1;GOEXEC msdb.dbo.sp_add_notification @alert_name = N'AG Data Movement - Suspended', @operator_name = N'', @notification_method = ; GO-- 35265 - AG Data Movement - ResumedEXEC msdb.dbo.sp_add_alert@name = N'AG Data Movement - Resumed',@message_id = 35265, @severity = 0, @enabled = 1, @delay_between_responses = 0, @include_event_description_in = 1;GOEXEC msdb.dbo.sp_add_notification @alert_name = N'AG Data Movement - Resumed', @operator_name = N'', @notification_method = ; GO
Bonus Script
And, for anyone who hasn't already set up Alerts for Severities 17–25, here's a bonus script that you can use to set those up—using the same process.
/*INSTRUCTIONS:Replace Parameters (CTRL+SHIFT+M or Query > Specify Values for Templates)For more information on parameters (i.e., ) see the following:http://www.sqlservervideos.com/video/sql-server-2008-t-sql-enhancements/ PARAMETERS: - Name of the Operator/Alias to alert. - Bitmap of notification types/options: 1 = email, 2 = pager, 4 = netsend*/EXEC msdb.dbo.sp_add_alert @name=N'Severity 017',@message_id=0,@severity=17,@enabled=1,@delay_between_responses=60,@include_event_description_in=1;GOEXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 017', @operator_name=N'', @notification_method = ;GOEXEC msdb.dbo.sp_add_alert @name=N'Severity 018',@message_id=0,@severity=18,@enabled=1,@delay_between_responses=60,@include_event_description_in=1;GOEXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 018', @operator_name=N'', @notification_method = ;GOEXEC msdb.dbo.sp_add_alert @name=N'Severity 019',@message_id=0,@severity=19,@enabled=1,@delay_between_responses=60,@include_event_description_in=1;GOEXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 019', @operator_name=N'',@notification_method = ;GOEXEC msdb.dbo.sp_add_alert @name=N'Severity 020',@message_id=0,@severity=20,@enabled=1,@delay_between_responses=60,@include_event_description_in=1;GOEXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 020', @operator_name=N'', @notification_method = ;GOEXEC msdb.dbo.sp_add_alert @name=N'Severity 021',@message_id=0,@severity=21,@enabled=1,@delay_between_responses=60,@include_event_description_in=1;GOEXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 021', @operator_name=N'', @notification_method = ;GOEXEC msdb.dbo.sp_add_alert @name=N'Severity 022',@message_id=0,@severity=22,@enabled=1,@delay_between_responses=60,@include_event_description_in=1;GOEXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 022', @operator_name=N'', @notification_method = ;GOEXEC msdb.dbo.sp_add_alert @name=N'Severity 023',@message_id=0,@severity=23,@enabled=1,@delay_between_responses=60,@include_event_description_in=1;GOEXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 023', @operator_name=N'', @notification_method = ;GOEXEC msdb.dbo.sp_add_alert @name=N'Severity 024',@message_id=0,@severity=24,@enabled=1,@delay_between_responses=60,@include_event_description_in=1;GOEXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 024', @operator_name=N'', @notification_method = ;GOEXEC msdb.dbo.sp_add_alert @name=N'Severity 025',@message_id=0,@severity=25,@enabled=1,@delay_between_responses=60,@include_event_description_in=1;GOEXEC msdb.dbo.sp_add_notification @alert_name=N'Severity 025', @operator_name=N'', @notification_method = ;GO
About the Author
You May Also Like