SQL Server AlwaysOn Availability Groups (commonly abbreviated as AGs) was first introduced in SQL Server 2012 Enterprise edition and has further been enhanced with the release of SQL Server 2014. It’s a high availability, disaster recovery, scale-out architecture solution that can yield great results but can be complex to manage.

From my own experience with deploying AGs on different clients I’ve realized that this technology has many moving parts: clustering, the network configuration, the SQL configuration, the DMVs and instrumentation available, etc. All these pieces contribute to how robust and reliable the solution is, so all of them must be given attention.

As part of making Availability Groups deployments easier for myself and to all our readers, I decided to compile this Cheat Sheet with all the references, advice and links that I have found useful for these projects. I’ll be keeping the list alive as well any time I run into some new material that I want to add to this Cheat Sheet.

Choice of Operating System

AGs can be deployed on Windows 2008 all the way to Windows 2012 R2. However, 2008 deployments are NOT recommended, definitely not in this day and age where Windows 2012 R2 is a mature alternative. If anyone needs convincing, you can see the official reference of the issues with hotfixes on 2008 and 2008 R2 that were fixed or don’t apply to 2012+. The list is right here.

Possible Configurations

Synchronous and Asynchronous, similar to Database Mirroring. Sync is inter-datacenter, async for any type of cross DC link.

SQL Server Enterprise Edition required.

SQL Server 2016 Standard will offer AGs with limited functionality.

Windows Clusters cannot span multiple domains, therefore AGs must live in one domain, regardless of geographic boundaries.

2012 Limits: 1 primary replica, 4 secondary replicas, 2 of the replicas can be synchronous.

2014 Limits: 1 primary replica, 8 secondary replicas, 2 of the replicas can be synchronous.

2016 Limits: 1 primary replica, 8 secondary replicas, 3 of the replicas can be synchronous.

MSDTC / distributed transactions between databases on an AG is NOT SUPPORTED on SQL Server 2012 and 2014. See details here.

MSDTC support is coming on SQL Server 2016.

You can mix failover cluster instances with standalone instances as well. See details here.

Catalog Information

The list of catalog views inside SQL Server for Availability Groups information can be found here.

DMV Performance information

This list of Dynamic Management Views info for Availability Groups and Functions can be found here.

Networking

Configure a dedicated NIC for best performance by isolating the AG traffic. Here’s how on this MSDN blog post.

There are DNS settings that should be tweaked for multi-subnet clusters, here’s a great guide on this MSDN blog post.

Gratuitous ARP requests must be forwarded for failover to work seamlessly, see this post for more details.

Listener and read only routing

Though it’s not commonly known, you can have multiple listeners for one availability group and configure them separately. Details here.

For scale-out of read request scenarios you can setup read-only routing of a replica through the listener. Details here.

Backups on AG clusters

You can do Full, file or filegroup copy-only backups or log backups on a SECONDARY. Differential backups are NOT possible.

To set this up transparently you need 3 things:

a) A network location that both nodes can see. This is so all your backups are in one place regardless of what node the backup happened, there’s nothing worse than trying to piece together a restore sequence when the backups are all over the place. Trust me, you’ll thank me when/if an emergency happens and you need to deal with this.

b) You need to set the backup preference setting on the AG.

c) You need to use this setting on your backup scripts to decide where to take the backup. There’s a handy function called sys.fn_hadr_backup_is_preferred_replica that you can use.

This page on Books Online has details on items b and c.

SQL Agent Jobs on AG clusters

Similarly to backups, with the possibility of replicas and read-only you might have some jobs that should run on the Primary or should only run on a Secondary or should only run on a read-only Secondary. The SAP on Microsoft team wrote a handy stored procedure that will look into the current role of the node running the job and then you can set the job to continue or quit depending on the outcome. Details and code on this blog post.

Index Maintenance on AG clusters

The AlwaysON support team recommends:

– Decreasing MAXDOP of index rebuilding to lighten up the load between the AGs.

– Setting up a routine that will only rebuild what is necessary. This is a recommendation for ANY SQL Server. Best option: Ola Hallengren’s awesome scripts.

– Changing the sync mode to asynchronous temporarily while the index maintenance is running to avoid affecting the log on the primary.

– Using partitioning for the larger tables so that partition level maintenance is possible.

– Avoid running heavy queries on the secondary while the index maintenance is happening on the primary.

Full details in this blog post.

Statistics on AG clusters

Since the queries on a secondary can use different column combinations than on the primary, SQL Server will create statistics objects on tempdb if necessary. If the AG fails over then all these temporary statistics are deleted and SQL Server starts using the permanent statistics that live inside the database.

Any maintenance work on stats done on the primary will carry over to the secondary automatically. There’s no command to maintain or refresh the temporary stats on the secondary, you can only drop them if you want.

Full details on the Books Online page.

Security on AG clusters

SQL Server Total Database Encryption (TDE) is supported with Availability Groups. However, there are extra steps on setting this up since you have to have the proper certificates deployed on all the nodes in the cluster. Here’s the step by step to set this up.

You will still need to copy – maintain the logins between the different SQL instances that are part of the AG. For that you can use any of the scripts that do the same job for Database Mirroring, here’s a good one, from the SQL Soldier, Robert Davis.

Performance Information

Availability groups use a thread pool approach (as opposed to all dedicated threads for Database Mirroring) with a max threads of SQL Server’s max worker threads minus 40.

There are also a small amount of unshared threads but they are released back to the general pool if the thread is idle.

Full breakdown of the numbers related to thread usage on this Books Online page.

Here’s a deep dive article on monitoring AG worker thread consumption from the SQL Server Premier Field Engineering blog.

Databases on read-only replicas use row versioning on tempdb for their read-only queries. So you will need a robust tempdb setup that will keep the row versions and you will have the 14-byte overhead on the rows.

Microsoft published this great document that outlines all the different possibilities and things to consider when using read-only replicas.

Other useful AGs knowledge

Monitoring performance of Availability Groups: here.

How to tell when your secondary will be ready to failover: here.

Very nice diagram of how the data replication works on Availability Groups from the SAP on Windows team: here.

If you have any other tip, trick or suggestion to add please let us know in the comments below!