There are many articles online describing SQL Server data compression. In this post I will summarize some of the salient points.

Feature Image / License

What is SQL Server Data Compression

SQL Server data compression has been around since SQL Server 2008. To give perspective, SQL Server 2008 and 2008 R2 reach end of life for extended support this July.

Basically we can compress tables and indexes i.e. the data structures in an RDBMS.

Here is the prosaic KB from Microsoft about SQL Server data compression.

Types of Compression

There are 2 different kinds of SQL Server data compression:

Row Level Compression

Page Level Compression

Row Level Compression

Row level compression is the more primitive type of compression. It is superceded by page level compression. This means we cannot have page level compression without already having row compression.

It looks to compress string data types e.g. varchar, char, nvarchar, nchar, text, etc. It can ignore zeros and null marks as well.



Page Level Compression

Page level compression builds off of row level compression and adds additional compression. This will save us more space than row level compression. Further explanation of the compression can be found here.



Before Compressing

A basic tenant of performance tweaking is to establish a baseline and evaluate changes over time. It is no different here – we want to see how much compression saved us.

Do not attempt to compress system tables!

There are some system views and DMVs that can help:

sys.master_files – this shows us each file in the database instance along with some details.

sys.database_files – a database level view of the database files on disk.

sys.dm_os_volume_stats – shows us basic info about the OS volume where the database files are stored.

Additionally you might want to consider testing some queries with the IO statistics turned on – giving you a measurement you can judge progress from.

SET STATISTICS IO ON SELECT * FROM Purchasing.PurchaseOrderDetail where OrderQty > 500; SET STATISTICS IO OFF

How Much Space Can We Save?



Fortunately this is simple to see. Microsoft includes a system procedure called sp_estimate_data_compression_savings. It returns the size of the object we ask about and estimates how much space can be saved by the type of compresssion we are considering to apply.

We can call this like:

exec sp_estimate_data_compression_savings @schema, @object, @index, @partition, @typeofcompression;

Some examples of calling it:

Pay attention to the column “size_with_requested_compression_setting_(KB)”. This is what we are looking for.

Why Compress?

There is always a bottleneck to any process. Some resource necessary for the computation will lag behind another. Typically the hierarchy looks like this:

CPU is orders of magnitude faster than disk!

Like encryption, the brunt of the work moves from the disk to the CPU. Since CPU is orders of magnitude faster than disk IOPS we realize a performance gain.

Enabling SQL Server Data Compression

Although this can be done in the GUI (SSMS) I prefer to use T-SQL. It allows for more granular control and the ability to automate later.

Microsoft reference KB about enabling compression

Here is an example of how to enable data compression on an object:

ALTER TABLE [Purchasing].[PurchaseOrderDetail] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);

Similar logic can be used for compressing an index.

Using iteration (while loops or cursor) we can iterate through database objects and compress / decompress in large numbers at a time.

Which Tables and Indexes are Compressed?

We can see what tables and indexes are compressed or not by using simple system catalog views:

SELECT DISTINCT s.[name] ,t.[name] ,i.[name] ,i.[type] ,i.index_id ,p.partition_number ,p.[rows] FROM sys.tables t LEFT OUTER JOIN sys.indexes i ON t.object_id = i.object_id INNER JOIN sys.schemas s ON t.schema_id = s.schema_id LEFT OUTER JOIN sys.partitions p ON i.index_id = p.index_id AND t.object_id = p.object_id WHERE t.type = 'U' AND p.data_compression_desc = 'NONE' ORDER BY p.rows desc

Disabling SQL Server Data Compression

Like the above example we follow similar logic:

ALTER TABLE [Purchasing].[PurchaseOrderDetail] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = NONE);

Here is the Microsoft KB explaining disabling compression.

By using some looping or batches with dynamic SQL we can do something like this snippet:

SELECT cs_obj_name = CHECKSUM(d.OBJ_NAME), * INTO #comp_tbls FROM( SELECT DISTINCT OBJ_NAME = OBJECT_NAME(i.[object_id]), i.[object_id], CASE WHEN i.[type]=0 /* Heap */ THEN ' ALTER TABLE '+ QUOTENAME(OBJECT_NAME(i.[object_id])) +' REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = NONE);' /* Index */ ELSE ' ALTER INDEX '+ QUOTENAME(i.name)+' ON '+QUOTENAME(OBJECT_NAME(i.[object_id]))+' REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = NONE);' END + CHAR(13) AS cmd FROM sys.partitions prt WITH(NOLOCK) INNER JOIN sys.indexes i WITH(NOLOCK) ON i.[object_id] = prt.[object_id] and prt.index_id = i.index_id --can also join a table with what you want to compress/decompress WHERE prt.[data_compression] <> 0

After Compression

Ok now we have compressed objects and see performance gains. That’s great! But what about the disk space that was freed up? Can we reclaim it?

Yes – shrinking the database will deallocate the disk space to the object and return it to the OS.

Shrinking databases should be used with caution! This is a legitimate example of when to use it.

What We Didn’t Cover

There is a lot to cover for SQL Server data compression. I did not attempt to cover it all but rather provide a crash course.

Some of the things we omitted:

Columnstore compression

COMPRESS function in SQL Server 2017

Backup compression

Replication and compression

If you liked this post then you might also like: SQL Server on VMware Best Practices – How to Optimize the Architecture

Did you find this helpful? Please subscribe!