I’ve already blogged about my dislike for ORMs from a production DBA performance tuning standpoint only. I get that they’re useful to developers. I get it. But I’m focused on performance.

A quick recap of what I don’t like about ORMs from that other blog post:

Implicit conversions due to nvarchar variables vs varchar columns

Queries grab every single column, regardless if they are needed or not Expensive key lookups Very wide missing index recommendations

The resulting query is hard to read

Though I did mention that they grab every single column, I didn’t mention that this can cause large memory grants which can lead to RESOURCE_SEMAPHORE waits.

RESOURCE_SEMAPHORE waits occur when SQL Server runs out of available memory to run queries.

ORM-style query: Selecting all the columns

Let’s look at an example.

SELECT Posts.Id, Posts.AcceptedAnswerId, Posts.AnswerCount, Posts.Body, Posts.ClosedDate, Posts.CommentCount, Posts.CommunityOwnedDate, Posts.CreationDate, Posts.FavoriteCount, Posts.LastActivityDate, Posts.LastEditDate, Posts.LastEditorDisplayName, Posts.LastEditorUserId, Posts.OwnerUserId, Posts.ParentId, Posts.PostTypeId, Posts.Score, Posts.Tags, Posts.Title, Posts.ViewCount, Users.DisplayName, Users.Location, Users.AboutMe, Comments.Id, Comments.CreationDate, Comments.PostId, Comments.Score, Comments.Text, Comments.UserId FROM dbo.Posts JOIN dbo.PostLinks ON Posts.Id = PostLinks.PostId JOIN dbo.Users ON Posts.OwnerUserId = Users.Id LEFT JOIN dbo.Comments ON Posts.Id = Comments.PostId WHERE Posts.CreationDate >= '2012-01-01' AND Posts.CreationDate < '2012-04-01' AND Users.Reputation >= 20000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 SELECT Posts . Id , Posts . AcceptedAnswerId , Posts . AnswerCount , Posts . Body , Posts . ClosedDate , Posts . CommentCount , Posts . CommunityOwnedDate , Posts . CreationDate , Posts . FavoriteCount , Posts . LastActivityDate , Posts . LastEditDate , Posts . LastEditorDisplayName , Posts . LastEditorUserId , Posts . OwnerUserId , Posts . ParentId , Posts . PostTypeId , Posts . Score , Posts . Tags , Posts . Title , Posts . ViewCount , Users . DisplayName , Users . Location , Users . AboutMe , Comments . Id , Comments . CreationDate , Comments . PostId , Comments . Score , Comments . Text , Comments . UserId FROM dbo . Posts JOIN dbo . PostLinks ON Posts . Id = PostLinks . PostId JOIN dbo . Users ON Posts . OwnerUserId = Users . Id LEFT JOIN dbo . Comments ON Posts . Id = Comments . PostId WHERE Posts . CreationDate >= '2012-01-01' AND Posts . CreationDate < '2012-04-01' AND Users . Reputation >= 20000

This query grabs all columns from Posts, PostLinks, Users and Comments. In the normal Stack Overflow database without any nonclustered indexes, this query wants an almost 10GB memory grant just to run.

That’s not fair, though – your database indexes are surely very finely tuned – so we’ll give SQL Server a fighting chance with a few helpful indexes to reduce its workload.

CREATE INDEX ix_CreationDate_OwnerUserId ON dbo.Posts (CreationDate, OwnerUserId); CREATE INDEX ix_PostId ON dbo.PostLinks (PostId); CREATE INDEX ix_PostId ON dbo.Comments (PostId); CREATE INDEX ix_Reputation ON dbo.Users (Reputation); 1 2 3 4 CREATE INDEX ix_CreationDate_OwnerUserId ON dbo . Posts ( CreationDate , OwnerUserId ) ; CREATE INDEX ix_PostId ON dbo . PostLinks ( PostId ) ; CREATE INDEX ix_PostId ON dbo . Comments ( PostId ) ; CREATE INDEX ix_Reputation ON dbo . Users ( Reputation ) ;

However, even with supporting indexes, the query still gets a 325MB memory grant on my server. I’ve certainly seen worse, but it’s enough to simulate the issue. (Brent says: note that the query doesn’t even have an ORDER BY in it, which is my favorite way to get giant memory grants – it would be even worse!)

Nobody notices a 325MB memory grant when it runs by itself – so to show concurrency, I ran the query through SQLQueryStress using 10 iterations and 50 threads.

After it had been running for a few seconds, I checked sp_BlitzFirst.

EXEC sp_BlitzFirst @ExpertMode = 1, @Seconds = 30 1 EXEC sp_BlitzFirst @ ExpertMode = 1 , @ Seconds = 30

Look at all those RESOURCE_SEMAPHORE waits.

Some queries are able to run, but many are waiting on memory to become available so that they can START.

This was a 30-second sample, but it too was struggling to run so we got a smaller sample in the Wait Stats section of the sp_BlitzFirst output.

Notice that the average waiting time on RESOURCE_SEMAPHORE is 8784.2 milliseconds. That’s 8.7 seconds. That’s a long time to wait to even start running the query.

Minus the Posts.Body column

Let’s look at the same query but without the Posts.Body column, which is nvarchar(max).

SELECT Posts.Id, Posts.AcceptedAnswerId, Posts.AnswerCount, Posts.ClosedDate, Posts.CommentCount, Posts.CommunityOwnedDate, Posts.CreationDate, Posts.FavoriteCount, Posts.LastActivityDate, Posts.LastEditDate, Posts.LastEditorDisplayName, Posts.LastEditorUserId, Posts.OwnerUserId, Posts.ParentId, Posts.PostTypeId, Posts.Score, Posts.Tags, Posts.Title, Posts.ViewCount, Users.DisplayName, Users.Location, Users.AboutMe, Comments.Id, Comments.CreationDate, Comments.PostId, Comments.Score, Comments.Text, Comments.UserId FROM dbo.Posts JOIN dbo.PostLinks ON Posts.Id = PostLinks.PostId JOIN dbo.Users ON Posts.OwnerUserId = Users.Id LEFT JOIN dbo.Comments ON Posts.Id = Comments.PostId WHERE Posts.CreationDate >= '2012-01-01' AND Posts.CreationDate < '2012-04-01' AND Users.Reputation >= 20000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 SELECT Posts . Id , Posts . AcceptedAnswerId , Posts . AnswerCount , Posts . ClosedDate , Posts . CommentCount , Posts . CommunityOwnedDate , Posts . CreationDate , Posts . FavoriteCount , Posts . LastActivityDate , Posts . LastEditDate , Posts . LastEditorDisplayName , Posts . LastEditorUserId , Posts . OwnerUserId , Posts . ParentId , Posts . PostTypeId , Posts . Score , Posts . Tags , Posts . Title , Posts . ViewCount , Users . DisplayName , Users . Location , Users . AboutMe , Comments . Id , Comments . CreationDate , Comments . PostId , Comments . Score , Comments . Text , Comments . UserId FROM dbo . Posts JOIN dbo . PostLinks ON Posts . Id = PostLinks . PostId JOIN dbo . Users ON Posts . OwnerUserId = Users . Id LEFT JOIN dbo . Comments ON Posts . Id = Comments . PostId WHERE Posts . CreationDate >= '2012-01-01' AND Posts . CreationDate < '2012-04-01' AND Users . Reputation >= 20000

This one gets a 234MB memory grant. That’s 91MB less than the query that includes Posts.Body.

Repeat the test minus the Posts.Body column.

EXEC sp_BlitzFirst @ExpertMode = 1, @Seconds = 30 1 EXEC sp_BlitzFirst @ ExpertMode = 1 , @ Seconds = 30

Still seeing RESOURCE_SEMAPHORE waits, but the average waiting time has dropped to 4.5 seconds. Queries don’t have to wait as long in order to get the memory they need to run.

Minus all big columns

Now let’s take a look at the query without the big columns: Posts.Body, Users.AboutMe and Comments.Text columns. Posts.Body and Users.AboutMe are nvarchar(max); Comments.Text is nvarchar(700).

SELECT Posts.Id, Posts.AcceptedAnswerId, Posts.AnswerCount, Posts.ClosedDate, Posts.CommentCount, Posts.CommunityOwnedDate, Posts.CreationDate, Posts.FavoriteCount, Posts.LastActivityDate, Posts.LastEditDate, Posts.LastEditorDisplayName, Posts.LastEditorUserId, Posts.OwnerUserId, Posts.ParentId, Posts.PostTypeId, Posts.Score, Posts.Tags, Posts.Title, Posts.ViewCount, Users.DisplayName, Users.Location, Comments.Id, Comments.CreationDate, Comments.PostId, Comments.Score, Comments.UserId FROM dbo.Posts JOIN dbo.PostLinks ON Posts.Id = PostLinks.PostId JOIN dbo.Users ON Posts.OwnerUserId = Users.Id LEFT JOIN dbo.Comments ON Posts.Id = Comments.PostId WHERE Posts.CreationDate >= '2012-01-01' AND Posts.CreationDate < '2012-04-01' AND Users.Reputation >= 20000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 SELECT Posts . Id , Posts . AcceptedAnswerId , Posts . AnswerCount , Posts . ClosedDate , Posts . CommentCount , Posts . CommunityOwnedDate , Posts . CreationDate , Posts . FavoriteCount , Posts . LastActivityDate , Posts . LastEditDate , Posts . LastEditorDisplayName , Posts . LastEditorUserId , Posts . OwnerUserId , Posts . ParentId , Posts . PostTypeId , Posts . Score , Posts . Tags , Posts . Title , Posts . ViewCount , Users . DisplayName , Users . Location , Comments . Id , Comments . CreationDate , Comments . PostId , Comments . Score , Comments . UserId FROM dbo . Posts JOIN dbo . PostLinks ON Posts . Id = PostLinks . PostId JOIN dbo . Users ON Posts . OwnerUserId = Users . Id LEFT JOIN dbo . Comments ON Posts . Id = Comments . PostId WHERE Posts . CreationDate >= '2012-01-01' AND Posts . CreationDate < '2012-04-01' AND Users . Reputation >= 20000

This one gets just a 36MB memory grant.

Repeat the test minus the Posts.Body, Users.AboutMe and Comments.Text columns.

EXEC sp_BlitzFirst @ExpertMode = 1, @Seconds = 30 1 EXEC sp_BlitzFirst @ ExpertMode = 1 , @ Seconds = 30

No more RESOURCE_SEMAPHORE waits! Queries no longer have to wait to get the memory they need to run – which also means more memory available for caching data pages and execution plans.

It’s not all good news though. Notice that it says 88 for “Seconds Sample”. I specified 30 seconds for sp_BlitzFirst. I now had a CPU bottleneck. Even sp_BlitzFirst was suffering.

Sometimes when we resolve a bottleneck, it moves the bottleneck to another area. If this were a real-world scenario, I would figure out if there were any other columns that weren’t needed to be returned and then add covering indexes as there are expensive key lookups in the execution plan.

What if I need to return big columns?

Sometimes you need to return really big columns. That’s okay. Return those columns ONLY when you need to. Don’t let your ORM return every single column in each of the tables in the query. Spend the time to return only the columns that are needed for the specific query. Richie shows how to do this in Entity Framework.

Recap

Let’s add RESOURCE_SEMAPHORE waits to the list of why I don’t like ORMs from a production DBA performance tuning standpoint:

Implicit conversions due to nvarchar variables vs varchar columns

Queries grab every single column, regardless if they are needed or not Expensive key lookups Very wide missing index recommendations Leads to RESOURCE_SEMAPHORE waits

The resulting query is hard to read

In the wild

Have you seen RESOURCE_SEMAPHORE waits in the wild? Were they due to big columns or something else? How did you resolve it?