In our last post, we ran a query with an ORDER BY, but we only got one column in the SELECT:

SELECT Id FROM dbo.Users WHERE LastAccessDate > '2014/07/01' ORDER BY LastAccessDate; 1 2 3 4 SELECT Id FROM dbo . Users WHERE LastAccessDate > '2014/07/01' ORDER BY LastAccessDate ;

The estimated cost was about $18 Query Bucks because SQL Server had to:

Scan the entire clustered index, yelling out the Id and LastAccessDate of each row

Sort that list by LastAccessDate

Now, let’s change just one thing about the query – what we’re selecting:

SELECT * FROM dbo.Users WHERE LastAccessDate > '2014/07/01' ORDER BY LastAccessDate; 1 2 3 4 SELECT * FROM dbo . Users WHERE LastAccessDate > '2014/07/01' ORDER BY LastAccessDate ;

And run both of the queries back to back to get their actual execution plans:

The basic execution plan of both queries is pretty similar, aside from the fact that the bottom one went parallel. SQL Server realized that sorting all this data was going to be a heck of a lot more work, so it split the work across multiple CPU cores.

This query sucks in a few different ways.

SELECT * can read more data. I know what you’re thinking: both queries have to read all of the 8KB pages in the table, right? No – go back to the first post in the series when I introduced the Users table. I mentioned that the AboutMe field, a big ol’ NVARCHAR(MAX), might be so large that it ends up getting pushed off-row: stored in different 8KB pages. Our SELECT Id query didn’t need to read those extra pages – but when we SELECT *, we do.

SELECT * can sort more data. SQL Server can’t just sort the LastAccessDates – it sorts the whole rows around. That means it’s going to need more CPU time, more memory, and do more spills to disk if it gets those memory estimates wrong.

SELECT * can take more time to output. This was always the thing I focused on as a database administrator. I would tell my devs, “Don’t select fields you don’t need, because it takes longer to yell that data out over the network.” Today, that’s the least of my problems: I’m much, much more concerned about the CPU time and memory grants consumed by the sort operator.

Your first clue about just how bad the SELECT *’s sort operator sucks is the 97% cost of the sort – but it’s not 97% of the same query cost. The original query cost was $18 Query Bucks, but check out the cost of the SELECT * – IT’S ALMOST NINE HUNDRED QUERY BUCKS.

If query bucks were real dollars, I could have bought my first car with the cost of this SELECT *.

I hate SELECT *.

It isn’t about the star itself – it’s about lazily getting all of the columns, including ones you don’t need, and then forcing SQL Server to sort them.

And just because you didn’t put an ORDER BY in your query doesn’t mean you don’t have a sort, either – Erik wrote about sorts that get injected into your plan even if you didn’t ask for ’em.

For the next post in the series, we’ll go back to selecting just the Id. It’s not that I’m only going to allow my developers to just select Ids and nothing else – I understand that we gotta get data out of the database. However, when I’m dealing with query tuning, and I see a SELECT * (or a big ginormous list of columns), I’m going to start the tuning process by asking if we really need all of those fields. Even with Entity Framework, you can pick the columns you want. Does it take a little more work on your part? Sure – but you’re a lot cheaper than SQL Server’s CPU licensing or Azure SQL DB’s pricing, especially as your app starts to scale.

So next, let’s get just the Id, but run the query repeatedly.