I’m a really big fan of Scott Hanselman and the way he manages his personal/work time. While watching his “soft” talk titled “Scaling Yourself” I was surprised when he described his way of answering some questions coming from all over the world (via email). He said that each of us has a limited number of keystrokes until we die so, it’s such a waste to write a dedicated email with long answer and examples in the code. Why? Because we could optimize that process and share our knowledge with other folks out there. As you probably guess the best way is to simply… write a blog post! And today’s one is going to be exactly created for that purpose. A week ago I was asked about using Include method while using Entity Framework? Is that related to lazy loading? Should we use it every time when working with IQueryable? If you’re not sure, just keep reading 😉

The example

Before we move forward to the Include method, let me introduce a simple EF example with two entities (one-to-many relation) and their Database context. The code below presents the following:

Context

public class BookStoreContext : DbContext { public BookStoreContext() : base("BookStore") { Database.SetInitializer(new DropCreateDatabaseIfModelChanges<BookStoreContext>()); } public virtual DbSet<Author> Authors { get; set; } public virtual DbSet<Book> Books { get; set; } }

Author entity

public class Author { [Key] public int Id { get; set; } public string FirstName { get; set; } public string LastName { get; set; } public virtual ICollection<Book> Books { get; set; } }

Book entity

public class Book { public int Id { get; set; } public string Title { get; set; } public int AuthorId { get; set; } [ForeignKey(nameof(AuthorId))] public virtual Author Author { get; set; } }

And the program which will help us understanding the problem:



class Program { static void Main(string[] args) { using (var context = new BookStoreContext()) { var author = context.Authors .FirstOrDefault(); foreach (var book in author.Books) Console.WriteLine(book.Title); } Console.ReadLine(); } }

Using Include with Lazy Loading

Before running the program let’s refresh our knowledge about lazy loading supported by Entity Framework. The idea that stands behind is really simple. We can mark specific navigation properties or even whole entities as “lazy” by making them virtual. It’s also required to enable the proxy creation and lazy loading inside the DbContext object but it’s set by default. Now the whole process is nothing more than making round trips to the database for data if needed (of course in runtime). What does it mean? Let’s analyze the Program class. All we do is getting some author from the DB (some because we did not specify any predicate in the FirstOrDefault method and we also didn’t use Where statement before). After getting the author object we iterate over its books to print the titles, but wait… we didn’t ask for any books in the query! And that’s where lazy loading comes into play. As wrote before, in this case, EF will do a second query to the DB for the books of the specific author and all magic will happen in runtime. The listings below presents the queries generated by EF starting with the Author query:

SELECT TOP (1) .[Id] AS [Id], .[FirstName] AS [FirstName], .[LastName] AS [LastName] FROM [dbo].[Authors] AS

and the second query generated for specific books:

exec sp_executesql N'SELECT [Extent1].[Id] AS [Id], [Extent1].[Title] AS [Title], [Extent1].[AuthorId] AS [AuthorId] FROM [dbo].[Books] AS [Extent1] WHERE [Extent1].[AuthorId] = @EntityKeyValue1',N'@EntityKeyValue1 int',@EntityKeyValue1=1

Now, some of you might think that it’s a great feature which can save us from NullReferenceException. That’s true. Lazy loading is a great mechanism but only if you know when and how to use it. But look at our example again. Did we really have to fetch missing data in two parts? Of course not, since the code that needed books is placed right after the first query. In this case, it would be more efficient to create one query with needed data using SQL JOIN clause, right? And here comes the Include, let’s modify our Program class:

class Program { static void Main(string[] args) { using (var context = new BookStoreContext()) { var author = context.Authors .Include(a => a.Books) .FirstOrDefault(); foreach (var book in author.Books) Console.WriteLine(book.Title); } Console.ReadLine(); } }

In this case, we tell EF “explicitly” that besides author we also need books. The listing below presents the generated query:

SELECT [Project1].[Id] AS [Id], [Project1].[FirstName] AS [FirstName], [Project1].[LastName] AS [LastName], [Project1].[C1] AS [C1], [Project1].[Id1] AS [Id1], [Project1].[Title] AS [Title], [Project1].[AuthorId] AS [AuthorId] FROM ( SELECT [Limit1].[Id] AS [Id], [Limit1].[FirstName] AS [FirstName], [Limit1].[LastName] AS [LastName], [Extent2].[Id] AS [Id1], [Extent2].[Title] AS [Title], [Extent2].[AuthorId] AS [AuthorId], CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1] FROM (SELECT TOP (1) .[Id] AS [Id], .[FirstName] AS [FirstName], .[LastName] AS [LastName] FROM [dbo].[Authors] AS ) AS [Limit1] LEFT OUTER JOIN [dbo].[Books] AS [Extent2] ON [Limit1].[Id] = [Extent2].[AuthorId] ) AS [Project1] ORDER BY [Project1].[Id] ASC, [Project1].[C1] ASC

As you can see Entity Framework used LEFT OUTER JOIN clause to get all needed data. It’s also worth to mention that using Include method in the context which supports lazy loading can prevent appearing n+1 problem. If you have never heard about it, stop reading this article for a moment and read this one. One day you’ll thank me 😉

Using Include with Eager Loading

Entity Framework also supports another type of fetching data – eager loading. This one is safer for your performance (if you have no idea what you’re doing) but might cause exceptions in a runtime. Why? Let’s change our example to use the eager loading instead. In order to do that, we need to remove all virtual properties and we should disable the option in the context object. Now, let’s run the program:

What happened? That’s how the eager loading works. We informed EF that the only entity we need is Author which meant the books collection was equal to null (just like in the first example). But here we can’t just do another round trip to the database. You forgot something? You have to deal with NullReferenceException! How can we deal with that? I guess you know that (if not just look at the last C# listing 😉 ). Generated SQL is also identical to the previous. Yet another time SQL JOIN clause saved us (but this time from the exception, not the inefficiency).

Making JOIN without Include method

Some of you might think that EF is kinda weird ORM if it requires using Include method for each JOIN. But, don’t worry there is another way of dealing with that. Frankly, I would say that the following solution should be your default way of dealing with IQuerable. Let’s modify our Program class one, last time:

class Program { static void Main(string[] args) { using (var context = new BookStoreContext()) { var author = context.Authors .Select(a => new AuthorReadModel { FirstName = a.FirstName, LastName = a.LastName, BookTitles = a.Books .AsQueryable() .Select(b => b.Title) .ToList() }) .FirstOrDefault(); foreach (var title in author.BookTitles) Console.WriteLine(title); } Console.ReadLine(); } }

In this case, we created a dedicated ReadModel for all needed data, and we specified all “mappings” inside the Select method. This one tells EF exactly what we need, so it does not need yet another help as Include method. Let’s run the following code and inspect the query:

SELECT [Project1].[Id] AS [Id], [Project1].[FirstName] AS [FirstName], [Project1].[LastName] AS [LastName], [Project1].[C1] AS [C1], [Project1].[Title] AS [Title] FROM ( SELECT [Limit1].[Id] AS [Id], [Limit1].[FirstName] AS [FirstName], [Limit1].[LastName] AS [LastName], [Extent2].[Title] AS [Title], CASE WHEN ([Extent2].[AuthorId] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1] FROM (SELECT TOP (1) .[Id] AS [Id], .[FirstName] AS [FirstName], .[LastName] AS [LastName] FROM [dbo].[Authors] AS ) AS [Limit1] LEFT OUTER JOIN [dbo].[Books] AS [Extent2] ON [Limit1].[Id] = [Extent2].[AuthorId] ) AS [Project1] ORDER BY [Project1].[Id] ASC, [Project1].[C1] ASC

As I wrote, I strongly recommend using this approach. Not only because you have dedicated, strongly typed data structure but also because ORM does not generate “SELECT * …” query which in most cases is nothing more than waste.