This is another post in a series of posts on creating performant and scalable web APIs using ASP.NET Core. In this post, we’ll allow the consumers of our API to request large collections of data by “paging” the data …

How paging helps performance

Paging helps performance and scalability in a number of ways:

The number of page read I/Os is reduced when SQL Server fetches the data

The amount of data transferred from the database server to the web server is reduced

The amount of memory used to store the data on the web server in our object model is reduced

The amount of data transferred from the web server to the client is reduced

This all adds up to a potentially significant positive impact - particularly for large collections of data.

Benchmark

We’re continuing to use Dapper for our data access. Let’s start with a simple controller action method that returns all the contacts:

[HttpGet]
public IActionResult GetContacts()
{
    IEnumerable<Contact> contacts = null;
    using (SqlConnection connection = new SqlConnection(_connectionString))
    {
        connection.Open();
        contacts = connection.Query<Contact>(
            @"SELECT ContactId, Title, FirstName, Surname FROM Contact");
    }
    return Ok(contacts);
}

This is what we get if we profile the SQL:

Let’s load test the API as well:

Simple paging implementation

Ok, so that’s our benchmark. Let’s implement paging in our controller action method. We grab the page number and page size from query string parameters in the URL, encapsulating these values in a UrlQuery class. We also add a T-SQL ORDER BY clause with OFFSET and FETCH if paging values have been supplied.

[HttpGet]
public IActionResult GetContacts([FromQuery]UrlQuery urlQuery)
{
    IEnumerable<Contact> contacts = null;
    using (SqlConnection connection = new SqlConnection(_connectionString))
    {
        connection.Open();
        string sql = @"SELECT ContactId, Title, FirstName, Surname FROM Contact";
        if (urlQuery.PageNumber.HasValue)
        {
            sql += @" ORDER BY Contact.ContactPK
                      OFFSET @PageSize * (@PageNumber - 1) ROWS
                      FETCH NEXT @PageSize ROWS ONLY";
        }
        contacts = connection.Query<Contact>(sql, urlQuery);
    }
    return Ok(contacts);
}

public class UrlQuery
{
    private const int maxPageSize = 100;

    public int? PageNumber { get; set; }

    private int _pageSize = 50;
    public int PageSize
    {
        get { return _pageSize; }
        set { _pageSize = (value < maxPageSize) ? value : maxPageSize; }
    }
}
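Two details of the code above are worth seeing in isolation: the PageSize setter clamps greedy requests at maxPageSize, and the OFFSET arithmetic in the SQL skips PageSize * (PageNumber - 1) rows. Here’s a minimal console sketch of both behaviours, reusing the UrlQuery class outside of ASP.NET Core:

```csharp
using System;

public class UrlQuery
{
    private const int maxPageSize = 100;
    public int? PageNumber { get; set; }
    private int _pageSize = 50;
    public int PageSize
    {
        get { return _pageSize; }
        set { _pageSize = (value < maxPageSize) ? value : maxPageSize; }
    }
}

public static class Program
{
    public static void Main()
    {
        // A client asking for 500 rows per page is clamped to 100
        var query = new UrlQuery { PageNumber = 3, PageSize = 500 };
        Console.WriteLine(query.PageSize);  // 100

        // The rows skipped by OFFSET @PageSize * (@PageNumber - 1):
        // page 3 with a page size of 100 skips the first 200 rows
        int offset = query.PageSize * (query.PageNumber.Value - 1);
        Console.WriteLine(offset);          // 200
    }
}
```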

Let’s give it a go …

We get fewer I/O reads in the profiled SQL:

If we load test this on the 1st page of data, we see improved requests per second:

We get a similar result if we go for a page of data in the middle of our dataset:

So, we get a nice improvement on a simple query against a pretty small dataset (500 records). If our query involved joins and the dataset were much larger, the improvement would naturally be greater.

As well as the performance benefit, a pageable API is more consumer friendly. Users of the client application are only likely to consume portions of the data at a time, so a pageable API makes the development task easier for the builder of the client application (they don’t need to implement any client-side paging).

Including the total record count

Often, we’ll have a requirement to return the number of pages and number of records in the response so that the client application can expose this to the end user. There is a performance penalty though - let’s implement this and see the sort of impact this will have.

Our controller action method is now a little more complex … We include a second SQL statement to get the total record count and use Dapper’s QueryMultiple method to fetch the data and the count in a single round trip. We’ve also added the paging information to an X-Pagination HTTP header.

[HttpGet]
public IActionResult GetContacts(UrlQuery urlQuery)
{
    IEnumerable<Contact> contacts = null;
    int? totalRecords = null;
    using (SqlConnection connection = new SqlConnection(_connectionString))
    {
        connection.Open();
        string sql = @"SELECT ContactId, Title, FirstName, Surname FROM Contact";
        if (urlQuery.PageNumber.HasValue)
        {
            sql += @" ORDER BY Contact.ContactPK
                      OFFSET @PageSize * (@PageNumber - 1) ROWS
                      FETCH NEXT @PageSize ROWS ONLY";
        }
        if (urlQuery.PageNumber.HasValue && urlQuery.IncludeCount)
        {
            sql += " SELECT [TotalCount] = COUNT(*) FROM Contact";
        }
        using (GridReader results = connection.QueryMultiple(sql, urlQuery))
        {
            contacts = results.Read<Contact>();
            if (urlQuery.PageNumber.HasValue && urlQuery.IncludeCount)
            {
                totalRecords = results.ReadSingle<int>();
            }
        }
    }
    if (urlQuery.PageNumber.HasValue)
    {
        Pagination pagination = new Pagination()
        {
            PageNumber = urlQuery.PageNumber.Value,
            PageSize = urlQuery.PageSize
        };
        if (urlQuery.IncludeCount)
        {
            pagination.TotalRecords = totalRecords.Value;
        }
        Response.Headers.Add("X-Pagination", JsonConvert.SerializeObject(pagination));
    }
    return Ok(contacts);
}

public class UrlQuery
{
    private const int maxPageSize = 100;

    public bool IncludeCount { get; set; } = false;

    public int? PageNumber { get; set; }

    private int _pageSize = 50;
    public int PageSize
    {
        get { return _pageSize; }
        set { _pageSize = (value < maxPageSize) ? value : maxPageSize; }
    }
}

public class Pagination
{
    public int PageNumber { get; set; }
    public int PageSize { get; set; }
    public int? TotalRecords { get; set; }
    public int? TotalPages =>
        TotalRecords.HasValue
            ? (int)Math.Ceiling(TotalRecords.Value / (double)PageSize)
            : (int?)null;
}
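The TotalPages expression uses Math.Ceiling so that a final, partially-filled page still counts as a page. A quick sketch with our 500-record dataset and the default page size of 50:

```csharp
using System;

public class Pagination
{
    public int PageNumber { get; set; }
    public int PageSize { get; set; }
    public int? TotalRecords { get; set; }
    public int? TotalPages =>
        TotalRecords.HasValue
            ? (int)Math.Ceiling(TotalRecords.Value / (double)PageSize)
            : (int?)null;
}

public static class Program
{
    public static void Main()
    {
        // 500 records at 50 per page divides evenly into 10 pages
        var even = new Pagination { PageSize = 50, TotalRecords = 500 };
        Console.WriteLine(even.TotalPages);   // 10

        // 501 records needs an 11th page for the final record
        var ragged = new Pagination { PageSize = 50, TotalRecords = 501 };
        Console.WriteLine(ragged.TotalPages); // 11

        // If the count wasn't requested, TotalPages stays null
        var noCount = new Pagination { PageSize = 50 };
        Console.WriteLine(noCount.TotalPages.HasValue); // False
    }
}
```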

Here’s a SQL trace of hitting the endpoint, getting the first page of data, including the count. We can see the I/O reads have increased.

If we do a load test, getting the first page of data, including the count, we can see the requests per second have reduced a bit.

Again, the performance impact of the count would be more significant if the SQL query were more complex (e.g. included a JOIN) and operated on a larger dataset.
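On the consumer side, the paging metadata comes back as JSON in the X-Pagination header rather than in the response body. Here’s a hedged sketch of how a client might read it, deserializing with the same Newtonsoft.Json library the controller serializes with - the header value shown is an assumption of what a typical response would contain:

```csharp
using System;
using Newtonsoft.Json;

// Client-side mirror of the API's Pagination header payload
public class Pagination
{
    public int PageNumber { get; set; }
    public int PageSize { get; set; }
    public int? TotalRecords { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // A representative X-Pagination header value (hypothetical example,
        // e.g. from HttpResponseMessage.Headers in a real client)
        string header = "{\"PageNumber\":1,\"PageSize\":50,\"TotalRecords\":500}";

        Pagination pagination = JsonConvert.DeserializeObject<Pagination>(header);
        Console.WriteLine(pagination.TotalRecords); // 500
    }
}
```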

Conclusion

In summary, it’s a good idea to include paging on our GET requests, but to make the “total count” optional and off by default for maximum performance.

Comments

Maxim Markelow March 10, 2018

Your example is quite simple. In enterprise applications, one complex problem is the security policy. The visibility of an object and the list of allowed actions are determined by the user’s rights, roles, and their membership of a particular department in the company hierarchy. Of course, this all changes dynamically.

Example: we have 1000 entities. We request a page of 100 entities. None of them are available to us due to business rule limitations, but the next page will probably contain the desired entities.

Should we return an empty list? Or should we try to request the next page automatically? What is the best practice for such scenarios?

Carl March 11, 2018

If the database is SQL Server 2016+, I’d put the rules on what a user can view and edit in SQL Server Row Level Security (RLS) – that way you don’t need to think about it in the ASP.NET code. I have a blog post on RLS at https://www.carlrippon.com/creating-a-multi-tenant-asp-net-core-web-api-with-sql-server-rls/. Hope this helps.

Ao Guy June 5, 2018

Best practice: don’t try to enforce your business rules in the persistence layer. It smells of bad requirements gathering, which leads to bad architecture. The author is talking about persistence and persistence only.