Frontend performance is, in our opinion, the easiest to achieve: there is already great tooling and a set of well-established best practices that are easy to follow. Still, many websites do not follow these best practices and do not optimize their frontend at all.

Network performance is the most important factor for page load times and also the hardest to optimize. Caching and CDNs are the most effective optimizations, but they require considerable effort even for static content.

Backend performance depends on single-server performance and your ability to distribute work across machines. Horizontal scalability is particularly difficult to implement and has to be considered right from the start. Many projects treat scalability and performance as an afterthought and run into serious trouble when their business grows.

Literature and Tool Recommendations

There are great books on the topics of web performance and scalable system design. High Performance Browser Networking by Ilya Grigorik contains almost everything you need to know about networking and browser performance, and the constantly updated version is free to read online! Designing Data-Intensive Applications by Martin Kleppmann is still in early release but already among the best books in its field. It covers most of the fundamentals behind scalable backend systems in great detail. Designing for Performance by Lara Callender Hogan is all about building fast websites with a great user experience and covers many best practices.

There are also great online guides, tutorials, and tools to consider, ranging from the beginner-friendly Udacity course Website Performance Optimization and Google's developer performance guides to profiling tools like Google PageSpeed Insights, GTmetrix, and WebPageTest.

Newest Developments In Web Performance

Accelerated Mobile Pages

Google is doing a lot to raise awareness of web performance with projects such as PageSpeed Insights and its developer guides for web performance, and by making page speed a leading factor in its page rank.

Their newest concept to improve page speed and user experience in Google search is called Accelerated Mobile Pages (AMP). The idea is to have news articles, product pages, and other search content load instantly right from the Google search results. To this end, these pages have to be built as AMPs.

Example of an AMP page

AMP does two major things:

1. Websites built as AMPs use a stripped-down version of HTML and a JS loader to render fast and load as many resources as possible asynchronously.
2. Google caches the websites in the Google CDN and delivers them via HTTP/2.

The first means, in essence, that AMP restricts your HTML, JS, and CSS in a way that gives pages an optimized critical rendering path and makes them easy for Google to crawl. AMP enforces several restrictions: for example, all CSS must be inlined, all JS must be asynchronous, and everything on the page must have a static size (to prevent repaints). Although you can achieve the same results without these restrictions by sticking to the web performance best practices from above, AMP can be a good trade-off and a real help for very simple websites.

The second means that Google crawls your website and caches it in the Google CDN to deliver it very fast. The website's content is updated once the crawler indexes it again. The CDN also respects static TTLs set by the server but performs at least micro-caching: resources are considered fresh for at least one minute and are updated in the background when a user request comes in. The effective consequence is that AMP works best for use cases where the content is mostly static, as is the case for news websites and other publications that are only changed by human editors.
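The micro-caching behavior described above can be sketched in a few lines. This is an illustrative model, not AMP's actual implementation: `createMicroCache` and `fetchFn` are hypothetical names, a Map stands in for the CDN's store, and an injectable clock makes the logic testable.

```javascript
// Sketch of micro-caching: every entry is considered fresh for at least
// 60 seconds; a request for a stale entry is still answered from the
// cache immediately, while the entry is refreshed in the background.
// (createMicroCache and fetchFn are illustrative names, not AMP APIs.)
const MICRO_TTL_MS = 60 * 1000;

function createMicroCache(fetchFn, now = Date.now) {
  const store = new Map(); // url -> { value, fetchedAt }
  return async function get(url) {
    const entry = store.get(url);
    if (entry && now() - entry.fetchedAt < MICRO_TTL_MS) {
      return entry.value; // still fresh: serve from cache
    }
    if (entry) {
      // stale: serve the old copy instantly, refresh in the background
      fetchFn(url).then((value) => store.set(url, { value, fetchedAt: now() }));
      return entry.value;
    }
    // cache miss: fetch once and store the result
    const value = await fetchFn(url);
    store.set(url, { value, fetchedAt: now() });
    return value;
  };
}
```

The design point is that a user request never waits on the origin once the entry exists; at worst it gets content that is one refresh behind.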

Progressive Web Apps

Another approach (also by Google) are progressive web apps (PWA). The idea is to cache static parts of a website using service workers in the browser. As a consequence, these parts load instantly for repeated views and are also available offline. Dynamic parts are still loaded from the server.

The app shell (the single-page application logic) can be revalidated in the background. If an updated app shell is found, the user is prompted with a message asking them to refresh the page. Inbox by Gmail does this, for example.
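The cache-then-revalidate pattern behind this can be sketched without the Service Worker API itself. In this hypothetical model, a plain Map stands in for the browser's Cache API and `fetchFn` for the network; `createShellCache` and `onUpdate` are illustrative names:

```javascript
// Sketch of the app-shell pattern: serve the cached shell instantly,
// refetch it in the background, and signal when a newer version exists
// so the page can prompt the user to refresh. (A Map stands in for the
// Cache API; createShellCache, fetchFn, onUpdate are illustrative names.)
function createShellCache(fetchFn) {
  const cache = new Map();
  return async function load(url, onUpdate) {
    const fresh = fetchFn(url); // always revalidate in the background
    const cached = cache.get(url);
    fresh.then((body) => {
      if (cached !== undefined && body !== cached) onUpdate(body); // new shell found
      cache.set(url, body);
    });
    // instant response from cache if possible, otherwise wait for the network
    return cached !== undefined ? cached : fresh;
  };
}
```

Note that the user keeps working with the old shell until they act on the prompt; the cache only swaps versions underneath for the next load.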

However, writing the service worker code that caches static resources and performs the revalidation takes considerable effort for every website. Furthermore, only Chrome and Firefox currently support Service Workers sufficiently.

Caching Dynamic Content

The problem all of these caching approaches share is that they cannot deal with dynamic content. This is simply due to how HTTP caching works. There are two types of caches: invalidation-based caches (like reverse proxy caches and CDNs) and expiration-based caches (like ISP caches, corporate proxies, and browser caches). Invalidation-based caches can be invalidated proactively by the server, whereas expiration-based caches can only be revalidated by the client.
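The difference can be made concrete with two toy caches (purely illustrative, not real proxy implementations): the invalidation-based one can purge an entry the moment the origin reports a change, while the expiration-based one can only compare an entry's age against its TTL.

```javascript
// Invalidation-based cache (reverse proxy/CDN): the origin server can
// purge entries proactively at any time. (Toy illustration only.)
class InvalidationCache {
  constructor() { this.store = new Map(); }
  set(key, value) { this.store.set(key, value); }
  get(key) { return this.store.get(key); }
  invalidate(key) { this.store.delete(key); } // server-triggered purge
}

// Expiration-based cache (browser/ISP/corporate proxy): once an entry is
// stored with a TTL, it is served until it expires; the origin has no way
// to remove it early. (Clock is injectable to make the logic testable.)
class ExpirationCache {
  constructor(now = Date.now) { this.store = new Map(); this.now = now; }
  set(key, value, ttlMs) { this.store.set(key, { value, expires: this.now() + ttlMs }); }
  get(key) {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    return this.now() < entry.expires ? entry.value : undefined; // freshness check only
  }
}
```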

The tricky thing about expiration-based caches is that you must specify a cache lifetime (TTL) when you first deliver the data from the server. After that, you have no way to kick the data out: it will be served from the browser cache until the TTL expires. For static assets this is not a big problem, since they usually only change when you deploy a new version of your web application. Therefore, you can use tools like gulp-rev-all and grunt-filerev to hash the assets.

But what do you do with all the data that is loaded and changed by your application at runtime? Changing user profiles, updating a post, or adding a new comment seem impossible to combine with the browser's cache, since you cannot predict when such updates will happen. Therefore, caching is often simply disabled, or very low TTLs are used.