Gatsby has always been, and will always be, focused on performance. We internalize the best practices and patterns relating to performance and enable these optimizations by default for every Gatsby application. However, there is always more we can do, and we are always striving to make incremental improvements that benefit every Gatsby user. With that in mind, we're happy to announce a new performance improvement: splitting the page manifest into individual files for each page. Prior to this change, for large Gatsby applications (e.g. more than 5,000 pages), the page manifest could grow to 200 KB or more, and loading it could take several seconds on 3G connections, which is far from ideal!

Over the past few months, I've been gradually changing Gatsby's architecture so that the size of the site has absolutely no impact on real-world performance. This change has been merged and is available, for free, in Gatsby v2.9.0. From this point forward, your application manifest will no longer grow proportionally to the number of pages in your Gatsby application.

In this post, I'll dive deep into the technical details of what was actually causing this slowdown and how we fixed the growing page manifest problem.

Symptoms

There were two main symptoms experienced by users of large Gatsby sites.

After navigating to a Gatsby site, it took a while for all the JavaScript to load so that it was "interactive". Therefore, clicks immediately after loading the page could take many seconds to actually navigate.

Even after the initial load, clicking links could be somewhat laggy even though the target's resources had already been pre-fetched.

The Problem: Global pages manifest

The central problem was that Gatsby generates a file for each build called pages-manifest.json (also called data.json) that must be loaded by the browser before users can navigate to other pages. It contains metadata about every page on the site, including:

componentChunkName: The logical name of the React component for the page

dataPath: The path to the file that contains the page's GraphQL query result, and any other page context

When a user clicks a link to another page, Gatsby first looks up the page's component and query result file in the manifest. Gatsby then downloads them (if they haven't already been prefetched), passes the loaded query results to the page's component, and renders it. Since pages-manifest contains the list of all pages on the site, Gatsby can also immediately show a 404 if the requested page cannot be found.
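To make the lookup concrete, here is a minimal sketch of that navigation step. It is not Gatsby's actual loader code: the `path` field, the `ManifestEntry` and `Resolution` types, the helper names, and the sample entries are all assumptions made for illustration. Only `componentChunkName` and `dataPath` come from the description above.

```typescript
// Hypothetical shape of one entry in the global manifest.
// `path` is an assumed key; the other two fields are named in the post.
interface ManifestEntry {
  path: string;
  componentChunkName: string;
  dataPath: string;
}

type Resolution =
  | { status: "found"; entry: ManifestEntry }
  | { status: "not-found" };

// Index the manifest by path once so that exact lookups are a single map read.
function indexManifest(entries: ManifestEntry[]): Map<string, ManifestEntry> {
  const byPath = new Map<string, ManifestEntry>();
  for (const entry of entries) {
    byPath.set(entry.path, entry);
  }
  return byPath;
}

// Because the manifest lists every page, a miss can be reported as a 404
// immediately, without any network request.
function resolvePage(index: Map<string, ManifestEntry>, path: string): Resolution {
  const entry = index.get(path);
  return entry ? { status: "found", entry } : { status: "not-found" };
}

// Illustrative data only; real chunk names and data paths are generated at build time.
const sampleManifest: ManifestEntry[] = [
  { path: "/", componentChunkName: "component---src-pages-index-js", dataPath: "data-index" },
  { path: "/blog/", componentChunkName: "component---src-templates-blog-js", dataPath: "data-blog" },
];
const pageIndex = indexManifest(sampleManifest);
```

The catch is that `sampleManifest` has to be fully downloaded before `resolvePage` can answer anything, which is where the cost described next comes from.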

This works great for small sites, but as a Gatsby application grows, so too does the size of the page manifest. The bigger the manifest gets, the more data the browser has to download before any UI navigation can occur, leading to slowdowns in important metrics like Time to Interactive (TTI).

Even after the manifest had been loaded, it had to be searched for the matching path. This was necessary since pages can be declared with a matchPath (a regular expression used to match client-only paths). Huge manifest files resulted in perceptible lag when clicking links too!
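The search cost can be sketched as a linear scan. This is a simplified illustration, not Gatsby's implementation: the `PageEntry` type, the `findPage` helper, and the choice to store `matchPath` as a regular-expression source string are assumptions for the example.

```typescript
// Hypothetical manifest entry; matchPath is assumed here to hold a
// regular-expression source string for client-only routes.
interface PageEntry {
  path: string;
  matchPath?: string;
}

// Linear scan: an exact-path map is not enough on its own, because any entry
// with a matchPath might also claim the requested pathname. In the worst case
// every entry is visited, so the work grows with the number of pages.
function findPage(entries: PageEntry[], pathname: string): PageEntry | undefined {
  for (const entry of entries) {
    if (entry.path === pathname) {
      return entry;
    }
    if (entry.matchPath !== undefined && new RegExp(entry.matchPath).test(pathname)) {
      return entry;
    }
  }
  return undefined;
}

// Illustrative data only.
const sampleEntries: PageEntry[] = [
  { path: "/" },
  { path: "/about/" },
  { path: "/app/", matchPath: "^/app/.*$" },
];
```

With three entries the scan is trivial, but with thousands of pages it runs on every link click, which is consistent with the lag described above.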

Solution: Eliminate the monolithic pages manifest!

The solution seems abundantly obvious at this point. We needed to introduce a manifest file per page, instead of a global pages-manifest. We called this page-data.json. It includes:

componentChunkName: The logical name of the React component for the page

result: The full GraphQL query result and page context

webpackCompilationHash: Unique hash output by webpack any time the user's src code content changes

This is very similar to each entry in the pages manifest. The major difference is that the GraphQL query result is inlined instead of being contained in a separate file.

Now, when a page navigation occurs, Gatsby makes a request directly to the server for the page-data.json, instead of checking the global manifest (which doesn't exist anymore).
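A rough sketch of the per-page request might look like the following. The three `PageData` fields are the ones listed above; everything else is an assumption for illustration, including the URL layout in `pageDataUrl`, the `Fetcher` type, the `loadPageData` helper, and the choice to treat a failed response as "page not found".

```typescript
// Fields the post lists for each page-data.json file.
interface PageData {
  componentChunkName: string;
  result: unknown;
  webpackCompilationHash: string;
}

// Assumed URL layout: the post does not say where the files are served from.
// The root page is mapped to an "index" segment in this sketch.
function pageDataUrl(pagePath: string): string {
  const trimmed = pagePath.replace(/^\/+|\/+$/g, "");
  const segment = trimmed === "" ? "index" : trimmed;
  return `/page-data/${segment}/page-data.json`;
}

// Minimal structural type so the loader can be exercised without a browser.
type Fetcher = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// One small request per navigation, instead of one large manifest up front.
async function loadPageData(pagePath: string, fetcher: Fetcher): Promise<PageData | null> {
  const response = await fetcher(pageDataUrl(pagePath));
  if (!response.ok) {
    // In this sketch, a failed request stands in for "page not found".
    return null;
  }
  return (await response.json()) as PageData;
}
```

In a browser, the global `fetch` function could be passed as the `fetcher` argument. The point of the design is that the payload for each navigation depends only on the target page, not on how many pages the site has.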

Below is a webpagetest.org comparison of gatsbyjs.org (which has about 2,500 pages) before and after the change. As you can see, First Interactive dropped by 5.011 seconds (a 432% improvement), and almost all other metrics improved as well.