640 kB ought to be enough for anybody.

Edward Scissorhands, Senior Splitter.


Code splitting is a popular rumour these days. To be more clear –

Code splitting is not the thing you should know. Code splitting is the thing you shall use.

And here we have a problem. From one point of view, modern code splitting is quite an easy thing to do – just use dynamic import. From another point of view – what do you actually want to achieve?

What do you want to split? What do you need to defer?

There are two answers – JavaScript code and assets. And assets could also be JavaScript code these days, you know.

In other words, the difference is simpler – resources controlled by webpack, and resources it does not control, like static images.

For now we will forget about the second part, since it is a CDN’s duty to handle it.

And what about the first part? About code splitting for JS bundles? I could say: you get it wrong. Not the idea itself – you just forgot to check some implementation details.

Spoiler – the more you use code splitting, the less code splitting you may achieve.

And the root issue is code duplication, or rather the de-duplication you also want to achieve.

Let me tell you a story you never heard before. Before… actually there were no problems before.

Before.

There were no modules before. No modules == no worries. Only script tags.

And everyone was creating a HUGE single JavaScript file, and/or including a dozen small files one by one in the source HTML.

“Modules” were invented later. At least 10 years after the JS’s birth date.

I barely remember what “module” meant for Google Closure Compiler or Dojo. But the first “normal” CommonJS modules came with Node.js, and there was no way to use them in the browser.

Follow this link to get a brief intro to JavaScript modules –

As a result, frontend got 2 options to live with:

Use browserify (gulp, webpack) to bundle all sources into one old school big file.

Use AMD/RequireJS/SystemJS and require modules as modules, one by one, from frontend.

And the second way was not an option. It was way slower. From N to M times slower than a single bundle, where N is the number of modules and M is the number of nesting levels. And M is more important: we could load a dozen modules in parallel, but we can’t load their dependencies in advance. Why? We don’t know them until the modules arrive.

In a median project M is about 10. So the RequireJS way is at least 10 times slower. Even with HTTP/2. Don’t ever try to use Require.js without HTTP/2.
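To make the “M round trips” point a bit more tangible, here is a minimal sketch of such a loader. The dependency graph and the `loadRounds` helper are made up for illustration; the only assumption is that a module's dependencies become known after the module itself has been downloaded.

```javascript
// Hypothetical dependency graph: module name -> modules it requires.
const deps = {
  app: ['router', 'store'],
  router: ['history'],
  store: ['reducers'],
  history: [],
  reducers: ['utils'],
  utils: [],
};

// Simulates a loader that learns a module's dependencies only after
// the module has arrived. Every "round" is one network round trip in
// which all currently known, not yet loaded modules are fetched in parallel.
function loadRounds(graph, entry) {
  const loaded = new Set();
  const rounds = [];
  let wanted = [entry];
  while (wanted.length > 0) {
    rounds.push(wanted);
    wanted.forEach((name) => loaded.add(name));
    const next = new Set();
    for (const name of wanted) {
      for (const dep of graph[name]) {
        if (!loaded.has(dep)) next.add(dep);
      }
    }
    wanted = [...next];
  }
  return rounds;
}

const rounds = loadRounds(deps, 'app');
console.log(rounds.length); // 4 round trips for 6 modules
console.log(rounds);        // [['app'], ['router', 'store'], ['history', 'reducers'], ['utils']]
```

A single bundle needs one round trip for the same six modules, and parallel loading inside a round does not help with the depth.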

Keeping in mind that HTTP/2 did NOT exist by the time Require.js emerged – BUNDLERS WIN!

One bundle to rule them all!

Holistic bundle wrapper

Webpack is a de facto standard today. Just accept it. And there are a million different articles on how to properly set it up and then handle it. The most complex ones are about long-term caching, extracting common/vendor chunks and splitting a whole application into a pack of “async” pieces.

Maybe the best article I have read about “CommonChunking” in webpack is the one by Adam Rackis, and you have to read it too.

So, just promise me that you will read it. But in short – the article explains how to cut an application into pieces. Properly cut into the correct pieces.

Adam extracted everything out of the entry point, so next he can load only the code he needs “now”. This is an amazing example of a handcrafted masterpiece.

Can you see the problem here?

The name of the problem is “used-twice”. In this example it will contain all the dependencies used more than once across the async chunks.

Actually this is quite good. Zero duplication. But you might download some code you don’t need right now, because it is required by another chunk you will request sooner or later.

You can still manually extract big “dependencies” as separate chunks (as react-dnd in this example), you can tune common chunk creation, trying to keep dependencies between modules and chunks under control…

But you will fail

Sooner or later you will lose control. Application code can be quite entangled, and different modules often depend on the same modules.

Webpack bundle analyzer can visualize your application. And you may find that it looks like… a graph.

As a result, without special control, “used-twice” or a normal CommonsChunkPlugin chunk will grow each time you introduce a new splitting point.

React-loadable, for example, encourages you to add as many splitting points as you find necessary. This is correct, but sometimes not that “necessary”.

Code splitting with react-loadable

You will start with a rock solid application, asking the user to download 10 MB on page load. Next you will split the application the first time, then a second time, defer all the components you can defer, and what will you get as a result?

The more chunks you have, the more chunks will use the same modules. The more modules are used twice, the more modules are considered common and extracted into the common chunk.

This can result in moving all of node_modules, or most of your application’s code, into the CommonChunk. Or both.
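Here is a toy model of that effect. It is not how webpack is implemented, just a sketch of the “used at least twice” rule; the chunk names, module names and the `commonModules` helper are all invented for the example.

```javascript
// Returns the modules that would land in a "used-twice" style common chunk:
// everything imported by at least `minChunks` of the async chunks.
function commonModules(chunks, minChunks = 2) {
  const usage = new Map();
  for (const modules of Object.values(chunks)) {
    for (const name of new Set(modules)) {
      usage.set(name, (usage.get(name) || 0) + 1);
    }
  }
  return [...usage]
    .filter(([, count]) => count >= minChunks)
    .map(([name]) => name)
    .sort();
}

// Hypothetical application with two splitting points...
const twoChunks = {
  dashboard: ['react', 'lodash', 'charts'],
  settings: ['react', 'forms'],
};

// ...and the same application after two more splitting points were added.
const fourChunks = {
  ...twoChunks,
  reports: ['react', 'lodash', 'charts', 'pdf'],
  profile: ['react', 'forms', 'lodash'],
};

console.log(commonModules(twoChunks));  // ['react']
console.log(commonModules(fourChunks)); // ['charts', 'forms', 'lodash', 'react']
```

Two extra splitting points, and the common chunk went from one module to four, although `dashboard` still needs neither `forms` nor anything from `profile`.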

It may be a bit unfair, and not the result you want to achieve.

Code splitting is like mutable state or multiple inheritance – a handy tool, and millions have used it. But it might get you in trouble :P

Oh nooo

It is still quite easy to fix. Just run the bundle analyzer, introduce a few more common chunks, think about adding or removing some splitting points.

Keep your handcrafted masterpiece a masterpiece.

Can you see the problem here? It’s a typo in handcrafted. Should be #0CJS.

But webpack’s default automagic will make things worse.

Actually not “will”, but “did”. CommonsChunkPlugin was a handy but hard-to-handle thing in webpack 2–3, and it was simply removed in webpack 4.

By the time you read this article, it could be obsolete. Just absolutely outdated.

Webpack v4 rethinks most of these things, and is going to automagically split not only your code into chunks, but also the chunks themselves, and solve most of the problems I’ve described above.

As I said — webpack is a de facto standard today. Just accept it.

I don’t have enough information and confidence in a new approach, but it should work close to perfect without any configuration. And you are still able to make it perfect with some configuration.

The only problem here is the default configuration. With it, an additional chunk will be created when these conditions hold:

The new chunk can be shared, OR its modules are from the node_modules folder

The new chunk would be bigger than 30kb (before min+gz)

The maximum number of parallel requests when loading chunks on demand would be lower than or equal to 5

The maximum number of parallel requests at initial page load would be lower than or equal to 3

It may produce more chunks than you need, or it may produce fewer. It should work for 99.9% of cases, but it is still a good idea to keep an eye on it.
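For reference, those conditions map onto the `optimization.splitChunks` options roughly like this. The values are the webpack 4 defaults as I recall them from the documentation, so treat it as a sketch to compare against your own version, not a recommended setup. The `webpackConfig` variable name is just for illustration.

```javascript
// Sketch of a webpack 4 config fragment spelling out the default
// splitChunks behaviour (values assumed from the webpack 4 docs).
const webpackConfig = {
  optimization: {
    splitChunks: {
      chunks: 'async',        // only on-demand chunks are split by default
      minSize: 30000,         // "bigger than 30kb (before min+gz)"
      minChunks: 1,
      maxAsyncRequests: 5,    // parallel requests when loading on demand
      maxInitialRequests: 3,  // parallel requests at initial page load
      cacheGroups: {
        // "modules are from the node_modules folder"
        vendors: { test: /[\\/]node_modules[\\/]/, priority: -10 },
        // "new chunk can be shared": used by at least two chunks
        default: { minChunks: 2, priority: -20, reuseExistingChunk: true },
      },
    },
  },
};

// In a real webpack.config.js this object would be exported:
// module.exports = webpackConfig;
console.log(webpackConfig.optimization.splitChunks.maxAsyncRequests); // 5
```

Raising `minChunks` in the `default` group or lowering `maxAsyncRequests` are the usual knobs to try first when the automagic produces too many small chunks.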

You may not notice it, but webpack is going to solve a problem commonly known as the knapsack problem, or rucksack problem.

Now — pack it!

The only thing, you should know about it —

The decision problem form of the knapsack problem is NP-complete, thus there is no known algorithm both correct and fast (polynomial-time) in all cases.

Another way.

Another way is to apply graph theory and split on the bridges.

A graph with 16 vertices and 6 bridges (highlighted in red)

Just analyze the bundle, find the “cuttable” parts… And cut them.

The good part — this is 100% pure #0CJS.

The bad part — there is no way to actually control the splitting process. And applications are usually so entangled, that there might be no bridges at all.
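If you want to play with the idea, bridges can be found with a classic depth-first search over low-link values. This is a minimal sketch, assuming the module graph is treated as undirected (import direction is ignored); the `findBridges` helper and the module names are made up for the example.

```javascript
// Finds bridges in an undirected graph given as a list of [a, b] edges.
// A bridge is an edge whose removal disconnects the graph, i.e. a place
// where the application could be cut into two independent parts.
function findBridges(edges) {
  const adj = new Map();
  edges.forEach(([a, b], id) => {
    for (const [from, to] of [[a, b], [b, a]]) {
      if (!adj.has(from)) adj.set(from, []);
      adj.get(from).push({ to, id });
    }
  });

  const order = new Map(); // discovery time of each node
  const low = new Map();   // earliest node reachable without the parent edge
  const bridges = [];
  let clock = 0;

  function visit(node, viaEdge) {
    order.set(node, clock);
    low.set(node, clock);
    clock += 1;
    for (const { to, id } of adj.get(node)) {
      if (id === viaEdge) continue; // do not walk back over the same edge
      if (order.has(to)) {
        low.set(node, Math.min(low.get(node), order.get(to)));
      } else {
        visit(to, id);
        low.set(node, Math.min(low.get(node), low.get(to)));
        if (low.get(to) > order.get(node)) bridges.push(edges[id]);
      }
    }
  }

  for (const node of adj.keys()) {
    if (!order.has(node)) visit(node, -1);
  }
  return bridges;
}

// Hypothetical module graph: two tightly coupled clusters and one leaf.
const moduleGraph = [
  ['app', 'router'], ['app', 'store'], ['router', 'store'],
  ['store', 'maps'],
  ['maps', 'tiles'], ['maps', 'graphics'], ['tiles', 'graphics'],
  ['app', 'admin'],
];

console.log(findBridges(moduleGraph)); // [['store', 'maps'], ['app', 'admin']]
```

Add a single import from `admin` to `store` and the second bridge disappears, which is exactly the “so entangled that there might be no bridges at all” case.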

Yet another way

In “yet-another-index” (Yandex) we developed yet another way to solve this problem. We have used it in production since 2011, and I still can’t understand why nobody else did the same.

Our problem was simple — we were developing Maps API, similar to Google Maps you might use, and that API was huge. And “one big file”.

Just for your information, a Maps API usually includes:

“Maps”. Dozens of services and tile engines.

“Graphics”. SVG, Canvas, VML, WebGL.

“Templates”. Like React or React-dom.

“Magic”. At least a bit of magic.

You don’t need 3/4 of this list while your site is loading. You don’t need 2/4 of this list… You just don’t need most of these things most of the time. Plus, we had to support IE6–7–8 alongside modern browsers, and didn’t want to ship polyfills and simplified templates to modern clients. Or the reverse.

The solution we found was straightforward and simple. It does not require any rocket science or computations on graphs. We just exposed all the dependencies to the frontend, and let the frontend calculate the list of modules it needs to load in order to get another list of modules, i.e. the ones required by top-level logic.

If module A depends on module B and React, one has to load modules A, B and React to get A.
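In code, the frontend part of this could look something like the sketch below. It is not the actual ym source, just an illustration of the idea; the `moduleMap` contents and the `resolve` helper are hypothetical.

```javascript
// Hypothetical module map exposed to the frontend:
// module name -> its direct dependencies.
const moduleMap = {
  A: ['B', 'React'],
  B: ['React'],
  C: ['B'],
  React: [],
};

// Collects everything that has to be loaded to get the requested modules,
// skipping whatever the page already has. Dependencies come before dependents.
function resolve(map, requested, alreadyLoaded = new Set()) {
  const result = [];
  const seen = new Set(alreadyLoaded);
  function visit(name) {
    if (seen.has(name)) return;
    seen.add(name);
    map[name].forEach((dep) => visit(dep));
    result.push(name);
  }
  requested.forEach((name) => visit(name));
  return result;
}

console.log(resolve(moduleMap, ['A'])); // ['React', 'B', 'A']

// Later on, module C is needed, and React, B and A are already on the page:
console.log(resolve(moduleMap, ['C'], new Set(['React', 'B', 'A']))); // ['C']
```

Because the client knows what it already has, nothing is ever downloaded twice, and nothing is downloaded “just in case”.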

This problem (it is not a problem) was solved by ym, or just Yandex Modules. It is just an AMD module system, similar to SystemJS. In fact it is not AMD but LMD (lazy), but that does not matter for this article.

['geoXml.util', '4e', '5!'],

['graphics.csg', '4f', '*_8!4T8H'],

['islets.traffic.layout.settings.slider.html', '4g', '1)1w14'],

['theme.islands.control.layout.routePanel.Button.css', '4h', '7u'],

['graphics.render.abstract.shape', '4k', ''],

['graphics.renderManager.canvasTile', '4l', '*X*18H4m4p'],

The exposed module map table, a small part.
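A row of this table can be unpacked with a few lines of code. Mind the assumption: I read each row as `[name, alias, dependencies]`, with the dependencies packed as a string of 2-symbol aliases glued together. That reading matches the rows above, but the real format may have had more to it, and the `parseRow` helper is invented for this sketch.

```javascript
// A few rows copied from the module map above.
const rows = [
  ['geoXml.util', '4e', '5!'],
  ['graphics.csg', '4f', '*_8!4T8H'],
  ['graphics.render.abstract.shape', '4k', ''],
];

// ASSUMPTION: the third column is a concatenation of 2-symbol aliases.
function parseRow([name, alias, packedDeps]) {
  const deps = packedDeps.match(/.{2}/g) || []; // '' has no dependencies
  return { name, alias, deps };
}

const parsed = rows.map(parseRow);
console.log(parsed[1].deps); // ['*_', '8!', '4T', '8H']
console.log(parsed[2].deps); // []
```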

The second problem was to actually load the required modules. Not as one bundle, because avoiding that was the whole goal. And not as small chunks, because they just don’t exist.

We did not find anything better than to just expose a “module combiner” to the internet. If you need A, B and React, just load “combine?module=A,B,React”, and it will concatenate the required modules and send them back. What could be easier?

This is a real URL. “Modules” are renamed into 2-symbol-length aliases.

The module system with perfect async chunking and zero duplication is complete. Just put a small proxy layer in front of it, and it will withstand any load.
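The combiner itself is about as boring as it sounds. Below is a rough sketch of both sides, not the real yms code: the in-memory `sources` storage, the `combineUrl` and `handleCombine` helpers and the `define(...)` bodies are all stand-ins for illustration.

```javascript
// Hypothetical storage of prebuilt modules: name (or alias) -> source text.
const sources = {
  A: 'define("A", ["B", "React"], function () { /* ... */ });',
  B: 'define("B", ["React"], function () { /* ... */ });',
  React: 'define("React", [], function () { /* ... */ });',
};

// Client side: turn the list of missing modules into a single request.
function combineUrl(modules) {
  return 'combine?module=' + modules.join(',');
}

// Server side: read the list back and concatenate the requested modules
// in the order the client asked for them.
function handleCombine(url, storage) {
  const query = url.slice(url.indexOf('?') + 1);
  const names = new URLSearchParams(query).get('module').split(',');
  return names.map((name) => storage[name]).join('\n');
}

const url = combineUrl(['React', 'B', 'A']);
console.log(url); // combine?module=React,B,A
console.log(handleCombine(url, sources).split('\n').length); // 3
```

The response depends only on the URL, so every combination the clients actually ask for can be cached by that proxy layer.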

Modules map creation was handled by ymb, while server and client code generation (aka “webpack-internals”) were handled by yms. Both packages were open-sourced a few years ago. But you can’t use them! 🙀!

Wat?

And here is the surprise – you may find that yms literally has ZERO documentation. And ymb’s README is also not an (understandable) thing.

They were not designed to be used by you. So nobody uses them in real life.

These tools do not support babel, source maps, long-term caching, TypeScript, CommonJS (that’s why “🙀”), and all that stuff which makes a world class bundler a world class bundler.

Even if ym/ymb/yms are handy things, they were never used outside the product we built. We never had “external” customers for our build tools. We never thought about other teams’ requirements.

I hope you got the point – don’t ever use them.

And I wrote all this just to let you better understand how things work.

How Parcel’s common chunking works?

PS: But I still can’t understand why everybody “bundles” static bundles, and nobody uses an “active” server side. Why not?