What’s really wrong with node_modules and why this is your fault


I was never really concerned about the size of node_modules. My thinking was that you should not care too much about the tools you need to do the job. If you need a 20 kg hammer to drive a nail, you just take it. The same goes for node_modules: it may weigh a few kilobytes or a few megabytes because our imaginary hammer comes with a set of heavy nails, right? Well, maybe, in theory.

Let’s drop that fancy analogy and look at real-world examples. I will examine some @angular/cli dependencies, but only because it’s quite a big library. I don’t want to make it look bad; it’s just a good representation of an average package. I installed it in an empty directory using npm@5.5.1. After installation npm reported “added 976 packages in 107.13s” (that’s 141 megabytes on disk).

Okay, so the cli package is quite a robust library and its list of dependencies is a bit too long, but perhaps all of them are needed. Let’s focus on the first selected package, common-tags. A quick look at its documentation tells you that it’s some kind of utils library with a number of common methods for working with text. So far so good: general methods, easy to reuse.

Just one little flaw: among the deps of common-tags we see babel-runtime. A bit surprising, since we just need some common text functions, but hey, it’s 2017 JS for you. Oh wait, it turns out that babel-runtime wants core-js and regenerator-runtime. Fortunately it ends here and, what’s more, core-js is also a utils library, quite a big one honestly! It has so many functions inside, I bet a lot of other packages will be using it!

Not really. Only babel-runtime has it in its deps. Oopsie.

And returning to the starting point, cli uses only 3 (trivial) methods from common-tags: stripIndents, stripIndent and oneLine. Oopsie daisy.
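To show how small these three helpers are, here is a rough sketch of what they do. These are simplified stand-ins of my own, not the common-tags implementation: the real ones are template tags and handle more edge cases, while these just take a plain string.

```javascript
// Simplified stand-ins for the three helpers; NOT the common-tags source.
// Assumption: each takes a plain string instead of acting as a template tag.

// Collapse every line break (and the whitespace around it) into one space.
function oneLine(text) {
  return text.replace(/\s*\n\s*/g, ' ').trim();
}

// Remove the smallest common indentation shared by all non-empty lines.
function stripIndent(text) {
  const lines = text.split('\n');
  const indents = lines
    .filter((line) => line.trim().length > 0)
    .map((line) => line.match(/^[ \t]*/)[0].length);
  const min = indents.length > 0 ? Math.min(...indents) : 0;
  return lines.map((line) => line.slice(min)).join('\n').trim();
}

// Remove all leading indentation from every line.
function stripIndents(text) {
  return text.replace(/^[ \t]+/gm, '').trim();
}
```

A dozen or so lines of regular expressions, give or take the edge cases.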

In order to use these 3 methods, node_modules needs 1826 files. And that’s just 4 of the 976 installed packages mentioned above.
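If you want to check the file count in your own project, a small Node script is enough. This is a minimal sketch, not the exact method used for the numbers above; `countFiles` is a name I made up, and it skips symlinks so that linked packages are not counted twice.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively count regular files under `dir`.
// Symlinks are neither followed nor counted.
function countFiles(dir) {
  let count = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      count += countFiles(full);
    } else if (entry.isFile()) {
      count += 1;
    }
  }
  return count;
}

// Usage, assuming a node_modules directory exists in the current folder:
// console.log(countFiles(path.join(process.cwd(), 'node_modules')));
```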

This is your dream about lightweight package collapsing

The next dependency is core-object. It pulled in a total of 8 packages and 45 files, so not so bad. And other packages use these files too, mostly chalk.

The real bummer is that 6 of these 8 are dependencies of chalk, and chalk is used only once in core-object, to paint a deprecation message yellow.
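For a sense of scale, painting one message yellow can be done with a single ANSI escape sequence. The sketch below is my own illustration, not code from core-object or chalk; `yellow` and `deprecate` are hypothetical names, and unlike chalk it does not check whether the terminal supports colors.

```javascript
// ANSI SGR code 33 sets a yellow foreground, 39 resets to the default one.
function yellow(text) {
  return `\u001b[33m${text}\u001b[39m`;
}

// Hypothetical helper: print a deprecation warning in yellow to stderr.
function deprecate(message) {
  console.warn(yellow(`DEPRECATION: ${message}`));
}
```

Color detection is the part a library like chalk adds on top, which is a fair reason to use it, but perhaps not for a single warning.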

Other random findings:

a few packages to tackle the topic of “querystring”

some attempts at assert methods, varying in complexity from minimalistic-assert to assert-plus

dozens of various is-* packages

a lot of packages to “prettifyfifafiying-whatever” errors and console prints

hundreds of polyfills/shims or reimplementations of native methods

of course the previous asserts sit in node_modules next to full lodash

… and some partial methods from lodash as separate deps

And by random, I mean just picking some packages and briefly searching for similar ones, which was often really easy because the packages had similar names.

Some of the found duplicates

Let’s stop here. I bet the other dependencies are necessary and well thought out.

That is nothing new for JS devs. It has been like that for a while now, and this situation shouldn’t be accepted: the size of node_modules is the butt of jokes, and the removal of packages like “left pad” is the cause of disasters.

So how can it be fixed? By creating the proper Standard Library.

Proper means that it should be complete, containing a variety of common functions to operate on text, numbers, collections and more, so that 99.99% of the time you won’t need any other library. Is that fantasy? I don’t think so. Taking some of the mentioned utils libraries as the base and merging them with others would be a great start.

There is only one problem, and it’s not even a technical one: creating such a package would need someone to take the position of leader. I am thinking of absolute power rather than democracy. If you want to know how democracy handles this problem, look again at your node_modules. It needs a powerful leader because it needs a solid plan, not just months of discussions. We already have all the packages implemented; we just need to glue them together in a logical way.

Right now, in pursuit of reusability and “keeping it DRY”, the typical node_modules directory has ended up completely WET. Just because we thought that dozens of packages with overlapping functionality were better than one well-planned library.

Which is the bigger number, five or one? One army, a real army, united behind one leader with one purpose.

Just a small note about jQuery — not so long ago, jQuery was in almost every project. Why? There were a number of solid reasons:

It provided a set of commonly used functions in a convenient form. jQuery methods were easily chainable with each other (a result of being developed by one organization).

It was widely known by everyone, so joining a project was easy, as there was no extra learning curve.

Although some people complained about the size of jQuery, it was mostly irrelevant as it was often loaded from a CDN, so 90% of the time it was already present on the user’s computer.

The last point is very important: it really doesn’t matter how big a library is if you don’t need to download it. I am thinking of lodash, which really has tons of functions that can replace a lot of unnecessary dependencies. With its modular structure and lack of dependencies, it is a great library to choose if you need something that is missing in JavaScript.

Picking some smaller library only because of its size is the same kind of evil as premature optimization: you end up with no performance gain and confusing code. Same here: smaller, unorganized packages lead to redundancy, incompatibility and an overall much bigger node_modules.

The vertical axis shows size in KB, the horizontal axis individual packages

At the beginning I said that I don’t care about the size of node_modules, and this is partly true: I don’t care so much about the space it takes, but I do care about the number of files. In the case of @angular/cli, almost 70% of the disk space is taken by the 20 biggest packages (the Pareto principle works everywhere!).
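The “share taken by the biggest packages” figure is easy to compute once you have a list of per-package sizes. Below is a minimal sketch; `topShare` is a name I made up, and the sizes in the example are invented for illustration, not measurements from @angular/cli.

```javascript
// Fraction of the total size taken by the n largest entries.
function topShare(sizes, n) {
  const sorted = [...sizes].sort((a, b) => b - a);
  const total = sorted.reduce((sum, size) => sum + size, 0);
  if (total === 0) return 0;
  const top = sorted.slice(0, n).reduce((sum, size) => sum + size, 0);
  return top / total;
}

// Hypothetical package sizes in KB, made up for this example only.
const sizesKb = [5000, 3000, 1000, 500, 500];
console.log(topShare(sizesKb, 2)); // 0.8
```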

What’s more, if you take a look at WinDirStat’s report below, you will see that a lot of packages contain big files (for example sourcemaps, shown in green). And when it comes to copying files, computers handle a few big files better than thousands of tiny ones.

The node_modules report from WinDirStat. Packages in the right bottom corner can have only fractions of Planck length

A large number of packages has one more drawback: potential version incompatibility. Let’s assume each package has two versions, 1.0 and 2.0. In the worst scenario some packages may need 1.0 and the others 2.0. The more packages there are in our app, the more possible combinations you get. Resolving those combinations takes time and CPU, plus space to keep every copy of the old version needed by some equally old package.
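To put a number on “more possible combinations”: if every package can independently be required in one of two versions, the count of possible version assignments doubles with each package. The sketch below only computes that upper bound under this simplifying assumption; it says nothing about how npm actually resolves versions, and `versionCombinations` is a hypothetical name.

```javascript
// Upper bound on version assignments, assuming each package independently
// ends up in one of `versionsPerPackage` versions. BigInt keeps it exact.
function versionCombinations(packageCount, versionsPerPackage = 2) {
  return BigInt(versionsPerPackage) ** BigInt(packageCount);
}

console.log(versionCombinations(10)); // 1024n
```

With a few hundred packages the result is far too large to enumerate, which is why the worst case hurts.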

Epilogue

This is the world we live in. The worst thing is that “it kinda works”, so it’s not going to change anytime soon. Creating a JS Standard Library would help, but the real change is needed in the developers’ mindset.

So if you can remember one thing from my looong article, let it be “use lodash”. And if you can, also take away “use popular packages that are already used by others”. Programming is not an Individuality Contest, so choose tested, war-seasoned libraries and do not increase JavaScript entropy.

Mandatory joke, I almost forgot about it
