Smaller files download faster, so shrinking an asset before sending it to a client is a good thing to do.

Actually, it’s not just a good thing to do: minification and compression are something a modern developer is expected to do. But minifiers are not perfect, and compressors can perform better or worse depending on the data they compress. There are some tricks and patterns to turn these tools up to eleven. Interested? Let’s dive in!

Getting Started

We’ll use a simple SVG file as an example:

An <svg> image with two 6×6 squares (<rect>) inside a 10×10 pixel area (viewBox). 176 bytes raw, 138 bytes gzipped.

Yup, it’s not a piece of fine art. But it’s enough to cover the topic without turning this Medium post into a scientific paper.
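The raw and gzipped sizes quoted throughout this post can be measured with a few lines of Python. This is a minimal sketch: the SVG string below is a hypothetical stand-in matching the description above, not the exact file from the post, so its byte counts will not necessarily match the numbers quoted here. The command-line gzip tool may also report a slightly different size because it can store the file name in the header.

```python
import gzip


def sizes(data: bytes) -> tuple[int, int]:
    """Return (raw size, gzipped size) in bytes."""
    return len(data), len(gzip.compress(data, compresslevel=9))


# Hypothetical stand-in for the example image: two 6x6 squares in a 10x10 viewBox.
svg = (
    b'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10">'
    b'<rect width="6" height="6" fill="#f00"/>'
    b'<rect x="4" y="4" width="6" height="6" fill="#00f"/>'
    b'</svg>'
)

raw, gz = sizes(svg)
print(f"{raw} bytes raw, {gz} bytes gzipped")
```

Calling sizes() again after each step is an easy way to check whether a change actually helped.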

Step 0: Svgo

Running svgo image.svg instantly improves the compression.

(Carriage returns and indentations are added for readability)

Most notably, the rect elements were replaced with path elements. A path shape is defined by its d attribute, a sequence of commands that move a virtual pen, much like canvas drawing methods. Commands can be absolute (move to x, y) or relative (move by x, y). Let’s take a closer look at one of the paths:

M 0 0 : start at (0, 0)

h 6 : move horizontally by 6 px right

v 6 : move vertically by 6 px down

H 0 : move horizontally to x = 0

z : close the path, returning to the point where it started

Quite an elaborate way to draw a square! But it’s a more compact representation than a rect element.
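To make the pen metaphor concrete, here is a rough sketch of a tracer for the handful of commands used above. The function name trace_path is made up for this illustration, and the code assumes space-separated tokens and only the M, m, H, h, V, v and z commands; a real SVG path parser handles far more.

```python
def trace_path(d: str) -> list[tuple[float, float]]:
    """Follow the pen through a tiny subset of SVG path commands."""
    x, y = 0.0, 0.0          # the pen starts at the origin
    start = (x, y)           # where the current subpath began
    points = []
    tokens = d.split()
    i = 0
    while i < len(tokens):
        cmd = tokens[i]
        i += 1
        if cmd in ("M", "m"):
            dx, dy = float(tokens[i]), float(tokens[i + 1])
            i += 2
            # Uppercase is absolute, lowercase is relative to the pen.
            x, y = (dx, dy) if cmd == "M" else (x + dx, y + dy)
            start = (x, y)
        elif cmd in ("H", "h"):
            value = float(tokens[i])
            i += 1
            x = value if cmd == "H" else x + value
        elif cmd in ("V", "v"):
            value = float(tokens[i])
            i += 1
            y = value if cmd == "V" else y + value
        elif cmd in ("Z", "z"):
            x, y = start     # close path: back to the subpath start
        else:
            raise ValueError(f"unsupported command: {cmd}")
        points.append((x, y))
    return points


print(trace_path("M 0 0 h 6 v 6 H 0 z"))
# [(0.0, 0.0), (6.0, 0.0), (6.0, 6.0), (0.0, 6.0), (0.0, 0.0)]
```

The pen visits the four corners of the square and returns to where it started.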

The other change is that #f00 became red. One byte less, yay!

The file is now 135 bytes raw, 126 bytes gzipped.

Step 1: Scale Everything

You might have noticed all the coordinates in both paths are even. What if we divide each coordinate by two?

The image still looks the same, only at half the size. Now we can scale the viewBox down by the same factor and the image looks correct again.

133 bytes raw, 124 bytes gzipped.
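Dividing every coordinate by hand gets tedious, so here is a sketch of doing it with a regular expression. The helper name scale_numbers and the path string are illustrative assumptions, not svgo output copied from the post. The approach only works for simple paths like these: it would mangle arc flags and numbers written in exponent notation.

```python
import re


def scale_numbers(text: str, factor: float) -> str:
    """Multiply every plain decimal number in text by factor."""
    def repl(match: re.Match) -> str:
        # The "g" format drops trailing zeros: 3.0 -> "3", 1.5 -> "1.5".
        return f"{float(match.group()) * factor:g}"

    return re.sub(r"-?\d*\.?\d+", repl, text)


# Illustrative path for a 6x6 square at (4, 4), scaled to half size.
print(scale_numbers("M4 4h6v6H4z", 0.5))   # M2 2h3v3H2z
# The viewBox has to shrink by the same factor.
print(scale_numbers("0 0 10 10", 0.5))     # 0 0 5 5
```

Absolute and relative commands both scale linearly, which is why the same factor can be applied to every number.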

Step 2: Unclosed paths

Back to the paths. The last command in both paths is z, “close path”. But a path is implicitly closed when it is filled, so we can simply remove those commands.

That’s 2 raw bytes fewer: the file is now 131 bytes raw, 122 bytes gzipped. Fewer raw bytes make for fewer compressed bytes, which seems legit. And we’ve already saved 4 gzipped bytes on top of what svgo gave us.

You might wonder why svgo doesn’t make these optimizations automatically. The reason is that scaling an image and removing the trailing z commands are unsafe in general. Here, take a look:

Various versions of the image with the stroke applied. Left to right: original, unclosed, unclosed & scaled.

The strokes are all messed up. We know we’re not going to use strokes, but svgo cannot know that, so it has to play it safe and avoid potentially unsafe transformations.
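The difference between filling and stroking an unclosed path can be sketched by counting edges. The helper names below are made up for this illustration, and the model is deliberately simplified: it treats a shape as a list of corner points and ignores line joins and stroke width, which is what scaling affects.

```python
Point = tuple[int, int]


def stroked_segments(points: list[Point], closed: bool) -> list[tuple[Point, Point]]:
    """Segments a stroke would draw: only the ones the path spells out."""
    segments = list(zip(points, points[1:]))
    if closed and points[0] != points[-1]:
        # A trailing z adds the segment back to the starting point.
        segments.append((points[-1], points[0]))
    return segments


def filled_edges(points: list[Point]) -> list[tuple[Point, Point]]:
    """Filling always treats the shape as closed, with or without z."""
    return stroked_segments(points, closed=True)


square = [(0, 0), (3, 0), (3, 3), (0, 3)]
print(len(filled_edges(square)))                    # 4
print(len(stroked_segments(square, closed=True)))   # 4
print(len(stroked_segments(square, closed=False)))  # 3
```

Without z the fill still has all four edges, but the stroke loses one side of the square.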

Looks like there’s nothing else to remove from the code. The XML syntax is strict: all the attributes are required and their values cannot be left unquoted.

Is that all? Oh, no, it’s just the beginning.

Step 3: Reducing the Alphabet

Now it’s time to introduce a very handy tool, gzthermal. It analyzes the gzipped file and colors the raw bytes depending on how many bits are used to encode them. Well-compressed data is green, poorly compressed data is red. It’s that simple.

Let’s take a look at the d attributes again, particularly at the M commands: they are marked red and deserve our attention. No, we cannot delete them, but we can turn one into a relative command: m2 2.

The initial “cursor” position is the axis origin, (0, 0), so there’s no difference between moving to (2, 2) and moving by (2, 2) from the origin. So, let’s try that.
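Since this step is about the alphabet, it helps to look at the set of distinct characters before and after the change. The markup below is a shortened, hypothetical stand-in for the real file; the point is only that a lowercase m already occurs elsewhere (in xmlns), so dropping the uppercase M removes one symbol without adding a new one.

```python
def alphabet(text: str) -> list[str]:
    """Distinct characters used in text, sorted."""
    return sorted(set(text))


# Hypothetical, shortened markup: not the exact file from this post.
before = '<svg xmlns="http://www.w3.org/2000/svg"><path d="M2 2h3v3H2"/></svg>'
after = before.replace('d="M', 'd="m')

print(len(alphabet(before)), len(alphabet(after)))
print(set(alphabet(before)) - set(alphabet(after)))  # {'M'}
```

The file length stays the same, but the compressor has one symbol fewer to describe.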

Still 131 bytes raw, but 121 bytes gzipped. Whoa! What just happened? The answer is…

Huffman Trees

Gzip is powered by the DEFLATE algorithm, and DEFLATE is built on top of Huffman trees.

The core idea of Huffman coding is that more frequent symbols are encoded with fewer bits, and vice versa, less frequent symbols need more bits.

Yes, bits, not bytes: DEFLATE treats a string of bytes just as a sequence of bits, and if there were 7, or 9, or 100 bits in a byte, DEFLATE would work just the same.

As an example, we’ll take the string Test and construct the codes from its alphabet:

00 T

01 e

10 s

11 t

Now, to encode the string Test, we just write out the bits for each character: 00011011, 8 bits.

Now let’s make the initial letter T lowercase, test, and try again:

0 t

10 e

11 s
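The code lengths in both tables can be reproduced with a textbook Huffman construction. This is a sketch of the general idea, not DEFLATE’s actual encoder: it only computes how many bits each symbol gets and ignores the cost of storing the code table itself. The function names are made up for this illustration.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(text: str) -> dict[str, int]:
    """Map each symbol in text to the bit length of its Huffman code."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A lone symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    tiebreak = count()  # keeps heap entries comparable when counts are equal
    heap = [(n, next(tiebreak), {sym: 0}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two rarest subtrees; every symbol inside gets one bit deeper.
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**left, **right}.items()}
        heapq.heappush(heap, (n1 + n2, next(tiebreak), merged))
    return heap[0][2]


def encoded_bits(text: str) -> int:
    """Total number of bits needed to encode text with its own Huffman code."""
    lengths = huffman_code_lengths(text)
    return sum(lengths[ch] for ch in text)


print(huffman_code_lengths("Test"), encoded_bits("Test"))
print(huffman_code_lengths("test"), encoded_bits("test"))
```

For Test every symbol gets 2 bits, 8 bits in total; for test the repeated t gets a 1-bit code and the total drops to 6 bits.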