How Github became the web's largest font piracy site (and how to fix it)

Fonts, like other software, are available in two flavors: commercial, and free. Free fonts come with a license that allows you to use them, well, for free. Some licenses even allow you to edit the font and redistribute it under another name.

Commercial fonts, on the other hand, require you to buy a license. This license tells you what you can and can’t do with the font. Most notably, you can’t give it to somebody else unless they buy their own license, you buy one for them, or your license allows you to transfer it. If you want to use a font on your website, the license must allow web usage in the first place. You usually can’t just use a “desktop font” (bought with a license for use in applications like Photoshop or InDesign) in your web project. Webfont licenses are often priced by the number of visitors your site gets, and all of this is spelled out in the license you buy.

When a commercial font ends up where it shouldn’t, for instance on a huge, freely searchable database of code, that almost certainly breaks the terms of the license, and once downloaded from there, it becomes a pirated font.

Got Helvetica?

Let’s use the Github search API and see if we can find the most ubiquitous commercial font on the planet: Helvetica. And yep, more than 100,000 copies are findable on Github. (This link will only work when you’re logged into Github.)

A simple search shows that over 100,000 copies of Helvetica can be found on Github

We’re only searching for TTF or OTF files, not WOFF/WOFF2 or legacy formats like EOT and SVG (you’d get almost 150,000 hits otherwise). Granted, this also finds other versions of Helvetica, like Neue Helvetica (often bastardised as Helvetica Neue), or Helvetica in different weights, like helvetica-bold.ttf. But still: that’s over 100,000 files that shouldn’t be there.
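For the curious, here is a minimal sketch of how such a search could be expressed against Github's code search endpoint. It only builds the request URLs and does not send them. The `filename:` and `extension:` qualifiers come from Github's legacy code search syntax; whether the original search used exactly these qualifiers is an assumption, and the helper name `build_font_queries` is made up for illustration. Actually running the queries requires an authenticated request.

```python
from urllib.parse import urlencode

# Github's REST endpoint for code search (requests must be authenticated).
SEARCH_ENDPOINT = "https://api.github.com/search/code"


def build_font_queries(font_name, extensions=("ttf", "otf")):
    """Return one code-search URL per file extension for the given font name.

    Hypothetical helper: one query per extension, so the hit counts
    can be added up afterwards.
    """
    urls = []
    for ext in extensions:
        # Assumed query shape: match on file name, restrict to one extension.
        query = f"filename:{font_name} extension:{ext}"
        params = urlencode({"q": query, "per_page": 100})
        urls.append(f"{SEARCH_ENDPOINT}?{params}")
    return urls


urls = build_font_queries("helvetica")
for url in urls:
    print(url)
```

The JSON response to each request carries a `total_count` field, which is the kind of number quoted above.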

What else is on Github?

One of the biggest and best-established sellers of commercial fonts is MyFonts. Their collection currently contains over 33,000 font families, almost all of them commercial fonts, i.e. with licenses you have to pay for. They sell, for instance, about a dozen versions of Helvetica.

What if you search for MyFonts’ products on Github?

That’s exactly what I did. I skipped generic names that could result in false positives (names like Black, Latin or Text) and fed the rest to the Github search API.
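The preparation step can be sketched roughly like this. The three generic names are the examples mentioned above; the full skip list isn't given, so `GENERIC_NAMES` is only a stand-in, and the sample catalogue is invented for illustration.

```python
# Examples of generic names from the text; the complete skip list is an assumption.
GENERIC_NAMES = {"black", "latin", "text"}


def prepare_font_names(names, generic=GENERIC_NAMES):
    """Drop generic names and case-insensitive duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for name in names:
        cleaned = name.strip()
        key = cleaned.lower()
        # Skip empty entries, generic names and anything already collected.
        if not key or key in generic or key in seen:
            continue
        seen.add(key)
        result.append(cleaned)
    return result


# Hypothetical slice of a catalogue, not real MyFonts data.
catalogue = ["Helvetica", "Black", "Proxima Nova", "helvetica", "Text", "Museo"]
prepared = prepare_font_names(catalogue)
print(prepared)
```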

The result? Of the deduped list of 29,951 fonts, 7,617 were present on Github — that’s a quarter of the entire MyFonts collection. Of their fonts labeled “bestseller”, 39 out of 49 can be found on Github, as well as 28 of the 30 labeled “top webfont”.
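The shares behind those figures are quick to recompute. A small sketch, using only the counts quoted above (the function name `share` is arbitrary):

```python
def share(found, total):
    """Percentage of `total` that was found, rounded to one decimal place."""
    return round(100 * found / total, 1)


all_fonts = share(7617, 29951)   # deduplicated MyFonts list
bestsellers = share(39, 49)      # fonts labeled "bestseller"
top_webfonts = share(28, 30)     # fonts labeled "top webfont"
print(all_fonts, bestsellers, top_webfonts)
```

That works out to roughly 25.4% of the whole list, 79.6% of the bestsellers and 93.3% of the top webfonts.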

A total of 316,358 unique repositories had one or more of these fonts stored away on Github. And this is a very conservative number, as the Github API returns only the first 1,000 results per search. For Helvetica, that leaves more than 99,000 results whose repositories I couldn’t check.
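To make the counting concrete, here is a rough sketch of how unique repositories can be tallied from search results, and how many results stay out of reach because of the cap. It assumes each result item carries a `repository` object with a `full_name`, as Github's code search responses do; the sample items and repository names are invented.

```python
RESULT_CAP = 1000  # only the first 1,000 results of a search can be retrieved


def unique_repositories(items):
    """Collect the distinct repository names from code-search result items."""
    return {item["repository"]["full_name"] for item in items}


def unreachable_results(total_count, cap=RESULT_CAP):
    """Number of results that cannot be inspected because of the cap."""
    return max(0, total_count - cap)


# Made-up result items; real responses contain many more fields.
sample_items = [
    {"name": "helvetica.ttf", "repository": {"full_name": "alice/site"}},
    {"name": "helvetica-bold.ttf", "repository": {"full_name": "alice/site"}},
    {"name": "helvetica.otf", "repository": {"full_name": "bob/app"}},
]
repos = sorted(unique_repositories(sample_items))
hidden = unreachable_results(100194)  # Helvetica's result count
print(repos, hidden)
```

Two files in the same repository count once, which is why the repository total is lower than the sum of the hits, and why every capped search makes the total an undercount.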

Note that I only searched for OTF or TTF fonts. No forks, no branches other than master, and no older commits were searched. I also queried for the font name as presented on the MyFonts website, so renamed or obfuscated versions weren’t found.

Here’s a list of the most uploaded commercial fonts on Github, plus the number of results as of September 2017:

1. Helvetica (100,194)
2. Proxima Nova (67,810)
3. Myriad Pro (38,794)
4. Avenir (32,327)
5. Museo (31,825)
6. Lucida (27,225)
7. Futura (20,872)
8. Fraktur (18,908)
9. Nexa (7,071)
10. Courier (6,644)

But it’s not just MyFonts, of course. Even fonts from independent foundries end up on Github. TypeNetwork, a growing alliance of independent type designers from around the world, has roughly half of its collection on Github.

How did all these fonts end up on Github?

“Never attribute to malice that which is adequately explained by developers simply not realising the consequences of adding licensed fonts to their public repos”, a brand new saying goes.

You set up your new web project, plunk your webfonts alongside other assets like images and JavaScript libraries, and commit your stuff to a public repo on Github. Not realising that this violates the license of the commercial fonts, you have now made them available to anyone who can use Github’s search function. It happens, apparently.

Fixing the problem

You might be reading this and thinking “oh heck, I might’ve accidentally done this”. You can remove the offending files, commit the change and the fonts will be gone:

$ git rm assets/fonts/helvetica.otf

Well, not entirely gone: if you check out one of the commits before the deletion, the fonts will still be there. To permanently remove them from your repo, you can use git filter-branch, or the much more user-friendly tool BFG Repo-Cleaner. Note that BFG matches on the file name, not on the path within the repo:

$ bfg --delete-files helvetica.otf

This will remove the fonts from your repo — but it won’t do so for forks or clones. To remove them there, you could contact the owner or submit a pull request yourself.

Take heed: purging files like this will rewrite your repo’s history. Github has more thorough information on permanently deleting files.
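Before rewriting anything, it can help to know which font files ever showed up in your history. The following is a sketch, not a vetted tool: it asks `git log` for every file added in any commit on any ref and keeps the ones with a font extension. The extension list and both function names are my own choices, and paths with unusual characters (which git prints quoted) are not handled.

```python
import subprocess

# Assumed set of extensions worth flagging; adjust to taste.
FONT_EXTENSIONS = (".ttf", ".otf", ".woff", ".woff2", ".eot")


def font_paths(paths):
    """Filter an iterable of paths down to the distinct font files, sorted."""
    cleaned = (p.strip() for p in paths)
    return sorted({p for p in cleaned if p.lower().endswith(FONT_EXTENSIONS)})


def fonts_in_history(repo_dir="."):
    """List font files that were added in any commit reachable from any ref."""
    output = subprocess.run(
        ["git", "log", "--all", "--diff-filter=A", "--name-only", "--pretty=format:"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return font_paths(output.splitlines())


# The filter on its own, fed with lines as `git log --name-only` would print them.
example = font_paths([
    "assets/fonts/helvetica.otf",
    "",
    "README.md",
    "assets/fonts/helvetica.otf",
    "css/Museo.WOFF2",
])
print(example)
```

Calling `fonts_in_history(".")` inside a repository then gives you the list of files to purge, including ones that were deleted long ago.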

How to deal with commercial fonts, then?

So, what to do when you want to use commercial fonts and be legit about it?

Long story short, you can’t commit them to a public repository. There’s really no way around it: if you are not allowed to share the font with folks who don’t fall under its license, you can’t make it a part of your repo.

Your options are to go for a private repo, or keep a note in your public repo telling the folks who’ll be using your project to license the fonts and add ‘em themselves. Stick your assets/fonts directory in a .gitignore so you’ll never accidentally commit them, and Bob’s your uncle.
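If you'd rather script the .gitignore step, a small helper along these lines would do. It's a sketch under the assumption that your fonts live in `assets/fonts/`; the function name `ignore_fonts` is made up, and the demo runs in a temporary directory rather than a real repo.

```python
import tempfile
from pathlib import Path


def ignore_fonts(repo_dir, pattern="assets/fonts/"):
    """Add `pattern` to the repo's .gitignore unless it is already listed.

    Returns True if the file was changed, False if the pattern was present.
    """
    gitignore = Path(repo_dir) / ".gitignore"
    lines = gitignore.read_text().splitlines() if gitignore.exists() else []
    if pattern in (line.strip() for line in lines):
        return False
    lines.append(pattern)
    gitignore.write_text("\n".join(lines) + "\n")
    return True


# Demo in a throwaway directory standing in for a repository.
with tempfile.TemporaryDirectory() as demo_repo:
    first = ignore_fonts(demo_repo)    # pattern gets added
    second = ignore_fonts(demo_repo)   # already there, nothing to do
    contents = (Path(demo_repo) / ".gitignore").read_text()
print(first, second, repr(contents))
```

Keep in mind that .gitignore only stops untracked files from being added. Fonts that are already committed stay tracked until you remove them.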

Be cool

There are, of course, bigger sites out there that facilitate font piracy. The difference with Github is that most fonts ended up there because people didn’t realize they were doing something wrong. I have a hard time believing all those 33,000 developers went “har har, let’s upload some w4r3z!”

So be cool about it: check your repo and make sure there’s no font there that shouldn’t be there.
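A quick way to do that check on your working copy is a scan for font files, sketched below. The extension list is an assumption, the function name is invented, and the script can't tell a free font from a commercial one: it only tells you where to look.

```python
import tempfile
from pathlib import Path

# Assumed set of extensions worth flagging.
FONT_EXTENSIONS = {".ttf", ".otf", ".woff", ".woff2", ".eot"}


def find_font_files(root):
    """Return the font files below `root` as sorted relative paths, skipping .git."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if ".git" in relative.parts:
            continue  # git's own object store is not part of the working copy
        if path.is_file() and path.suffix.lower() in FONT_EXTENSIONS:
            hits.append(relative.as_posix())
    return sorted(hits)


# Demo on a throwaway directory with one font and one ordinary file.
with tempfile.TemporaryDirectory() as demo_repo:
    fonts_dir = Path(demo_repo) / "assets" / "fonts"
    fonts_dir.mkdir(parents=True)
    (fonts_dir / "helvetica.otf").write_bytes(b"")
    (Path(demo_repo) / "README.md").write_text("hello\n")
    found = find_font_files(demo_repo)
print(found)
```

Anything it reports is worth a look at the license that came with the font.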

Thanks to Indra Kupferschmid, Bram Stein and Stephen Coles for their feedback on earlier drafts of this article.