Last week, we released our first annual State of Open Source Security report. One of its findings is that, in an analysis of around 433,000 sites, 77% used at least one front-end JavaScript library with a known security vulnerability. This number mirrors the one we reported back in March, but because Google Chrome’s Lighthouse now tests for vulnerable JavaScript libraries using Snyk, we can get much more thorough results.

Lighthouse data is collected as part of HTTP Archive, and the data is available for querying through BigQuery. As a result, we get to query Lighthouse audit data on a very large scale.

Looking at how many sites are vulnerable

The October 15th data (the most recent run available) on BigQuery contains data collected from 439,176 different URLs. After excluding URLs where Lighthouse was unable to run, or where the audit itself didn’t complete for whatever reason, we get a dataset of 418,112 different sites to query against.

The first question is how many of those sites carry known vulnerabilities. We can run a query against the reports to get that information:

SELECT
  JSON_EXTRACT_SCALAR(report, "$.audits.no-vulnerable-libraries.score") AS score,
  COUNT(0) AS volume
FROM [httparchive:har.latest_lighthouse_mobile]
WHERE report IS NOT NULL
GROUP BY score
HAVING score IS NOT NULL
ORDER BY score

The results are very much in line with our smaller scale study back in March: 77.3% (323,132) of those sites failed the audit. In other words, 77.3% of those sites contain at least one client-side JavaScript library with a known security vulnerability. The new version of the HTTP Archive site will report on how this changes over time.
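As a quick sanity check, the percentage can be reproduced from the two counts quoted above. This is a minimal sketch in Python; the variable and function names are illustrative and not part of the original analysis.

```python
# Counts quoted in the text for the October 15th run.
sites_audited = 418_112  # sites where the audit completed
sites_failed = 323_132   # sites that failed the no-vulnerable-libraries audit


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


failed_share = percent(sites_failed, sites_audited)
print(f"{failed_share}% of audited sites failed the audit")
```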

We can drill down even more to see how many known vulnerabilities are being carried by those libraries:

SELECT
  REGEXP_EXTRACT(
    JSON_EXTRACT_SCALAR(report, "$.audits.no-vulnerable-libraries.displayValue"),
    r'^\S*'
  ) AS knownVulnerabilities,
  COUNT(0) AS volume
FROM `httparchive.lighthouse.2017_10_15_mobile`
WHERE report IS NOT NULL
  AND JSON_EXTRACT_SCALAR(report, "$.audits.no-vulnerable-libraries.score") = 'false'
GROUP BY knownVulnerabilities
ORDER BY CAST(knownVulnerabilities AS INT64)

It turns out that if you carry at least one known vulnerability, you likely carry more: 51.8% of vulnerable sites carry more than one known security vulnerability. While the majority of those sites carry one or two, the long tail is scary: 9.2% of sites carry libraries with a combined four or more known security vulnerabilities.
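The query above takes the leading non-whitespace token of the audit's displayValue and casts it to an integer before grouping. The sketch below mirrors that step in Python and then computes the share of sites with more than one vulnerability. The sample strings are hypothetical (the exact wording Lighthouse emits is an assumption), so the printed share illustrates the calculation only and is not a result from the dataset.

```python
import re
from collections import Counter

# Hypothetical displayValue strings; the real wording is an assumption.
sample_display_values = [
    "1 vulnerability detected.",
    "3 vulnerabilities detected.",
    "1 vulnerability detected.",
    "5 vulnerabilities detected.",
]


def leading_count(display_value: str) -> int:
    # Mirrors REGEXP_EXTRACT(..., r'^\S*') followed by CAST(... AS INT64).
    token = re.match(r"^\S*", display_value).group(0)
    return int(token)


# Number of sites per vulnerability count (the GROUP BY in the query).
histogram = Counter(leading_count(v) for v in sample_display_values)

total = sum(histogram.values())
more_than_one = sum(sites for count, sites in histogram.items() if count > 1)
share_more_than_one = round(100 * more_than_one / total, 1)

print(dict(histogram), share_more_than_one)
```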

Which libraries are the most often found to be vulnerable

Using the Lighthouse audit data, we can also get an idea of which libraries are most commonly found to be vulnerable.

First, we can query to see which libraries are detected most often—whether they are vulnerable or not. The following query grabs the ten most commonly found libraries:

CREATE TEMPORARY FUNCTION getLibs(items STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  try {
    return items.match(/"name":"([^"]*)"/ig);
  } catch (e) {
    return [];
  }
""";

SELECT library, COUNT(0) Volume
FROM (
  SELECT getLibs(JSON_EXTRACT(report, "$.audits.no-vulnerable-libraries.extendedInfo.jsLibs")) AS libs
  FROM `httparchive.lighthouse.2017_10_15_mobile`
)
CROSS JOIN UNNEST(libs) AS library
GROUP BY library
ORDER BY Volume DESC
LIMIT 10

| Library | Number of times detected | Adoption % |
| --- | --- | --- |
| jQuery | 344,643 | 82.4% |
| jQuery UI | 83,075 | 19.9% |
| Modernizr | 63,122 | 15.1% |
| Bootstrap | 57,154 | 13.7% |
| yepnope | 41,537 | 9.9% |
| FlexSlider | 33,002 | 7.9% |
| Underscore | 17,633 | 4.2% |
| Google Maps | 14,312 | 3.4% |
| Moment.js | 14,038 | 3.4% |
| SWFObject | 13,521 | 3.2% |

Unsurprisingly, jQuery tops the list. This is right in line with what we saw back in March, and what you would probably expect. No library yet has come close to reaching jQuery’s universal appeal. One caveat here: React is currently being underreported. Once the updated detection script has been pulled into Lighthouse, its numbers will increase (and the overall percentage of vulnerable sites will likely increase slightly as well).
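For readers who want to see what the getLibs helper in the query above extracts, here is a rough Python analogue. The jsLibs payload is hypothetical; its layout is inferred only from the regular expression in the query. One detail worth knowing: JavaScript's String.prototype.match with the g flag returns whole matches rather than capture groups, so the raw `library` values coming out of the query presumably look like `"name":"jQuery"` and would need trimming to get the clean names shown in the table.

```python
import json
import re
from typing import List

# Hypothetical payload; the field layout is assumed from the query's regex.
js_libs_json = json.dumps(
    [
        {"name": "jQuery", "version": "1.12.4"},
        {"name": "Modernizr", "version": "2.8.3"},
    ],
    separators=(",", ":"),  # compact output, no spaces after separators
)

NAME_PATTERN = re.compile(r'"name":"([^"]*)"', re.IGNORECASE)


def full_matches(items: str) -> List[str]:
    # Whole matches, like JavaScript's match() with the g flag.
    return [m.group(0) for m in NAME_PATTERN.finditer(items)]


def library_names(items: str) -> List[str]:
    # Capture group only, i.e. the bare library names.
    return NAME_PATTERN.findall(items)


print(full_matches(js_libs_json))
print(library_names(js_libs_json))
```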

Now, let’s change it up and look at which libraries are found to be carrying known vulnerabilities.

CREATE TEMPORARY FUNCTION getLibs(items STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  try {
    return items.match(/"name":"([^"]*)"/ig);
  } catch (e) {
    return [];
  }
""";

SELECT library, COUNT(0) Volume
FROM (
  SELECT getLibs(JSON_EXTRACT(report, "$.audits.no-vulnerable-libraries.extendedInfo.vulnerabilities")) AS libs
  FROM `httparchive.lighthouse.2017_10_15_mobile`
)
CROSS JOIN UNNEST(libs) AS library
GROUP BY library
ORDER BY Volume DESC
LIMIT 10


The top two names on this list are the same as on the previous one.

| Library | Number of times found vulnerable | % of all instances of this lib detected |
| --- | --- | --- |
| jQuery | 318,786 | 92.5% |
| jQuery UI | 74,486 | 89.7% |
| Moment.js | 10,245 | 73.0% |
| AngularJS | 7,609 | 84.8% |
| Handlebars | 3,129 | 60.7% |
| Mustache | 1,925 | 51.0% |
| YUI 3 | 559 | 40.3% |
| jQuery Mobile | 413 | 3.7% |
| Knockout | 407 | 19.6% |
| React | 181 | 10.2% |
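The percentage column is the number of vulnerable instances divided by the total number of times the library was detected. The sketch below reproduces it for the three libraries that appear in both tables of this post; the dictionary and function names are illustrative only.

```python
# Counts copied from the two tables in this post (libraries present in both).
detected = {"jQuery": 344_643, "jQuery UI": 83_075, "Moment.js": 14_038}
vulnerable = {"jQuery": 318_786, "jQuery UI": 74_486, "Moment.js": 10_245}


def vulnerable_share(library: str) -> float:
    """Percentage of detected instances of a library found vulnerable."""
    return round(100 * vulnerable[library] / detected[library], 1)


shares = {name: vulnerable_share(name) for name in detected}
for name, share in shares.items():
    print(f"{name}: {share}%")
```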

The percentages don’t paint a rosy picture. jQuery is by far the most popular library on the web, and 92.5% of the jQuery instances found in production carry a known security vulnerability. In fact, of the ten libraries most commonly found to be carrying a known vulnerability, six are vulnerable in the majority of instances found in production.

This is the case despite the fact that every one of the libraries on this list has versions available that do not carry these vulnerabilities.

| Library | Oldest Version with No Known Vulnerabilities | Release Date |
| --- | --- | --- |
| jQuery | 3.0.0 | June, 2016 |
| jQuery UI | 1.10.0 | January, 2013 |
| Moment.js | 2.15.2 | October, 2016 |
| AngularJS | 1.6.1 | December, 2016 |
| Handlebars | 4.0.0 | September, 2015 |
| Mustache | 2.2.1 | December, 2015 |
| YUI 3 | 3.10.3 | June, 2016 |
| jQuery Mobile | 1.2.0 | October, 2012 |
| Knockout | 3.0.0 | October, 2013 |
| React | 0.14.0 | October, 2015 |

Each of the front-end libraries most commonly found to be vulnerable has had a version free of known vulnerabilities available for anywhere from roughly one to five years. The reality is that front-end libraries and frameworks often don’t get updated after they hit production.
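To make the time gap concrete, the sketch below counts how many months a safe version had been available at the time of the October 15th run, for three rows of the table above. The table only gives release months, so the day of the month used here is an assumption and is ignored by the calculation.

```python
from datetime import date

REPORT_DATE = date(2017, 10, 15)  # date of the HTTP Archive run analysed here

# (library, oldest version with no known vulnerabilities, release month).
# Day of month is a placeholder; the table only lists month and year.
safe_releases = [
    ("jQuery", "3.0.0", date(2016, 6, 1)),
    ("AngularJS", "1.6.1", date(2016, 12, 1)),
    ("jQuery Mobile", "1.2.0", date(2012, 10, 1)),
]


def months_available(released: date, as_of: date = REPORT_DATE) -> int:
    """Whole calendar months between the release month and the report month."""
    return (as_of.year - released.year) * 12 + (as_of.month - released.month)


for name, version, released in safe_releases:
    print(f"{name} {version}: available for {months_available(released)} months")
```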

Reason for Hope

The picture is a bit grim right now—there’s no way to deny it. While this data doesn’t mean that all 77% of these sites are exploitable (it’s possible they could be avoiding the vulnerable methods), that’s a small consolation. That’s 77% of sites that are one developer making one method call away from being vulnerable. As we’ve seen in 2017, open-source vulnerabilities need to be taken very seriously.

But there’s also a bright side. While there are a large number of vulnerabilities in production, those vulnerabilities have been addressed in the libraries themselves. Each of the major libraries has versions available that are free of known security vulnerabilities—we just need to get them into production.

To get to a better situation, we need a few things to happen. The first is improved tooling and tooling adoption. According to our State of Open Source Security survey, 38% of people using open source don’t use any sort of automated tools to help keep their packages up to date. I am willing to wager that if you were to look specifically at front-end JavaScript usage, you would see even lower adoption.

That number should improve. Improvements to npm and Yarn have made front-end package management much simpler for developers. Pairing a solid package management workflow with tools, like Snyk, that help you find, prevent, fix and monitor vulnerabilities in those packages will go a long way towards making the web more secure.

The second thing we need is an increase in the general awareness and understanding of the problem. It’s why we published the State of Open Source Security report: to shed light on the challenges faced in securing open source and help find ways we can improve.

Having the vulnerable libraries audit in Lighthouse (and Sonar) also helps. These tools make it much easier for developers to spot issues on the sites they build. And thanks to the HTTP Archive and BigQuery, we have easy-to-access data to help us see how the problem scales.

While the data right now isn’t encouraging, improved awareness and improved tooling make this a solvable problem for the future.