These results shouldn’t be shocking to anybody; I’m just experimenting with click-bait headlines, which I hear are all the rage in corporate-sponsored dev blogs.

Last time, we discussed some of the potential risks of injecting arbitrary third-party resources onto your website. To demonstrate how widely accepted that vector is, we ran an analysis of websites featured on ProductHunt going back a full year.

We ran a Python script that pulled the link for each product featured on ProductHunt from the API, and then invoked a PhantomJS shell to record every resource the page loaded, along with its origin domain and MIME type.
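A minimal sketch of what the tallying step might look like once the resources for a page have been recorded. It assumes each resource arrives as a dict with a `url` and a `mime_type` key; that record shape and the function names are hypothetical, not the format our script or PhantomJS actually emits.

```python
from collections import Counter
from urllib.parse import urlsplit


def origin_of(url):
    """Return scheme://host[:port] for a URL, or None for URLs with no
    network location (for example data: URIs)."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return None
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}"


def summarize_resources(page_url, resources):
    """Count resources per origin and per MIME type for a single page.

    `resources` is assumed to be a list of dicts shaped like
    {"url": ..., "mime_type": ...} (a hypothetical record format).
    """
    page_origin = origin_of(page_url)
    by_origin = Counter()
    by_mime = Counter()
    for res in resources:
        origin = origin_of(res["url"])
        if origin is None:
            continue  # skip inline resources that never hit the network
        by_origin[origin] += 1
        by_mime[res.get("mime_type") or "unknown"] += 1
    # Every origin other than the page's own counts as external.
    external = {o for o in by_origin if o != page_origin}
    return {
        "origins": by_origin,
        "mime_types": by_mime,
        "external_origins": external,
    }
```

The size of `external_origins` is the number to watch: each entry is another party whose content ends up running on or rendering in the page.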

Of course, there are plenty of secure use-cases for loading resources from external origins, in particular delivering assets via CDNs, leveraging internal and external APIs, and ad delivery / tracking. That being said, a large number of origins is a bit of a smell and the blind acceptance of the practice likely represents a real security risk.

We did our best to eliminate products that weren’t websites, and pruned domains that showed up more than 8 times, as most of these represented shared platforms of one type or another. Overall, we sampled 6,729 webpages.
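The pruning rule is easy to express with a counter. The sketch below assumes "domain" means the URL's hostname, which is a simplification (it does not collapse subdomains into a registrable domain), and the function name is made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

MAX_PAGES_PER_DOMAIN = 8  # the threshold quoted above


def prune_shared_platforms(urls, limit=MAX_PAGES_PER_DOMAIN):
    """Drop every URL whose hostname appears more than `limit` times.

    Hostnames that show up that often are assumed to be shared
    platforms (app stores, code hosts, and so on), not individual
    product sites.
    """
    hosts = [(urlsplit(u).hostname or "") for u in urls]
    counts = Counter(hosts)
    # Keep URLs with a usable hostname that stays at or under the limit.
    return [u for u, h in zip(urls, hosts) if h and counts[h] <= limit]
```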

Of these, slightly more than half were still sending information over the wire without HTTPS. This really isn’t the place to discuss why that’s a problem; I will point out that it’s a good opportunity for anybody along the network path (looking at you, skeevy coffee shops) to exercise full control over the page. At that point, they can instrument whatever scripts they want.
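For reference, a scheme breakdown like this one only needs the landing-page URLs. A rough sketch, assuming the list holds the URL each page actually ended up on (after any redirects); `scheme_shares` is a hypothetical helper name.

```python
from collections import Counter
from urllib.parse import urlsplit


def scheme_shares(landing_urls):
    """Return the fraction of landing pages served over each URL scheme."""
    counts = Counter(urlsplit(u).scheme.lower() for u in landing_urls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {scheme: n / total for scheme, n in counts.items()}
```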

Schemes used to load the landing page

When it comes to loading resources, ProductHunters seem to love doing it. The mean number of resources loaded was 101.4 (including images, JavaScript, CSS, XHR, and so on), with a median of 81. Every one of those resources is another request, so the sheer number mostly translates into more round-trips. There are plenty of justifiable reasons for high numbers here, but the general wisdom is that reducing them will improve performance. See Appendix A for the top 10 offenders.
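The summary numbers themselves are nothing fancy. A sketch using Python's standard `statistics` module, assuming a dict that maps each page URL to its resource count (the dict shape and function name are illustrative):

```python
from statistics import mean, median


def resource_count_stats(counts_by_page, top_n=10):
    """Summarize per-page resource counts.

    `counts_by_page` is assumed to map page URL -> number of resources
    loaded. Raises statistics.StatisticsError if it is empty.
    """
    counts = list(counts_by_page.values())
    # Heaviest pages first, for an "offenders" list.
    top = sorted(counts_by_page.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "mean": mean(counts),
        "median": median(counts),
        "top": top[:top_n],
    }
```

A mean sitting well above the median, as it does here, is what you would expect when a handful of very heavy pages pull the average up.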

Histogram of total number of resources loaded on the page

As far as JavaScript goes, we found a wide variety of MIME types being used to indicate it, so we filtered the resources down to just those types. The mean number of JavaScript files loaded was 24.6, with a median of 18. That’s a lot higher than I would have hoped, but it makes sense in a world where using many JavaScript files with fancy requirement management libs is fairly common. My preference is generally to reduce these round-trips by using monoliths, but I guess I’m just old-school. See Appendix B for the worst offenders.
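The filter can be sketched as a set lookup on the normalized MIME type. The set below is illustrative: it lists common and legacy JavaScript MIME types, not necessarily the exact list we matched against.

```python
# Illustrative set of MIME types seen in the wild for JavaScript.
JAVASCRIPT_MIME_TYPES = {
    "application/javascript",
    "application/x-javascript",
    "application/ecmascript",
    "text/javascript",
    "text/ecmascript",
    "text/x-javascript",
}


def is_javascript(mime_type):
    """Return True if a Content-Type value denotes JavaScript.

    Parameters such as "; charset=utf-8" are dropped and the comparison
    is case-insensitive.
    """
    if not mime_type:
        return False
    essence = mime_type.split(";", 1)[0].strip().lower()
    return essence in JAVASCRIPT_MIME_TYPES
```

Counting the resources on a page for which `is_javascript` returns true gives the per-page script count.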