Introduction

This article looks to answer the question of how widely adopted ‘security.txt’ has become, 3 years on from when it was first drafted.

What is security.txt?

In short, “security.txt” is similar to robots.txt but intended to be read by humans wishing to contact a website’s owner about security issues. (Wikipedia)

The summary on securitytxt.org, the website that promotes the proposed standard, reads:

“When security risks in web services are discovered by independent security researchers who understand the severity of the risk, they often lack the channels to disclose them properly. As a result, security issues may be left unreported. security.txt defines a standard to help organizations define the process for security researchers to disclose security vulnerabilities securely.”

The idea is simple: instead of looking for a bug bounty program on dozens of platforms, searching for or guessing an email address, hunting for a page describing the disclosure policy, etc., just check /.well-known/security.txt, which will point you to the right place and tell you who to contact.

Security.txt is not yet an RFC but is currently an Internet draft and there is a public repository if anyone wants to contribute.

Technically, the file should be placed under /.well-known/security.txt or, if that is not possible or as a fallback, under /security.txt. Why under /.well-known/? To avoid polluting the root directory of the web server with tons of files. The security.txt file should also be served over HTTPS with a Content-Type of text/plain.
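The location and serving rules above can be sketched in a few lines of Python. This is only an illustration: the function names candidate_urls and is_well_served are hypothetical and do not come from the draft or from the study's repository.

```python
from urllib.parse import urlsplit

# The two locations named in the text, preferred location first.
SECURITY_TXT_PATHS = ("/.well-known/security.txt", "/security.txt")


def candidate_urls(domain):
    """Return the HTTPS URLs to try for a domain, preferred location first."""
    return [f"https://{domain}{path}" for path in SECURITY_TXT_PATHS]


def is_well_served(url, content_type):
    """Check the two serving recommendations: HTTPS and text/plain."""
    scheme = urlsplit(url).scheme
    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return scheme == "https" and media_type == "text/plain"
```

For example, candidate_urls("example.com") yields the /.well-known/ URL first and the root-level fallback second, and is_well_served rejects a file served over plain HTTP or as text/html.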

The first Internet draft was submitted by Edwin Foudil (EdOverflow) in 2017; since then, a total of 9 drafts have been issued.

On the 1st of April 2020, the video maker LiveOverflow published a video about security.txt with an April Fools' joke claiming he was its creator, because people constantly confuse EdOverflow with LiveOverflow.

Study

Examination of the adoption of security.txt by the largest websites in the world.

It’s impossible to know how many security vulnerabilities would have been reported if researchers had had a simple, incentivized mechanism for doing so. Traditionally, researchers have reached out to generic email addresses, where they are frequently ignored. The fact that Contact is the most used field in security.txt suggests that it is the most important component for both researchers and companies.

Prerequisites

To study the adoption of security.txt among the most visited websites, I downloaded the Top 1 Million sites list, available either from Alexa or from Statvoo.

$ wget http://s3.amazonaws.com/alexa-static/top-1m.csv.zip

$ wget https://statvoo.com/dl/top-1million-sites.csv.zip



$ unzip top-1m.csv.zip

$ unzip top-1million-sites.csv.zip

Note: the list name suggests that it contains 1M entries, but it in fact contains only ~700k (704 069) entries.
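To work with only the first N entries of such a list, a small reader is enough. The sketch below assumes each row has the form rank,domain with no header line (the same layout as the ranking shown later in this article); the function name top_domains is hypothetical.

```python
import csv
import io
from itertools import islice


def top_domains(csv_file, limit):
    """Yield (rank, domain) pairs for the first `limit` rows of the list."""
    reader = csv.reader(csv_file)
    for row in islice(reader, limit):
        # Assumed row layout: rank,domain
        rank, domain = row[0], row[1]
        yield int(rank), domain.strip().lower()


# Tiny in-memory stand-in for the real file.
sample = io.StringIO("1,google.com\n4,facebook.com\n31,google.com.hk\n")
print(list(top_domains(sample, 2)))  # [(1, 'google.com'), (4, 'facebook.com')]
```

With the unzipped file on disk, the same function could be fed an open file handle, for instance open("top-1m.csv", newline=""), assuming that is the name of the extracted file.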

What can we look for and how?

Note: I made a side repository that contains the scripts used for the study.

The idea is simple: fetch security.txt from the top 1000 most visited websites in the world, either from /.well-known/security.txt or from /security.txt. From there, I will document who uses it and how (looking at the fields used).

So we will try to collect the data below for each website:

{
  "security.txt": true,
  "acknowledgments": true,
  "canonical": true,
  "contact": true,
  "encryption": true,
  "expires": true,
  "hiring": true,
  "policy": true,
  "preferred-languages": true,
  "signed": true
}

This is implemented in the fetch.rb script, which you can check in the repository mentioned earlier.
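To make the record above concrete, here is a minimal Python sketch of how the text of a security.txt file could be turned into that structure. It is not the fetch.rb script: the function name summarize and the parsing rules (case-insensitive field names, "#" comments skipped, a field counted only when it has a value) are assumptions made for illustration.

```python
FIELDS = (
    "acknowledgments", "canonical", "contact", "encryption",
    "expires", "hiring", "policy", "preferred-languages",
)
PGP_HEADER = "-----BEGIN PGP SIGNED MESSAGE-----"


def summarize(body):
    """Build the per-site record; `body` is the file text, or None if absent."""
    record = {"security.txt": body is not None}
    record.update({field: False for field in FIELDS})
    record["signed"] = False
    if body is None:
        return record
    # A cleartext-signed file starts with the PGP signed-message header.
    record["signed"] = PGP_HEADER in body
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition(":")
        name = name.strip().lower()
        if sep and value.strip() and name in FIELDS:
            record[name] = True
    return record
```

Calling summarize(None) gives a record where every key is false, which is the shape used for sites that do not publish the file.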

Collecting this information from the sites was not as easy as anticipated. Some of the issues I faced included:

TCP timeouts

HTTP timeouts

HTTP redirections

HTTP vs HTTPS

Connection resets

Unknown domains

IP geolocation / whitelisting restrictions, resulting in refused connections

Compression errors

Servers supporting only deprecated SSL/TLS protocol versions (e.g. SSLv3)

Routing errors

All kinds of bad requests and unexpected errors

This is why the fetch.rb script looks over-complicated for a script that just needs to fetch a file. And because handling all these failure cases takes time, I decided to fetch data only for the Top 1000, as the Top 1M would have required days and days.
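The general shape of such a defensive fetch can be sketched with Python's standard library. This is an illustration under stated assumptions, not a port of fetch.rb: the names http_get and fetch_security_txt, the 5-second timeout, the 64 KiB read limit and the User-Agent string are all hypothetical choices.

```python
import http.client
import urllib.request

PATHS = ("/.well-known/security.txt", "/security.txt")


def http_get(url, timeout=5.0):
    """Fetch a URL; returns (status, content_type, text). Follows redirects."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "securitytxt-survey-sketch"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        text = response.read(65536).decode("utf-8", errors="replace")
        return response.status, response.headers.get("Content-Type", ""), text


def fetch_security_txt(domain, get=http_get):
    """Try each location over HTTPS, then HTTP; return the file text or None."""
    for scheme in ("https", "http"):
        for path in PATHS:
            try:
                status, content_type, text = get(f"{scheme}://{domain}{path}")
            except (OSError, http.client.HTTPException, ValueError):
                # OSError covers timeouts, resets, DNS failures, TLS errors
                # and urllib's URLError/HTTPError (e.g. a 404 response).
                continue
            if status == 200 and content_type.lower().startswith("text/plain"):
                return text
    return None
```

Requiring text/plain is a deliberate choice in this sketch: a site may answer 200 with an HTML error page for any path, and that should not count as a security.txt file. The get parameter exists so the network layer can be swapped out, for example with a stub when testing.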

Statistics

You can use the script analyse.rb in the repository mentioned earlier to analyse the data fetched in results.json.
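As an illustration of what such an analysis involves, the sketch below computes the adoption rate and the per-field usage among adopters. It assumes the results are a list of records shaped like the JSON object shown earlier; the actual layout of results.json may differ, and the names percentage and adoption_stats are hypothetical.

```python
FIELDS = (
    "acknowledgments", "canonical", "contact", "encryption", "expires",
    "hiring", "policy", "preferred-languages", "signed",
)


def percentage(part, whole):
    """Percentage rounded to one decimal; 0.0 when there is nothing to count."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def adoption_stats(records):
    """Return (adoption %, {field: % of adopters using that field})."""
    adopters = [r for r in records if r.get("security.txt")]
    usage = {
        field: percentage(sum(1 for r in adopters if r.get(field)), len(adopters))
        for field in FIELDS
    }
    return percentage(len(adopters), len(records)), usage
```

Note that the field percentages are relative to the sites that publish a security.txt file, not to all sites surveyed.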

I made some charts to present the results in a more visual way.

You can observe the adoption of security.txt among the 10, 100 and 1000 most visited websites. For the top 1000, the adoption rate sits at around 10%, though the likelihood of adoption increases as the popularity of the website increases – a particularly encouraging trend to see.

We will now explore which fields are used among the websites that do use security.txt.

Top 1000              Yes (%)   No (%)
Acknowledgments       9.6       90.4
Canonical             8.6       91.3
Contact               86.5      13.5
Encryption            54.8      45.2
Expires               0         100
Hiring                66.3      33.6
Policy                66.3      33.6
Preferred-Languages   13.5      86.5

Top 100               Yes (%)   No (%)
Acknowledgments       21.4      78.6
Canonical             14.3      85.7
Contact               100       0
Encryption            57.1      42.8
Expires               0         100
Hiring                85.7      14.3
Policy                92.8      7.1
Preferred-Languages   14.3      85.7

Top 10                Yes (%)   No (%)
Acknowledgments       50        50
Canonical             0         100
Contact               100       0
Encryption            50        50
Expires               0         100
Hiring                100       0
Policy                100       0
Preferred-Languages   0         100

As expected, Contact and Policy are widely used, but Expires is not used even once in the Top 1000.

Among the 104 websites using security.txt in the top 1000, only 1 signed the file with PGP.

Below are the 10 highest-ranked websites using security.txt, with their position in the list:

Top 10 domains supporting security.txt (rank,domain)
1,google.com
4,facebook.com
31,google.com.hk
39,livejasmin.com
48,whatsapp.com
49,google.co.in
60,linkedin.com
68,dropbox.com
74,bbc.com
76,google.co.jp

For fun, I also created a pixel map of websites using security.txt. A blue pixel means the website uses it and an orange pixel means it does not.

As there are 1000 websites, I wasn’t able to create a square image and instead made a 40×20 one. But 40×20 pixels is very small, so I up-scaled the image with ImageMagick: convert pixel.png -scale 2000% pixel_scale.png.
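A pixel map of this kind can be produced without any imaging library by writing a plain-text PPM file, which ImageMagick can read. The sketch below is one possible way to do it, not the method used for the image in this article: the function name pixel_map_ppm and the exact RGB values are assumptions. The width is a parameter and the height is derived from the number of sites, so 1000 sites at a width of 40 give 25 rows.

```python
BLUE = (31, 119, 180)     # site serves security.txt
ORANGE = (255, 127, 14)   # site does not
WHITE = (255, 255, 255)   # padding when the last row is incomplete


def pixel_map_ppm(flags, width=40):
    """Render booleans as a plain-text PPM (P3) image, one pixel per site."""
    flags = list(flags)
    height = -(-len(flags) // width)  # ceiling division
    padded = flags + [None] * (width * height - len(flags))
    lines = ["P3", f"{width} {height}", "255"]
    for used in padded:
        color = WHITE if used is None else BLUE if used else ORANGE
        lines.append("%d %d %d" % color)  # one pixel per line
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file such as pixel.ppm (a hypothetical name) gives an image that can then be up-scaled the same way as the PNG above.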

Alternative Methods for Finding Security.txt

Shodan

Shodan supports security.txt and parses it to display it as the “Security Contact” section on each host result. The raw data is available under the http.securitytxt banner and can be used as a filter.

Filter to find hosts using security.txt (excluding redirections):

http.securitytxt:contact http.status:200

Google dorks

Only 152 indexed results (19th of April 2020)

inurl:"/.well-known/security.txt" filetype:txt

inurl:"/.well-known/security.txt" filetype:txt intext:"-----BEGIN PGP SIGNED MESSAGE-----"

Bing dorks

"/.well-known/security.txt" filetype:txt

"/.well-known/security.txt" filetype:txt inbody:"-----BEGIN PGP SIGNED MESSAGE-----"

Baidu dorks

inurl:"/.well-known/security.txt"

Should we support further adoption of Security.txt?

While it is not something that many security researchers look out for or are aware of (at the time of writing), security.txt is a step in the right direction.

“One of the biggest problems security researchers are faced with to date is a choice between serious effort and potential legal threats for the privilege of doing someone else a favor, against making some money and taking the credit through an alias. It becomes easy to understand why some black hat markets have over 1 million registered users. ” – Material from TurgenSec Ethics Policy

Security.txt represents progress as a standardized signpost for security researchers to make disclosures with less time cost. Furthermore, attached policy notices help to reassure researchers of the organizational stance and reduce fear of threats.

Having no security.txt file, hard-to-find security contacts, or no visible policy is like not giving your postal address to someone who wants to send you a gift card.

How can this be supported?

Push for integration into web frameworks and CMSs (Drupal already supports it, WP relies on a little-known plugin).

Inclusion in seclists and security scanners.

Ensure that you link to the file from somewhere else on the site (help the crawlers).

Push for integration into CDN and cloud provider services (Cloudflare supports it in its Workers).

Human pressure within organizations to support ethical security research & bug bounties – spreading the word.

About the Author

This piece was written by Alexandre ZANNI aka noraj and edited by TurgenSec. Alexandre is a pentester, staff member of the RTFM association and a BlackArch dev. His collaborations with TurgenSec have received international attention within cyber security.

Website: pwn.by/noraj