MetaMask takes security seriously. We also realize that safe adoption of Ethereum requires keeping our users informed and educated.

Our phishing defense process was recently audited by Spyglass Security. In the spirit of transparency, we’re sharing their report and our responses, along with our plans to address some issues they raise.

None of these issues involve the code or security of the core extension, which remains safe as ever to use for reasonable amounts of Ether. These recommendations are based primarily on the phishing detection features we’ve built to protect users against malicious sites and common phishing patterns.

If you’re a security expert who cares about the future of the decentralized web, we’re hiring a Chief Security Engineer and would love to hear from you. If you have questions, please reach out to support@metamask.io.

1. Eth-phishing-detector

MetaMask maintains a repository called eth-phishing-detect that attempts to keep a list of known malicious sites and prevent MetaMask users from visiting those sites. Here’s what Spyglass had to say:

Any projects reusing MetaMask’s eth-phishing-detector code may be putting their users at risk based on their implementation. The MetaMask team should be mindful that other projects will attempt to reuse this code verbatim and adjust accordingly. Be aware implementing a security feature in an open-source project means potentially having some upstream responsibility when other projects reuse code, officially or unofficially.

In order to mitigate second-order risk and protect more users, we recommend MetaMask either:

- Add authentication to the Infura API, or
- Only expose hashed values via the API.

The code could be reworked to hash each domain visited by the user and compare this hash to the blacklists as with the Google ‘Password Alert’ Chrome Extension [9]. Doing this would allow the MetaMask team to make the list data private, which will prevent threat actors from easily identifying whitelisted infrastructure.

Our response:

It is interesting to consider the upstream responsibility when others reuse our open source phishing list — we’ll do our best to improve the list’s documentation and help users ensure best usage.

But from what we can tell, neither recommendation mitigates this problem:

Adding authentication to the Infura API would not add security, because the Infura authentication tokens would have to ship in our client-side distribution, where third parties could easily extract and reuse them.

Hashing sites could offer some privacy against people analyzing our whitelist infrastructure. However, our whitelist is used to create an automatic blacklist (sites with an unacceptably short Levenshtein distance to whitelisted “frequently targeted” sites), and a list of hashes cannot support that kind of fuzzy comparison. This approach has helped us dramatically reduce a large category of phishing sites with familiar-looking names, and we are not convinced the list’s privacy is so important that we should compromise our fuzzy blacklist.
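To make the trade-off concrete, here is a minimal sketch of a hashed lookup. It is not the eth-phishing-detect implementation: the `digest` and `exact_match` helpers, the choice of SHA-256, and the sample domains are all assumptions made for illustration. It shows that a list distributed only as digests can answer "is this exact name listed?" but says nothing about a lookalike name, because a one-character change produces an unrelated digest.

```python
import hashlib


def digest(domain: str) -> str:
    """Return the SHA-256 hex digest of a lowercased domain name."""
    return hashlib.sha256(domain.lower().encode("utf-8")).hexdigest()


# Hypothetical list of frequently targeted sites, shipped only as digests.
HASHED_TARGETS = {digest(d) for d in ("metamask.io", "myetherwallet.com")}


def exact_match(domain: str) -> bool:
    """Hashed lookup: succeeds only when the name matches exactly."""
    return digest(domain) in HASHED_TARGETS


print(exact_match("metamask.io"))   # True: the exact name is listed
print(exact_match("metamasks.io"))  # False: the lookalike has an unrelated digest
```

An edit-distance check needs the plaintext names on the client, which is why hashing the list and fuzzy matching against it pull in opposite directions.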

We’re also going to consider creating a version of our phishing detector that includes our auto-updating logic, and encouraging users to include that library instead of the list module itself.

2. Whitelist Infrastructure

As an enhancement on top of the blacklist described above, we perform some fuzzy matching on high-profile and valuable target domains to include close derivatives. Sometimes this catches a false positive and prevents users from accessing a site they need. To handle these error cases, we also maintain a “whitelist,” which prevents a domain from being blocked as a fuzzy match. From Spyglass:

Multiple suspicious hosts were identified in connection with domain names on the whitelist. Two examples are shown here. Additional detail is included in our rough analysts notes.

1. The first domain name we examined, `smbr[.]rf[.]gd`, is connected to an IP address which was implicated publicly [10] in the EtherDelta hack of December 2017. `rf[.]gd` is a subdomain service provided by `infinityfree[.]net` to users of its free web hosting service. Due to the free nature of the service, it has been abused by malicious actors in the past, likely by either standing up their own malicious website or compromising an unsuspecting subdomain. This specific host has been implicated in threat data collection systems throughout the internet and associated with reams of questionable domain names dating back to 2014. Another domain on the whitelist associated with this host is `fallin.rf[.]gd`.

We recommend MetaMask:

- Remove all domains from the whitelist.
- If you cannot stop using the whitelist, remove all domains except those required to enable MetaMask functionality across partner sites.

Our response:

Our own investigation suggests that the IP addresses highlighted as suspicious may actually have been shared infrastructure IP addresses, so we have no current reason to believe they are suspect.

Additionally, the whitelist is mostly used for un-blocking additional (similar) domains, and so if these domains were ever reported as malicious, we would expect to block them as soon as possible, like we would any new phishing site. Currently the whitelist almost exclusively exists to allow domains whose names fall within the Levenshtein distance of “fuzzylisted” domains.

Removing all domains from the whitelist does not make sense for us, because it is used to automatically block additional sites. Sites on the whitelist do not have any additional privileges. They are merely non-blocked sites for which we also automatically block similar domains.

We will review the whitelist and prune it to the minimum we can, but some sites need to be whitelisted for different reasons, especially since we are blocking with a Levenshtein distance, and so we do not see a blanket removal of sites from the whitelist as a practical step forward.
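The interplay between the lists described above can be sketched as follows. This is an illustration under stated assumptions, not the actual detector: the list contents, the `classify` function, the order of checks, and the tolerance of 2 are all hypothetical. It shows why a legitimate site whose name sits within the edit-distance threshold of a protected name needs a whitelist entry to stay reachable.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# All entries below are hypothetical examples.
FUZZYLIST = {"metamask.io"}                # frequently targeted names
WHITELIST = {"metamask.io", "metamark.io"}  # never blocked as a fuzzy match
BLACKLIST = {"metamask-login.example"}     # reported phishing sites
TOLERANCE = 2                              # assumed maximum edit distance


def classify(domain: str) -> str:
    """Return a verdict for a domain; whitelist entries skip all blocking."""
    domain = domain.lower()
    if domain in WHITELIST:
        return "allowed (whitelist)"
    if domain in BLACKLIST:
        return "blocked (blacklist)"
    if any(levenshtein(domain, target) <= TOLERANCE for target in FUZZYLIST):
        return "blocked (fuzzy)"
    return "allowed"


for name in ("metamark.io", "metamasks.io", "metamask-login.example", "example.org"):
    print(name, "->", classify(name))
```

In this sketch `metamark.io` is one edit away from `metamask.io`, so removing it from the whitelist would cause it to be blocked as a fuzzy match.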

3. Whitelist process

The whitelist and blacklist are both maintained by a community of folks from around the blockchain ecosystem. Here’s Spyglass:

Individuals of uncertain origin or legitimacy can request a domain to be whitelisted. It appears the internal team methodology for assessing a domain as malicious or safe is the exclusive use of urlscan[.]info as evidenced by prior commits. This is an inadequate source of threat intelligence information.

We assume the reason for implementing a whitelisting process is to mitigate issues created through use of the Levenshtein algorithm to identify potential phishing sites, which results in false positives which must be resolved manually once reported as an issue on Github.

Improperly classifying a website could have other significant negative downstream repercussions for MetaMask users besides simple availability, as threat actors may seek to compromise whitelisted infrastructure to deliver targeted malware to MetaMask users. Attackers can host innocuous content while undergoing a whitelisting review and modify the site contents at a later date.

Recommendation

- Only highly reputable websites operated by known, established business actors with no history of recent compromise and demonstrated commitment to cybersecurity should be whitelisted.
- Implement an industry standard process for deeper security- and intelligence-assessment of each requested domain, or outsource this function.
- The whitelist is a critical asset. As such, we recommend implementing raising visibility for the core team when commits are made to the whitelist on Github through alerting or some other process.
- Consider completely abandoning the whitelisting process by using a purpose-built system to process threat intelligence, including community submissions, which should be processed in a reliable, systematic, methodological manner consistent with industry best practices. We recommend evaluating Cisco/OpenDNS’s PhishTank API [18][19][20].

Our response:

You mention that “threat actors may seek to compromise whitelisted infrastructure to deliver targeted malware to MetaMask users.” We’re interested to hear about exactly what you mean.

Whitelisted sites are not more available than other non-blacklisted sites. We neither list nor recommend them anywhere. Ultimately, our phishing detector relies heavily on fast responses to new threats, and no misconfiguration of the whitelist is going to prevent a block once an attack is known, nor is it going to make an attack more effective before it is known.

We agree that we should minimize our use of the whitelist, and will be conducting a review/pruning of our current whitelist.

We are always looking to improve our security processes. We are actively hiring a chief security officer, who we hope to task with goals including optimizing our phishing detection process. If you know of anyone you’d recommend, please send them our way!

We will look into improving our visibility when the whitelist is modified.

We will investigate alternatives to our current phishing detection in the future.

We will look at PhishTank as a model for our phishing detection going forward.

4. Stale Indicator Risk

Domain names (threat indicators) which have been added to the white- or blacklists aren’t “aged”, meaning there is no set expiration date at which point any given item will be reviewed or removed. Appropriate aging of indicators is a complex topic for which there is no set industry standard. However, experts in threat intelligence agree there is a point past which the information is no longer useful to serve the goal of protecting users, false positives increase, and returns continually diminish. Domains and infrastructure which may have been clean when added to the whitelist may become controlled by a threat actor at a later date.

Recommendation

- Implement a review process or automatic removal based on age.
- Develop a process for aging of indicators based on source and confidence.

Our response:

We will look into automating an aging review process. We will consider weighing submission source when aging listed items.
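One possible shape for such an aging review is sketched below. Nothing here reflects an existing MetaMask process: the `ListEntry` record, the source labels, and the review intervals are all hypothetical, chosen only to show how submission source could influence how soon an entry comes up for review.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical review intervals: lower-confidence sources are re-checked sooner.
REVIEW_AFTER = {
    "core-team": timedelta(days=365),
    "partner": timedelta(days=180),
    "community": timedelta(days=90),
}
# Assumed fallback for sources not named above.
DEFAULT_REVIEW_AFTER = timedelta(days=30)


@dataclass(frozen=True)
class ListEntry:
    domain: str
    source: str      # who submitted the entry
    added_on: date   # when it was added or last reviewed


def is_due_for_review(entry: ListEntry, today: date) -> bool:
    """True once the entry is at least as old as its source's review interval."""
    max_age = REVIEW_AFTER.get(entry.source, DEFAULT_REVIEW_AFTER)
    return today - entry.added_on >= max_age


def entries_due(entries: list[ListEntry], today: date) -> list[str]:
    """Return the domains that should be re-checked, in their original order."""
    return [e.domain for e in entries if is_due_for_review(e, today)]


sample = [
    ListEntry("old-community.example", "community", date(2018, 1, 1)),
    ListEntry("core-listed.example", "core-team", date(2018, 1, 1)),
]
print(entries_due(sample, today=date(2018, 8, 1)))
```

A flagged entry would go to a human or automated re-check rather than being removed outright, which keeps still-valid blacklist entries in place.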

MetaMask’s user notification when attempting to navigate to a blacklisted or suspected phishing site.

5. Blacklist Process and User Behavior

When a blacklisted or suspected phishing site is detected, users are directed to a live-hosted site as shown [in the image above] which does not allow the user to continue navigating to the intended domain in spite of the warning. Users do not have a way to circumvent this warning and there is no way to bypass a false positive other than submit a complaint to MetaMask (and wait) or disable the extension.

As the complaint mechanism is based on Github and the majority of users will not have an account (especially as the use of crypto expands beyond enthusiasts), we posit this may incentivize the majority of users to either immediately disable the MetaMask extension to visit the desired domain or to register a new Github account to report the issue, thereby creating an inconvenience for the user and reducing the MetaMask team’s ability to properly screen the individual making the request.

Recommendation

- Consider modifying the extension to allow users to navigate to the domain in spite of the warning.
- Consider providing a user-friendly false positive submission process to reduce the likelihood the extension will be disabled and user security will be reduced as a result.

Our response:

When users are unable to bypass the phishing page, they often come to us to complain, and at that time realize that the site was blocked for a reason. We will probably provide a mechanism to bypass the phishing warning in the future, but this is a low priority for us, because many users will click through without reading the warnings. (#4151)

We will be moving our report link to point at https://etherscamdb.info/report instead of GitHub, to increase user friendliness. It feeds the same lists we draw from. (#4774)

6. Live hosted phishing warning

When a blacklisted or suspected phishing site is detected, users are directed to a live-hosted site [23] as shown in [the image above]. This has multiple second-order implications:

- Users attempting to visit a blocked site are directed to metamask[.]io, which creates a privacy leak for the user. However, this dataset can be a source of security data if intentionally collected.
- phishing.html now becomes an attractive target for threat actors to compromise.

Recommendation

- Modify the code to direct the user to a locally hosted, static phishing.html, or ensure the live-hosted version is subject to robust change monitoring which alerts the team.
- Consider intentionally collecting self-reported data from users regarding behavior leading to blocked page event to more deeply understand the security context generating the majority of events.

Our response:

The phishing page is hosted by the extension itself as of MetaMask v4.8.0. (#4893)

—

Conclusion

We’re committed as ever to providing a safe experience on the decentralized web. While it could be improved, our phishing detection has prevented many users from visiting sites that have been known to phish or otherwise harm. We look forward to iterating on our detection process as described here. If you have other thoughts or recommendations, we welcome your input.

And as always, thanks to the community members who support MetaMask and the decentralized ecosystem by devoting their time and energy to protect users across the world.