Homograph attacks are a decade-old problem. They last made headlines just last week, with the attack on the Binance exchange, and before that when security enthusiast Xudong Zheng published a vulnerability in the way modern browsers handle domain names.

In this blog post, we’ll give a quick recap of what a homograph attack is and present a proof-of-concept website impersonating coinbase.com. We’ll also present a website spotted in the wild that impersonates myetherwallet.com using the same technique.

The affected browsers, at the time of writing, are the latest Firefox (58) and Chrome (64). Important note: this is not in any way a vulnerability; it is a decade-old design problem in the way internationalized domain names (IDNs) work and how browsers handle them. We will present some advice that may work for some people and organizations, and we will also explain why we believe this is an excellent example of an AI outperforming humans.

What is a Homograph attack?

A homograph attack is a method of impersonating another website by registering a similar-looking domain name that contains non-ASCII characters which visually resemble English letters but are actually different characters.

For example, in this post, we will compare a site we created called coinḃase.com to coinbase.com. You can already see that it is not trivial to spot the difference, especially in the browser URL bar, where the font is usually difficult to read (this particular attack on Coinbase is much more effective on a desktop than on a mobile device).
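What the eye misses is easy to surface programmatically. The following is a minimal Python sketch, not part of our tooling, that lists every non-ASCII character in a domain name together with its code point and Unicode name. The function name `non_ascii_chars` is our own illustrative choice, and the lookalike letter is written as an escape sequence so the code point is unambiguous.

```python
import unicodedata


def non_ascii_chars(domain):
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(domain)
        if ord(ch) > 127
    ]


# "\u1e03" is the dotted b used in our proof-of-concept domain.
print(non_ascii_chars("coin\u1e03ase.com"))
# [(4, 'U+1E03', 'LATIN SMALL LETTER B WITH DOT ABOVE')]
print(non_ascii_chars("coinbase.com"))
# []
```

A non-empty result does not by itself mean a domain is malicious, since many legitimate internationalized domains contain such characters; it only shows where to look.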

How do browsers deal with it?

Browsers deal with it partially by showing a Punycode representation, which is an ASCII-only encoding of a Unicode name. For example, coinḃase.com in Punycode is xn--coinase-8l3c.com. Of course, if the browser shows this second representation, the user will notice that it is not the real website. The problem lies in the rules and heuristics by which the browser decides whether to display the Unicode or the Punycode representation. Here, you can see the long list of rules and heuristics that the Chrome browser describes. Of course, these can’t cover every possible case, and many instances are left unhandled, by mistake or by design. (Even after reading the entire documentation, I can’t reliably predict which representation the browser will choose.)
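To make the Punycode conversion concrete, here is a small Python sketch that uses the standard library’s `punycode` codec to turn each non-ASCII label into its `xn--` form. The helper names `to_ascii_label` and `to_ascii_domain` are hypothetical, and the sketch assumes the input is already lowercase and normalized; it deliberately skips the full IDNA preprocessing that a browser or registrar applies.

```python
def to_ascii_label(label):
    """Encode one domain label in its xn-- form; ASCII labels pass through."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def to_ascii_domain(domain):
    """Apply the label encoding to every dot-separated part of the domain."""
    return ".".join(to_ascii_label(part) for part in domain.split("."))


# "\u1e03" is the dotted b from our proof-of-concept domain.
print(to_ascii_domain("coin\u1e03ase.com"))  # xn--coinase-8l3c.com
print(to_ascii_domain("coinbase.com"))       # coinbase.com
```

Comparing or logging domains in this ASCII form sidesteps the display heuristics entirely, because the two names are then plainly different strings.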

Proof-of-concept

We took an example site, coinbase.com, which is one of the largest crypto exchange platforms. We then went to the Latin-script alphabet page on Wikipedia and chose the letter most similar to one of the letters in coinbase.com. We selected ‘ḃ’ because, in the browser bar (on desktop), the dot on top of the b appears connected to the “b.” We bought the domain for a few dollars (it was available, of course) on GoDaddy and hosted the static website on Google Cloud. It cost only a few bucks, which is a really high ROI for attackers even if they succeed in stealing only 1 BTC (and they probably achieve a lot more).

Here is a link to, and a screenshot of, our fake coinḃase.com site.

And here is a link to, and a screenshot of, the official one.

As you can see, it is tough for a human to spot that they are not the same website. Of course, we didn’t copy the coinbase site as we are not in the business of phishing; we just put some text for the demonstration.

In the Wild Attack

We spotted the same attack in the wild targeting the users of “myetherwallet”.

Here is a link to, and a screenshot of, the official website https://www.myetherwallet.com/

Here is a link to, and a screenshot of, the fake website https://myetherwaḻḻet.com, where each l is replaced with U+1E3B, an l with a line below. (The site is already down and blacklisted by Google Safe Browsing because users reported it.)

As you can see here as well, it is tough for a human to spot that they are not the same website.

Who is affected?

At the time of writing, we tested the website on:

Latest Chrome (64) – affected.

Latest Firefox (58) – affected.

Latest Edge (41) – partially affected. Internet Explorer always shows the Punycode representation unless the language used in the URL is also configured as a language on the system. So, non-English users might be affected.

What can you do?

Settings

Chrome Users – Not Much.

Firefox Users – If you don’t mind not seeing the Unicode representation of sites, we recommend going to about:config and setting network.IDN_show_punycode to true. This causes the browser to always show the Punycode representation, which mitigates IDN homograph attacks.

Bookmarks

You can create bookmarks for the sites that are sensitive for you and your organization, and advise users to visit these sites only through the bookmarks. It may be a burden, but it is a very secure approach.

None of those solutions is bulletproof, but they might help.

Summary

On a self-promotional note: this is another excellent example of where AI and computer vision technology outperform humans in detecting fraudulent websites. A computer program can tell that two domains are different with a simple comparison instruction, but for a human it may be difficult or even impossible in some situations.
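As a rough illustration of how cheap that comparison is for a program, the Python sketch below flags a domain that is not an exact match for a protected name but collapses to one once accents and other combining marks are stripped. This is a deliberately simple stand-in, not a description of how our product works: the `PROTECTED` watch list and the function names are hypothetical, and the approach only catches accented Latin lookalikes such as the two shown in this post. Cross-script lookalikes (for example, Cyrillic letters) would need a fuller confusables table such as the one defined in Unicode UTS #39.

```python
import unicodedata

# Hypothetical watch list of domains an organization wants to protect.
PROTECTED = {"coinbase.com", "myetherwallet.com"}


def skeleton(domain):
    """Lowercase, decompose, and drop combining marks (dots, lines, accents)."""
    decomposed = unicodedata.normalize("NFKD", domain.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def impersonated_domain(domain):
    """Return the protected domain being imitated, or None if nothing matches."""
    if domain.lower() in PROTECTED:
        return None  # exact match: this is the genuine domain
    candidate = skeleton(domain)
    return candidate if candidate in PROTECTED else None


print(impersonated_domain("coin\u1e03ase.com"))            # coinbase.com
print(impersonated_domain("myetherwa\u1e3b\u1e3bet.com"))  # myetherwallet.com
print(impersonated_domain("coinbase.com"))                 # None
```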

Sign up for our two-week free trial of PhishProtect™.