Update: PayPal was used only as an illustration, not as a specific usable example. The basic Cyrillic keyboard does not contain a character that appears as a Latin "l" (and the character that exists in the extended set is apparently not usable when registering domains). ICANN informs us that rules preventing the mixing of Cyrillic and Latin in a domain name label have been in effect for gTLDs since 2005. These rules do not completely mitigate the potential for confusion, but they do make it less likely.

Back in October, we wrote about a new policy approved by the Internet Corporation for Assigned Names and Numbers (ICANN) that will allow non-Latin domain names to be registered in early to mid-2010. This is really exciting for Internet users in areas that use non-Latin scripts (like Arabic, Japanese, Chinese and Cyrillic), who have spent the last fifteen years unable to register domain names in their own writing systems.

However, as the Times Online pointed out this week, this international progress also creates potentially disastrous opportunities for scammers and phishing sites. The problem is that some characters render identically in different scripts even though they are different characters. For instance, the Cyrillic script, which is used to write Russian, shares some letterforms with the Latin alphabet. This means that potential evil-doers could register a domain using non-Latin characters that appears to spell out a Latin word.

Example: Real PayPal.com Versus Fake PayPal.com

The Times Online article uses PayPal — already a frequent phishing target — as an example.

If a domain typed in Cyrillic script as "raural.com" were registered, Unicode-aware browsers would render it so that it looks like the Latin "paypal.com." In theory, phishers could pass around that link and set up a fake version of the PayPal site to harvest logins and credit card data.
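To make the lookalike problem concrete, here is a minimal Python sketch that compares the Latin label with a visually similar one. The choice of Cyrillic letters (er, a and u standing in for p, a and y) is our own assumption for illustration, and the Latin "l" is kept because, as the update above notes, basic Cyrillic has no lookalike for it.

```python
import unicodedata

latin = "paypal"
# Assumed lookalike: Cyrillic er (U+0440), a (U+0430), u (U+0443),
# with the final Latin "l" left in place.
lookalike = "\u0440\u0430\u0443\u0440\u0430l"

def describe(label):
    """List each character's code point and official Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in label]

# The two strings may look alike on screen, but they are not equal.
same = latin == lookalike

# Internationalized labels travel through DNS in an ASCII form:
# the prefix "xn--" followed by the Punycode encoding of the label.
ace = b"xn--" + lookalike.encode("punycode")

if __name__ == "__main__":
    print(same)
    for line in describe(lookalike):
        print(line)
    print(ace.decode("ascii"))
```

Printing the code points, or the ASCII form of the label, is one way a mail client or browser could expose the difference that the rendered text hides.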

I've made this graphic for even better illustration:

Pretty scary, no? As of right now, ICANN hasn't instituted any policies to protect against these kinds of situations, meaning it might be that much more difficult for even normally cautious users to avoid being scammed. Of course, the success of these scams depends in part on how well different mail and browsing programs handle Unicode. However, most modern browsers and operating systems have strong Unicode support, which makes spotting the differences that much more difficult.
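The update at the top of this post mentions rules against mixing Cyrillic and Latin within a single label. A rough sketch of such a check might look like the following. It is not ICANN's or any registry's actual implementation: the function names are hypothetical, and because Python's standard library does not expose a script property, the script is approximated by the first word of each letter's Unicode name.

```python
import unicodedata

def scripts_in(label):
    """Approximate the scripts used in a label.

    Heuristic: take the first word of each letter's Unicode name
    (for example "LATIN" or "CYRILLIC"). Digits and hyphens are skipped.
    """
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}

def is_mixed_script(label):
    """Return True when the letters in a label come from more than one script."""
    return len(scripts_in(label)) > 1

if __name__ == "__main__":
    for label in ("paypal", "\u0440\u0430\u0443\u0440\u0430l", "\u0440\u0430\u0443"):
        print(ascii(label), sorted(scripts_in(label)), is_mixed_script(label))
```

Note that a label written entirely in Cyrillic passes this check, which is consistent with the point that such rules reduce the potential for confusion without eliminating it.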

Not all Latin letters have a Cyrillic lookalike, but we hope that companies whose brands could be compromised look at locking those domains up quickly.