As I was doing some reading on Unicode, I had to sign up for a free account with the ft.com site in order to read one of their articles. I normally use strong passwords, but this Web site presented me with the following error message:

Your password must be at least 6 characters long and include letters and numbers only

Ignoring the bad user interface (please tell me the rules before I type the damned password), this also suggests security issues; ask Bobby for one reason why programmers have such bad password restrictions.

And that got me to thinking about Å, also known as U+212B.

So the first thing you want to do is run this little program:

    use charnames ':short';
    # stop the "Wide character in print" warnings
    binmode STDOUT, ':encoding(UTF-8)';
    print "\N{U+212B} \N{U+00C5} \N{U+0041}\N{U+030A}\n";

And that will print Å Å Å .

Note: if your system is so broken it can't display the letters above, they're each an uppercase A with a ring above.
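
If the glyphs are indistinguishable on your screen (they should be), it can help to print the official character names instead. This is only a small sketch using the core charnames module; the @spellings array is my own name for the three spellings, not anything from the program above.

```perl
use strict;
use warnings;
use charnames ();    # load the name tables so charnames::viacode() works

# The three spellings from the program above, as lists of code points.
my @spellings = ( [0x212B], [0x00C5], [ 0x0041, 0x030A ] );

for my $spelling (@spellings) {
    # Describe each code point as "U+XXXX OFFICIAL NAME".
    my @described =
      map { sprintf 'U+%04X %s', $_, charnames::viacode($_) } @$spelling;
    print join( ' + ', @described ), "\n";
}
```

The first line names the ANGSTROM SIGN, the second the LATIN CAPITAL LETTER A WITH RING ABOVE, and the third a plain LATIN CAPITAL LETTER A followed by a COMBINING RING ABOVE.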

Even though those are different code points (or, in the last case, a sequence of two code points), Unicode defines them as canonically equivalent, so they must be considered to be the same character. The Unicode::Collate module demonstrates how this is done with the Unicode Collation Algorithm, even though Perl's built-in cmp operator, which compares strings code point by code point, gets this wrong.
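
Here is a minimal sketch of that difference, assuming a Perl recent enough to ship Unicode::Collate and Unicode::Normalize in core. The variable names are mine. The NFC comparison at the end is an extra I've added to show another common way of getting the same answer; it is not part of the collation algorithm itself.

```perl
use strict;
use warnings;
use Unicode::Collate;
use Unicode::Normalize qw(NFC);

my $angstrom    = "\N{U+212B}";              # ANGSTROM SIGN
my $precomposed = "\N{U+00C5}";              # A WITH RING ABOVE
my $decomposed  = "\N{U+0041}\N{U+030A}";    # A + COMBINING RING ABOVE

# The built-in operators compare code point by code point.
my $builtin_equal = ( $angstrom eq $precomposed ) ? 1 : 0;

# Unicode::Collate normalizes its input by default before comparing.
my $collator       = Unicode::Collate->new;
my $collate_result = $collator->cmp( $angstrom, $decomposed );

# Normalizing to NFC first lets plain eq see the strings as identical.
my $nfc_equal = ( NFC($angstrom) eq NFC($decomposed) ) ? 1 : 0;

print "built-in eq:       $builtin_equal\n";
print "Unicode::Collate:  $collate_result\n";
print "eq after NFC:      $nfc_equal\n";
```

The built-in comparison reports the strings as unequal (0), the collator's cmp returns 0 (meaning "sorts the same"), and the NFC comparison reports them as equal (1).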

This raises an interesting (to me) question. What should passwords do? If I allow Unicode in passwords, should I allow U+212B when the original password had U+00C5? Is this, in theory, restricting the number of possible combinations someone can type or is the password space so huge here that it really doesn't matter? Are there other security implications here that I should be aware of?
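
To make the question concrete, here is a rough sketch of what is at stake. It doesn't answer anything. The password_bytes helper and its normalize option are hypothetical names of my own, and choosing NFC is an assumption for illustration rather than a recommendation. Whatever hash a site uses works on bytes, so the sketch just compares the UTF-8 bytes that would be fed to it.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Unicode::Normalize qw(NFC);

# Hypothetical helper: the bytes a site would hand to its password hash.
sub password_bytes {
    my ( $password, %opt ) = @_;
    $password = NFC($password) if $opt{normalize};    # assumption: NFC
    return encode( 'UTF-8', $password );
}

my $stored = "p\N{U+00C5}ss1";    # chosen at sign-up with U+00C5
my $typed  = "p\N{U+212B}ss1";    # typed later with U+212B

for my $normalize ( 0, 1 ) {
    my $match = password_bytes( $stored, normalize => $normalize ) eq
      password_bytes( $typed, normalize => $normalize );
    printf "normalize=%d: %s\n", $normalize, $match ? 'match' : 'no match';
}
```

Without normalization the two passwords encode to different bytes and would hash differently; with it they collapse to the same bytes, which is exactly the trade-off I'm asking about.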