Much has been made of a recent Facebook "leak" which allegedly disclosed information on over 100 million Facebook users. What some reports have failed to highlight, however, is that the information was already public to begin with.

Security researcher Ron Bowes wrote a Ruby script that downloads information from Facebook's user directory, a searchable index of public profile pages. The directory does not expose a user's entire profile; it exposes only the information that the user has allowed Facebook to make public. This includes names, profile images, and a small sampling of the user's friends. Users can opt out of inclusion in the search, but may still appear on the directory page of a friend who is searchable.

Bowes decided to spider the directory so that he could collect statistics about the most common names. Such statistical information isn't sensitive at all and doesn't pose any security threat to Facebook users. The data could be useful, however, for building automated account cracking software that is generic and not specific to Facebook. This is because a list of the most common names can be used to assemble a good dictionary of potentially popular usernames for use in brute-force tools that attempt to identify and crack user accounts.

There are a number of other public data sources that are commonly mined to obtain the same kind of statistical information for security research purposes. One example is the Social Security Administration's index of popular baby names. What makes the Facebook data particularly good is that it's a global index of first and last name pairs. By putting together the first initial and last name of the users and analyzing the frequency of the output, Bowes constructed what he believes to be a compelling list of the most common potential usernames:

129,369 jsmith

79,365 ssmith

77,713 skhan

75,561 msmith

74,575 skumar

72,467 csmith

71,791 asmith

67,786 jjohnson

66,693 dsmith

66,431 akhan
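
The counting behind a list like this can be illustrated with a short Python sketch. This is not Bowes's script, which was written in Ruby and is not reproduced here. The function names, the rule for splitting a name into first and last parts, and the sample names are all assumptions made up for illustration.

```python
from collections import Counter


def candidate_username(full_name):
    """Build a hypothetical username: first initial plus last name, lowercased.

    Assumes the first whitespace-separated token is the first name and the
    final token is the last name. Returns None for single-word names.
    """
    parts = full_name.split()
    if len(parts) < 2:
        return None
    first, last = parts[0], parts[-1]
    return (first[0] + last).lower()


def most_common_usernames(names, top_n=10):
    """Count candidate usernames and return the top_n as (username, count) pairs."""
    counts = Counter()
    for name in names:
        username = candidate_username(name)
        if username is not None:
            counts[username] += 1
    return counts.most_common(top_n)


# Invented sample data, not taken from the Facebook directory.
sample_names = [
    "John Smith",
    "Jane Smith",
    "Sara Khan",
    "Jack Smith",
    "Sam Khan",
    "Madonna",
]

ranked = most_common_usernames(sample_names)
for username, count in ranked:
    # Same "count username" layout as the list above.
    print(f"{count:,} {username}")
```

With the invented sample, "jsmith" ranks first because three different people collapse into the same candidate username. That collapsing is what makes a frequency-ranked list useful as a dictionary: the most common entries are the ones most likely to exist on any given system.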

Bowes wanted to contribute the data to the Ncrack project, which is building an open source tool that makes it easy to test a system's susceptibility to brute-force login attacks. He realized that there might be broader interest in the data set among security researchers, so he decided to put it in a torrent and make it available to everyone. He also hoped that it would help raise awareness among regular users of the fact that Facebook makes basic user information available through its directory.

This incident doesn't represent a breach of Facebook's security, because the information is made public by design. It highlights, however, the importance of keeping an eye on your social networking privacy settings and understanding how your personal information is used. Many users might not realize that their names and photos are accessible in Facebook's public user directory.