As you may know, a "full" dump of email addresses and password hashes from the 2012 attack on LinkedIn.com has become available. Here at KoreLogic, we got our hands on the list of emails and the separate list of passwords (but nothing linking the two together, which we don't want or need). We started to gather some statistics on them using our Password Recovery Service (PRS). The following analysis assumes the lists are real; because the email addresses are valid and we confirmed some of our own accounts' data from back then, we believe that the dump is real.

What we know so far:



It contains 164,590,819 unique email addresses.



It contains 177,500,189 unsalted SHA1 password hashes. Note that this is larger than the number of email addresses.

It contains 61,829,207 unique hashes. This means there are duplicates, and this is good for password researchers because it allows us to come up with statistics of how often certain passwords are used.
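A minimal sketch of how such duplicate counts could be tallied, assuming the hashes arrive as one hex string per line. The function name `hash_frequencies` and the sample values are made up for illustration; this is not the PRS tooling.

```python
from collections import Counter

def hash_frequencies(hashes):
    """Count occurrences of each hash after trimming whitespace and lowercasing."""
    return Counter(h.strip().lower() for h in hashes if h.strip())

# Made-up placeholder values standing in for real SHA1 hex digests.
sample = ["aaa1", "bbb2", "aaa1", "ccc3", "AAA1 ", "bbb2"]

freq = hash_frequencies(sample)
total = sum(freq.values())   # total (non-unique) hashes
unique = len(freq)           # unique hashes
top = freq.most_common(2)    # most frequently reused hashes

print(total, unique, top)
```

The gap between `total` and `unique` is what makes frequency statistics possible: every repeat is another account that chose the same password.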

As of Thursday May 19 14:09 EDT 2016, we've cracked 65% of the hashes, after about two hours' work on our private distributed cracking grid. Approximately 41,500,000 plain-text passwords have been recovered so far. There are literally thousands of new cracks coming in every minute, so the numbers are a bit rough.

The most common password hashes are:

Number | Hash
1135936 | 7c4a8d09ca3762af61e59520943dc26494f8941b
207488 | 7728240c80b6bfd450849405e8500d6d207783b6
188380 | 5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8
149916 | f7c3bc1d808e04732adf679965ccc34ca7ae3441
95854 | 7c222fb2927d828af22f592134e8932480637c0d
85515 | 3d4f2bf07dc1be38b20cd6e46949a1071f9d0e3d
75780 | 20eabe5d64b0e216796e834f52d61fd0b70332fc
51969 | dd5fef9c1c1da1394d6d34b248c51be2ad740840
51870 | b1b3773a05c0ed0176787a4f1574ff0075f7521e
51535 | 8d6e34f987851aa599257d3831a1af040886842f
49235 | c984aed014aec7623a54f0591da07a85fd4b762d
41449 | 6367c48dd193d56ea7b0baad25b19455e529f5ee
35919 | d8cd10b920dcbdb5163ca0185e402357bc27c265
34440 | 1411678a0b9e25ee2f7c8b2f7ac92b6a74b3f9c5
32879 | 601f1889667efaebb33b8c12572835da3f027f78
32289 | ff539c96a2ed9f72a47a5e1c7d59e143ba1fba94
30972 | 019db0bfd5f85951cb46e4452e9642858c004155
30923 | 01b307acba4f54f55aafc33bb06bbbf6ca803e9a
28928 | 775bb961b81da1ca49217a48e533c832c337154a
28705 | 17b9e1c64588c7fa6419b4d29dc1f4426279ba01

These values crack to:

Number | Hash | Plaintext
1135936 | 7c4a8d09ca3762af61e59520943dc26494f8941b | 123456
207488 | 7728240c80b6bfd450849405e8500d6d207783b6 | linkedin
188380 | 5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8 | password
149916 | f7c3bc1d808e04732adf679965ccc34ca7ae3441 | 123456789
95854 | 7c222fb2927d828af22f592134e8932480637c0d | 12345678
85515 | 3d4f2bf07dc1be38b20cd6e46949a1071f9d0e3d | 111111
75780 | 20eabe5d64b0e216796e834f52d61fd0b70332fc | 1234567
51969 | dd5fef9c1c1da1394d6d34b248c51be2ad740840 | 654321
51870 | b1b3773a05c0ed0176787a4f1574ff0075f7521e | qwerty
51535 | 8d6e34f987851aa599257d3831a1af040886842f | sunshine
49235 | c984aed014aec7623a54f0591da07a85fd4b762d | 000000
41449 | 6367c48dd193d56ea7b0baad25b19455e529f5ee | abc123
35919 | d8cd10b920dcbdb5163ca0185e402357bc27c265 | charlie
34440 | 1411678a0b9e25ee2f7c8b2f7ac92b6a74b3f9c5 | 666666
32879 | 601f1889667efaebb33b8c12572835da3f027f78 | 123123
32289 | ff539c96a2ed9f72a47a5e1c7d59e143ba1fba94 | linked
30972 | 019db0bfd5f85951cb46e4452e9642858c004155 | maggie
30923 | 01b307acba4f54f55aafc33bb06bbbf6ca803e9a | 1234567890
28928 | 775bb961b81da1ca49217a48e533c832c337154a | princess
28705 | 17b9e1c64588c7fa6419b4d29dc1f4426279ba01 | michael
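Because the hashes are unsalted SHA1, a guess can be checked by hashing it once and looking for the digest in the target set. The sketch below shows that idea with a tiny wordlist. It assumes passwords were hashed as UTF-8 bytes, and the helper names (`sha1_hex`, `dictionary_crack`) are illustrative; a real cracking grid works very differently at scale.

```python
import hashlib

def sha1_hex(password: str) -> str:
    """Return the lowercase hex SHA1 digest of a password (UTF-8 assumed)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

def dictionary_crack(target_hashes, candidates):
    """Map each target hash to the first candidate word that produces it."""
    targets = {h.lower() for h in target_hashes}
    found = {}
    for word in candidates:
        digest = sha1_hex(word)
        if digest in targets and digest not in found:
            found[digest] = word
    return found

# Two hashes taken from the table above.
targets = [
    "7c4a8d09ca3762af61e59520943dc26494f8941b",
    "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8",
]
wordlist = ["letmein", "123456", "password"]

cracked = dictionary_crack(targets, wordlist)
for digest, plain in cracked.items():
    print(digest, plain)
```

With no salt, one hash computation per guess tests that guess against every account at once, which is why recovery rates climb so quickly.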

The most common patterns used in the passwords are as follows: (Updated May 20 11:00 EDT 2016)

?d = Digit [0-9]

?s = "Special Character" +_)*(&^%$#@!~`-=[]\{}|;':",./<>? ...etc.

?l = Lower case letter [a-z]

?u = Upper case letter [A-Z]



Number | Pattern
2464707 | ?l?l?l?l?l?l?l?l (Example: linkedin)
1776416 | ?l?l?l?l?l?l?d?d (Example: linked12)
1663330 | ?l?l?l?l?l?l?l?l?l (Example: alinkedin)
1587423 | ?l?l?l?l?d?d?d?d (Example: link2012)
1528434 | ?l?l?l?l?l?l?l (Example: linkedi)
1525784 | ?l?l?l?l?l?l (Example: linked)
1348195 | ?d?d?d?d?d?d?d?d
1172612 | ?l?l?l?l?l?l?l?l?l?l
1074096 | ?l?l?l?l?l?d?d?d?d
1042003 | ?d?d?d?d?d?d?d?d?d?d
984939 | ?l?l?l?l?l?l?d?d?d?d
936771 | ?l?l?l?l?l?l?l?d?d
819341 | ?l?l?l?d?d?d?d
781166 | ?d?d?d?d?d?d?d
723656 | ?l?l?l?l?l?d?d
713165 | ?l?l?l?l?l?l?l?l?l?l?l
692280 | ?l?l?l?l?l?d?d?d
690521 | ?d?d?d?d?d?d
670878 | ?l?l?l?l?l?l?l?l?d?d
653118 | ?l?l?l?l?l?l?l?d
539001 | ?l?l?l?l?l?l?d?d?d
494526 | ?l?l?l?l?d?d
491474 | ?l?l?d?d?d?d
462250 | ?l?l?l?l?l?l?l?l?l?l?l?l
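A rough sketch of how a recovered password can be turned into one of these patterns, using the ?d, ?s, ?l and ?u legend above. It assumes that any character outside ASCII letters and digits counts as ?s; the function name `to_mask` is made up here.

```python
import string
from collections import Counter

def to_mask(password: str) -> str:
    """Translate each character into its ?l / ?u / ?d / ?s class."""
    parts = []
    for ch in password:
        if ch in string.digits:
            parts.append("?d")
        elif ch in string.ascii_lowercase:
            parts.append("?l")
        elif ch in string.ascii_uppercase:
            parts.append("?u")
        else:
            # Assumption: everything else (including non-ASCII) is "special".
            parts.append("?s")
    return "".join(parts)

# Small illustrative sample, reusing example passwords from this post.
sample = ["linkedin", "password", "link2012", "linked12", "123456"]
pattern_counts = Counter(to_mask(p) for p in sample)

for pattern, count in pattern_counts.most_common():
    print(count, pattern)
```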

The most common "base words" used in the passwords are shown below. These are calculated by taking all the recovered passwords, removing all special characters and digits, and then sorting the results. This was the initial technique used by KoreLogic in 2012 to determine that the set of ~6.5 million hashes found on a Russian message board was in fact from LinkedIn.com (which now appears to have been only a subset of this larger leak).

Number | Base word
29883 | linkedin (Examples: linkedin1 linkedin2012 linkedin!)
26194 | link (Examples: link2012 2012link !!link!!)
21731 | love
19721 | ever
15574 | linked
14156 | life
11674 | alex
10773 | mike
10566 | pass
9540 | john
9176 | blue
8937 | june
8338 | jack
8006 | july
7305 | home
7205 | star
7094 | password
7005 | angel
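The base-word calculation described above can be sketched as follows. This version keeps only alphabetic characters and leaves case untouched; whether the original analysis folded case is not stated, so treat that as an assumption. The name `base_word` is illustrative.

```python
from collections import Counter

def base_word(password: str) -> str:
    """Strip digits and special characters, keeping only letters."""
    return "".join(ch for ch in password if ch.isalpha())

# Example passwords taken from the table above, plus one all-digit password.
sample = ["linkedin1", "linkedin2012", "linkedin!", "link2012",
          "2012link", "!!link!!", "123456"]

# Passwords with no letters at all produce an empty base word and are skipped.
base_counts = Counter(w for w in map(base_word, sample) if w)

for word, count in base_counts.most_common():
    print(count, word)
```

A site-specific word such as "linkedin" rising to the top of this tally is the kind of signal that can tie an anonymous hash list back to its source.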

Update: May 19 15:53 EDT 2016



Here is a list of the most common domains used by the accounts in the dump. No real surprises here.

Number | Domain Name
32865035 | gmail.com
24018467 | hotmail.com
20361246 | yahoo.com
4268015 | aol.com
1977483 | comcast.net
1427168 | yahoo.co.in
1333354 | msn.com
1039135 | sbcglobal.net
1036522 | rediffmail.com
992936 | yahoo.fr
913406 | yahoo.co.uk
843158 | live.com
839735 | yahoo.com.br
748001 | hotmail.co.uk
740473 | verizon.net
574117 | hotmail.fr
549022 | yahoo.com
528635 | ymail.com
528040 | cox.net
509047 | bellsouth.net
503271 | libero.it
478587 | att.net
428930 | yahoo.es
406492 | btinternet.com

Update: May 19 17:00 EDT 2016



42,691,862 unique passwords recovered so far; 69% of the unique hashes have cracked at this point.

Of the total 177,500,189 non-unique hashes leaked, there are 143,914,964 password hashes cracked, 33,585,225 left. That represents 81.07% of all LinkedIn.com users in the dump.
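The gap between 69% of unique hashes and roughly 81% of users comes from weighting: the hashes that crack first tend to be the most reused ones. A small sketch with toy numbers (the `coverage` helper and the toy counts are hypothetical) followed by the figures from this update:

```python
def coverage(hash_counts, cracked):
    """Return (share of unique hashes cracked, share of all hash entries cracked)."""
    unique_total = len(hash_counts)
    entries_total = sum(hash_counts.values())
    unique_cracked = sum(1 for h in hash_counts if h in cracked)
    entries_cracked = sum(n for h, n in hash_counts.items() if h in cracked)
    return unique_cracked / unique_total, entries_cracked / entries_total

# Toy data: four unique hashes, two of them heavily reused and cracked.
toy_counts = {"h1": 70, "h2": 20, "h3": 5, "h4": 5}
toy_cracked = {"h1", "h2"}
unique_share, entry_share = coverage(toy_counts, toy_cracked)
print(f"toy: {unique_share:.0%} of unique hashes, {entry_share:.0%} of entries")

# Figures quoted in this post.
post_unique = 42_691_862 / 61_829_207
post_entries = 143_914_964 / 177_500_189
print(f"post: {post_unique:.2%} of unique hashes, {post_entries:.2%} of entries")
```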

Update: May 20 10:00 EDT 2016



~48,520,000 unique passwords recovered so far; ~78% of the unique hashes have cracked at this point. And we have recovered the passwords for ~86% of all LinkedIn.com users in the dump.

~13,360,000 unique hashes left to crack ...

Update: May 20 11:00 EDT 2016



Here is a list of the most common email addresses without their domain. No real surprises here.

Number | Email prefix
555249 | info@
64325 | john@
60845 | david@
55525 | mike@
52685 | chris@
52251 | mail@
50654 | sales@
50444 | mark@
48006 | steve@
45872 | paul@
39051 | contact@
37424 | linkedin@
36511 | peter@
35818 | michael@
35770 | admin@
30473 | dave@
30034 | tom@
29102 | jim@
26872 | jeff@
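Both this tally and the earlier domain tally come down to splitting each address at its last "@" and counting the two halves. A minimal sketch, assuming one address per entry and case-insensitive comparison; the function name `tally_emails` and the sample addresses are placeholders.

```python
from collections import Counter

def tally_emails(emails):
    """Count domains and local parts (with trailing '@') across a list of addresses."""
    domains, prefixes = Counter(), Counter()
    for email in emails:
        local, sep, domain = email.strip().lower().rpartition("@")
        if not sep or not local or not domain:
            continue  # skip entries that are not of the form local@domain
        domains[domain] += 1
        prefixes[local + "@"] += 1
    return domains, prefixes

# Placeholder addresses on reserved example domains.
sample = ["info@example.com", "Info@Example.org", "john@example.com", "not-an-email"]

domains, prefixes = tally_emails(sample)
print(domains.most_common())
print(prefixes.most_common())
```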

Update: May 20 18:00 EDT 2016



Our grid was busy doing client work for about 24 hours, so not many new cracks today. But here's some updated stats and analysis.

~49,290,000 unique passwords recovered so far.

~12,520,000 unique hashes left to crack.

5,184,351 of the recovered passwords are 8+ characters and contain at least one upper-case letter, one lower-case letter, and one digit.

825,975 of the recovered passwords are 8+ characters and contain at least one upper-case letter, one lower-case letter, one digit, and one special character.
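A sketch of the kind of check behind these two counts. It assumes ASCII character classes and treats anything that is not an ASCII letter or digit as a special character; the helper name `meets_policy` is hypothetical.

```python
import string

def meets_policy(password: str, require_special: bool = False) -> bool:
    """True if the password is 8+ chars with upper, lower, digit (and optionally special)."""
    if len(password) < 8:
        return False
    has_upper = any(c in string.ascii_uppercase for c in password)
    has_lower = any(c in string.ascii_lowercase for c in password)
    has_digit = any(c in string.digits for c in password)
    # Assumption: "special" means anything outside ASCII letters and digits.
    has_special = any(c not in string.ascii_letters + string.digits for c in password)
    ok = has_upper and has_lower and has_digit
    return ok and (has_special or not require_special)

sample = ["Linkedin1", "linkedin1", "Linked1", "Linkedin1!"]
three_class = sum(meets_policy(p) for p in sample)
four_class = sum(meets_policy(p, require_special=True) for p in sample)
print(three_class, four_class)
```

Passing such a check says nothing about predictability: "Linkedin1!" satisfies the strictest version and follows one of the common topologies listed in this post.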

The pattern distribution of these passwords closely resembles the findings of our PathWell research - they are heavily biased towards some universally common topologies:

Number | Pattern
29742 | ?u?l?l?l?l?l?s?d?d
26640 | ?u?l?l?l?l?l?d?d?s
26287 | ?u?l?l?l?l?s?d?d
23830 | ?u?l?l?l?l?l?s?d
20296 | ?u?l?l?l?l?l?d?s
18365 | ?u?l?l?l?l?d?d?s
17390 | ?u?l?l?l?s?d?d?d?d
17085 | ?u?l?l?l?l?l?l?d?s
16723 | ?u?l?l?l?l?l?l?s?d
14989 | ?u?l?l?l?l?l?l?s?d?d
13565 | ?u?l?l?l?l?s?d?d?d?d
12986 | ?u?l?l?l?l?l?l?d?d?s
12590 | ?u?l?l?s?d?d?d?d
12305 | ?u?l?l?l?l?s?d?d?d
11280 | ?u?l?l?l?l?l?l?l?d?s
10991 | ?u?l?l?l?d?d?d?d?s
10822 | ?u?l?l?l?l?l?s?d?d?d?d
10796 | ?u?l?l?l?s?d?d?d

The PACK output of the unique cracks so far (numbers rounded slightly):

[*] Length:
[+] 8: 29% (14,620,000)
[+] 9: 17% (8,430,000)
[+] 10: 14% (6,950,000)
[+] 7: 13% (6,660,000)
[+] 6: 10% (5,410,000)
[+] 11: 06% (3,270,000)
[+] 12: 03% (1,930,000)
[+] 13: 01% (921,000)
[+] 14: 01% (508,000)
[+] 15: 00% (263,000)
[+] 16: 00% (159,000)

[*] Character-set:
[+] loweralphanum: 48% (24,128,000)
[+] loweralpha: 20% (10,303,000)
[+] mixedalphanum: 10% (5,026,000)
[+] numeric: 08% (4,428,000)
[+] loweralphaspecialnum: 02% (1,377,000)
[+] upperalphanum: 01% (957,000)
[+] all: 01% (936,000)
[+] mixedalpha: 01% (852,000)
[+] loweralphaspecial: 01% (507,000)
[+] upperalpha: 00% (431,000)
[+] mixedalphaspecial: 00% (147,000)
[+] specialnum: 00% (84,000)
[+] upperalphaspecialnum: 00% (62,000)
[+] upperalphaspecial: 00% (19,000)
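For readers unfamiliar with these labels, here is an approximation of how length and character-set buckets like the ones above can be computed. The label logic is inferred from the names in the output and is not PACK's actual code; `charset_label` is a made-up helper, and the "special" and "empty" fallbacks are assumptions.

```python
import string
from collections import Counter

def charset_label(password: str) -> str:
    """Return a PACK-style character-set label (approximation, ASCII classes)."""
    has_lower = any(c in string.ascii_lowercase for c in password)
    has_upper = any(c in string.ascii_uppercase for c in password)
    has_digit = any(c in string.digits for c in password)
    has_special = any(c not in string.ascii_letters + string.digits for c in password)

    if has_lower and has_upper:
        alpha = "mixedalpha"
    elif has_lower:
        alpha = "loweralpha"
    elif has_upper:
        alpha = "upperalpha"
    else:
        alpha = ""

    if alpha == "mixedalpha" and has_special and has_digit:
        return "all"
    if alpha:
        return alpha + ("special" if has_special else "") + ("num" if has_digit else "")
    if has_special and has_digit:
        return "specialnum"
    if has_digit:
        return "numeric"
    if has_special:
        return "special"
    return "empty"  # zero-length input

sample = ["linkedin", "link2012", "123456", "Linked1!", "LINKED!"]
length_counts = Counter(len(p) for p in sample)
charset_counts = Counter(charset_label(p) for p in sample)
print(length_counts.most_common())
print(charset_counts.most_common())
```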

Update: May 21 18:00 EDT 2016



~49,999,999 unique passwords recovered so far.

~11,863,000 unique hashes left to crack.

Update: May 25 15:40 EDT 2016



Our grid is mostly doing other things now. We have gotten a couple of requests about re-sharing the list, and/or about building some kind of online interface to look up individual credentials. We have no plans to do so.

For more of KoreLogic's talks about password recovery, check out the following videos of KoreLogic employee, and founder of PRS, Rick Redman:

Your Password Complexity Requirements are Worthless - OWASP AppSecUSA 2014

Cracking Corporate Passwords: Why Your Password Policy Sucks

