The holidays open up a block of time to catch up on "I meant to read that" bookmarks, RSS feeds, and all the favorited and forgotten tweets. I made it through 50 before a NormanShark blog post kicked off a research project.

The analysts found a malware sample that used .bit domains in its communications infrastructure, but .bit ... what is that? .bit is a TLD operating outside of ICANN. Some would say its operators are TLD squatting, but I leave that opinion up to the reader.

To catch up on .bit and TLDs being used outside of ICANN's approval, the "Special-Use Domain Names of Peer-to-Peer Systems" memo is a good reference. The memo covers "six peer-to-peer domains that (given their decentralized design) do not require a central authority to register names '.gnu', '.zkey', '.onion', '.exit', '.i2p', and '.bit.'" One could say they are sticking it to the man, where in this case the man is ICANN.

The .bit TLD is the decentralized peer-to-peer namespace for Namecoin. To access domains in this space, you need to configure your OS to use a DNS server that hosts the .bit zone. You can also pass your requests through a proxy, etc.

Having read a bit about what Namecoin and .bit are, I started to wonder: how can I create the .bit zone file? What interfaces are available for querying the state of domains in the namespace? The only way to know is to get your hands dirty, so I spun up an Ubuntu 13.10 small instance in Amazon US East.

I would have gone for something a bit more beefy, but it being on my personal credit card, I wanted to keep the spot price below $0.04 / hr. #Scrooge. I've read many an article about the cost efficiency of mining currencies in AWS, and a small instance isn't going to be generating Namecoin anytime soon (CPU mining is so out).

For now we will hold off on delving into the details of the proof of work computation and focus solely on the Namecoin DNS data.

So here's how it all went down:

1. Install dependencies
2. git clone https://github.com/namecoin/namecoin.git
3. Populate the config file
4. Start the Namecoin daemon

Once the daemon is started, it will create a local copy of the block chain or update an existing one. The daemon also has an interface for querying the Namecoin block chain.
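As a rough sketch of what talking to that interface can look like: namecoind, like bitcoind, answers JSON-RPC calls over HTTP. The snippet below builds and sends a single request using only the Python standard library. The URL, port, credentials, and the name d/example are assumptions for illustration; use whatever you put in your own config file.

```python
import base64
import json
import urllib.request

# Assumed values: match these to the RPC settings in your own config file.
RPC_URL = "http://127.0.0.1:8336/"
RPC_USER = "rpcuser"
RPC_PASSWORD = "rpcpassword"


def build_rpc_payload(method, params=None, request_id=1):
    """Return the JSON-RPC request body as bytes."""
    body = {
        "jsonrpc": "1.0",
        "id": request_id,
        "method": method,
        "params": list(params or []),
    }
    return json.dumps(body).encode("utf-8")


def call_rpc(method, params=None):
    """Send one call to the local daemon and return the "result" field."""
    credentials = f"{RPC_USER}:{RPC_PASSWORD}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(
        RPC_URL,
        data=build_rpc_payload(method, params),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        reply = json.loads(response.read().decode("utf-8"))
    if reply.get("error"):
        raise RuntimeError(reply["error"])
    return reply["result"]


# Example (needs a running daemon; "d/example" is a placeholder name):
#   call_rpc("name_show", ["d/example"])
```

Keeping the payload builder separate from the network call makes it easy to check what would be sent without having a daemon running.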

But what is a block chain? "A block chain is a transaction database shared by all nodes participating in a system based on the Bitcoin protocol. A full copy of a currency's block chain contains every transaction ever executed."

In the case of Namecoin, it includes all the domain registrations, record additions, modifications, etc. Therefore, the Namecoin block chain provides a transactional history for the Namecoin namespace! You can use this block chain explorer site to take a look. Each block has an ID and is made up of transactions and name operations (adding domains, records, etc.). The block chain is the history of blocks.
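To make the "transactional history" idea concrete, here is a toy sketch that replays a list of name operations, oldest block first, to recover the current value of each name. The operation names mirror Namecoin's name_firstupdate and name_update; the tuple layout and sample data are made up for illustration, and name expiry is ignored entirely.

```python
def replay_name_operations(operations):
    """Replay (height, op, name, value) tuples and return {name: latest value}."""
    state = {}
    # Process operations in block-height order, oldest first.
    for height, op, name, value in sorted(operations, key=lambda item: item[0]):
        if op == "name_firstupdate":
            # First registration of the name sets its initial value.
            state[name] = value
        elif op == "name_update" and name in state:
            # Later updates overwrite the value of an already registered name.
            state[name] = value
    return state


# Hypothetical history, deliberately out of order.
sample_history = [
    (120, "name_update", "d/example", '{"ip": "192.0.2.20"}'),
    (100, "name_firstupdate", "d/example", '{"ip": "192.0.2.10"}'),
    (110, "name_firstupdate", "d/other", '{"ip": "198.51.100.5"}'),
]
current_state = replay_name_operations(sample_history)
```

The point is only that the current state of the namespace is derived from the full history, not stored separately.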

As my local copy of the block chain updated, I went back to thinking about the easiest way to explore the .bit zone. I started looking into scripting an interaction with the namecoind block chain (Berkeley DB currently, but a migration to LevelDB is in the works to keep pace with Bitcoin), but then stumbled across a project on GitHub where someone had already done most of the work.

It was NamecoinToBind, a collection of PHP scripts which interface with a local instance of namecoind and generate a BIND zone file. To remove the hard part and get right to some data, I used these scripts to generate the zone file. One thing to keep in mind: I haven't read through the NamecoinToBind code, so the high-level analysis below is based on a sample generated by the scripts and inherits whatever assumptions they make.
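For a feel of what such a conversion involves, here is a minimal sketch of turning one name and its value into zone-file lines. This is not the NamecoinToBind code. It assumes names in the d/ namespace carry a JSON value with "ip", "ip6", and a "map" of sub-labels, and it ignores every other field; the function names are my own.

```python
import json


def _as_list(item):
    """Normalize a missing, single, or list value to a list."""
    if item is None:
        return []
    return item if isinstance(item, list) else [item]


def _records_for(fqdn, value):
    """Emit A/AAAA lines for one name, then recurse into its "map" of sub-labels."""
    lines = [f"{fqdn} IN A {address}" for address in _as_list(value.get("ip"))]
    lines += [f"{fqdn} IN AAAA {address}" for address in _as_list(value.get("ip6"))]
    submap = value.get("map")
    if isinstance(submap, dict):
        for label, sub_value in submap.items():
            if label and isinstance(sub_value, dict):
                lines += _records_for(f"{label}.{fqdn}", sub_value)
    return lines


def name_to_zone_lines(name, raw_value):
    """Turn one d/ name and its raw JSON value into simplified zone-file lines."""
    if not name.startswith("d/"):
        return []  # Only the domain namespace is handled here.
    try:
        value = json.loads(raw_value)
    except ValueError:
        return []  # Values that are not valid JSON are skipped.
    if not isinstance(value, dict):
        return []
    return _records_for(name[2:] + ".bit.", value)


zone_lines = name_to_zone_lines(
    "d/example",
    '{"ip": "192.0.2.10", "map": {"*": {"ip": "192.0.2.10"}}}',
)
```

A real converter also has to deal with NS delegation, aliases, expired names, and malformed values, which is exactly the hard part I was happy to skip.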

Based on the IPs used in A records, what is the geographic distribution of the .bit name space?

260 Distinct Autonomous Systems are represented

427 Netblocks

532 IP addresses

14317 DNS Records of Interest ( A / AAAA records )

Are there certain IPs or Netblocks which have a dominating presence? You betcha!

Count   IP address used in the A record
1008    178.63.16.21
4570    212.232.51.96
4980    91.250.85.116
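A count like this is easy to reproduce from a zone file. The sketch below assumes simple one-record-per-line entries of the form "owner [ttl] [IN] TYPE rdata"; it does not handle multi-line records or lines that omit the owner name, and the sample lines use documentation addresses rather than real data.

```python
from collections import Counter


def parse_record(line):
    """Return (owner, rtype, rdata) for a simple record line, else None."""
    text = line.split(";", 1)[0].strip()  # Drop trailing comments.
    if not text or text.startswith("$"):
        return None  # Blank line or a directive such as $ORIGIN.
    fields = text.split()
    owner, rest = fields[0], fields[1:]
    # Skip the optional TTL and class fields.
    while rest and (rest[0].isdigit() or rest[0].upper() == "IN"):
        rest = rest[1:]
    if len(rest) < 2:
        return None
    return owner, rest[0].upper(), " ".join(rest[1:])


def top_a_record_addresses(lines, limit=3):
    """Count how often each IPv4 address appears in A records."""
    counts = Counter()
    for line in lines:
        record = parse_record(line)
        if record and record[1] == "A":
            counts[record[2]] += 1
    return counts.most_common(limit)


sample_zone = [
    "$ORIGIN bit.",
    "; sample data only",
    "a.bit. IN A 192.0.2.1",
    "*.a.bit. IN A 192.0.2.1",
    "b.bit. 3600 IN A 192.0.2.2",
    "c.bit. IN AAAA 2001:db8::1",
]
top_addresses = top_a_record_addresses(sample_zone)
```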

The records pointing at 212.232.51.96 and 91.250.85.116 show that whoever is behind each address has registered a large collection of popular words and digits, as well as two- and three-letter combinations.

Another pattern they have in common is that a wildcard record is registered for each zone as well. If I had to guess, these represent some of the early adopters of the system who are taking advantage of the namespace gold rush. Another interesting pattern is that a number of registered domains match the pattern below.

xn--[label] IN A [address]
*.xn--[label] IN A [address]

What is this weird pattern, you might ask? Punycode! Punycode encodes Unicode domain labels as ASCII, since domain names in DNS can only contain ASCII characters; encoded labels carry the xn-- prefix. So it looks like both of these groups have the same strategy of also buying up popular Punycode domain names… very clever. (RFC 3492 defines Punycode.)

178.63.16.21 falls into the same bucket in the sense that the domains registered to it fit a similar pattern, with names from 2–12 characters.
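If you want to see what a label looks like before and after encoding, Python's built-in idna codec is enough for a quick look. This is just a sketch using the standard library (which implements the older IDNA 2003 rules); the helper names are my own.

```python
def to_ascii_label(label):
    """Encode one Unicode label to its ASCII (xn--) form."""
    return label.encode("idna").decode("ascii")


def to_unicode_label(label):
    """Decode an ASCII (xn--) label back to Unicode."""
    return label.encode("ascii").decode("idna")


def is_punycode_label(label):
    """True if the label carries the xn-- prefix used for encoded names."""
    return label.lower().startswith("xn--")


encoded = to_ascii_label("b\u00fccher")  # "bücher" -> "xn--bcher-kva"
decoded = to_unicode_label(encoded)      # back to "bücher"
```

Running the decoder over the xn-- names in the zone is a quick way to see which words were being bought up.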

Another interesting insight might be a breakdown of the countries by distinct AS involved in the namespace.

What would be interesting to know from a DNS Operations perspective? Count of record types?

Count   Record type
23      AAAA
49      CNAME
188     DNAME
10889   NS
14290   A

This leads to an interesting comparison of A vs. AAAA record usage ( v4 vs. v6 ).

Only 23 AAAA records… 5 of them are associated with the same FQDN, so really ... 18 distinct IPv6 addresses.
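Counting distinct IPv6 addresses is a little trickier than it sounds, since the same address can be written several ways. A minimal sketch using the standard ipaddress module, with made-up sample data, normalizes the text form before deduplicating:

```python
import ipaddress


def distinct_ipv6(records):
    """Take (fqdn, address_text) pairs; return (distinct addresses, addresses per fqdn)."""
    addresses = set()
    per_name = {}
    for fqdn, text in records:
        try:
            address = ipaddress.IPv6Address(text)
        except ValueError:
            continue  # Skip values that are not valid IPv6 addresses.
        addresses.add(address)
        per_name.setdefault(fqdn.lower(), set()).add(address)
    return addresses, per_name


sample_aaaa = [
    ("a.bit.", "2001:db8::1"),
    ("a.bit.", "2001:0db8:0:0:0:0:0:1"),  # Same address, long form.
    ("b.bit.", "2001:db8::2"),
    ("c.bit.", "not-an-address"),
]
unique_addresses, addresses_by_name = distinct_ipv6(sample_aaaa)
```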

Number of Wildcard records?

Count   Wildcard record type
6412    A
7       CNAME
4       AAAA
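Both the record type breakdown and the wildcard breakdown come out of a single pass over the records. Here is a sketch, assuming the zone has already been parsed into (owner, type) pairs; the sample pairs are invented.

```python
from collections import Counter


def count_record_types(records):
    """Take (owner, rtype) pairs; return (counts per type, wildcard counts per type)."""
    all_counts = Counter()
    wildcard_counts = Counter()
    for owner, rtype in records:
        rtype = rtype.upper()
        all_counts[rtype] += 1
        # A wildcard owner is "*" itself or has "*" as its leftmost label.
        if owner == "*" or owner.startswith("*."):
            wildcard_counts[rtype] += 1
    return all_counts, wildcard_counts


sample_records = [
    ("a.bit.", "A"),
    ("*.a.bit.", "A"),
    ("b.bit.", "NS"),
    ("*.c.bit.", "cname"),
    ("c.bit.", "AAAA"),
]
type_counts, wildcard_type_counts = count_record_types(sample_records)
```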

Then (selfishly), are there any relationships between Dyn and the current Namecoin domains? Of course!