Everything You’ve Been Told About Data Retention Is Wrong

No, your online activity is not being monitored. Data retention does not work that way, goodnight.

Hello, my name is Lance E. McDonald. I spend most of my time on Twitter yelling about computers, anime, and video games, but I actually get paid to create and implement software solutions at an internet provider in Australia. The most recent project I had to spend time on was a script that scrapes through account logs and archives the information required to meet the government’s new data retention laws. You’ve probably heard a lot about these laws in the news lately, and I’m guessing almost the entirety of what you heard has been clickbait-fuelled trash. I thought I’d show you what the data actually looks like, and talk about how this whole thing works.

I’m going to just open with a screenshot of my own personal data that has been retained by my internet provider over the past 10 days. I have accessed thousands of websites, downloaded about 30GB of data and basically used the internet like a typical five-person family would over a ten-day period. I have manually rebooted my modem once in this time, while writing this piece. This is the entirety of the data that the government has access to about my usage over the past ten days:

(Disclosure: as well as identifying information, the IP address field and the “data volume” field have been removed from this screenshot. Data volume shows how much data, measured in bytes, I downloaded over the past 10 days; it was around 30GB.)

Does this look a bit low on detail compared to what you would expect? There’s nothing here other than “Lance turned on his modem, and then turned it off 10 days later” and the next line is “Lance turned his modem back on a few seconds later.” But this is actually what the Attorney-General’s guidelines describe the data as being expected to look like for internet providers. Data items should be “hours to several days, weeks, or longer apart”.

I’ve seen lists online with titles like, “Here’s what you need to do to avoid the new data retention legislation” consisting of VPN services, recommendations to use Tor, and a bunch of other arbitrarily selected pieces of advice that have zero impact on the data that is actually being retained. I’ve even seen a few anti-virus companies leveraging the public fear to try to sell some kind of encryption services. Perhaps if all the garbage being spread about your ISP recording what you do online were actually true, then sure, using a VPN would definitely hide that. But using a VPN to avoid the new data retention laws is like tinting your car windows to stop speed-cameras from hearing conversations inside your car: cars don’t work that way, conversations don’t work that way, and speed cameras don’t work that way; you’re not even close.

Recently we’ve been flooded by popular news reports making claims such as, “The government can tell you’ve been using Facebook Messenger, they just can’t read your conversations”. This simply isn’t true; your internet provider isn’t retaining anything about what services you use online as this isn’t part of the legislation. The ISP will only retain data about services they directly provide to you: they’re providing you a link to the internet, so they need to record the time that link was connected, and then the time it was disconnected. No data about what you’re using that link for is retained, no metadata, nothing.

The government’s new data retention laws require internet providers to remember, for two years, which IP address is assigned to a customer every time that customer’s modem is turned on, and the time at which that same IP address is released from the customer when their modem is turned off again. Also recorded is the location of whatever radio tower/telephone exchange/fibre node to which your modem is actually connected. If you’ve ever looked at a Telstra Detailed Bill, you can see how your internet sessions typically say what town you were in when you were using the internet on your phone. This shows the tower to which your phone’s modem was connected at the time.
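To make that concrete, here is a rough sketch of what one retained session record might look like. This is not my provider’s actual schema: the class name, the field names, the sample values and the 730-day approximation of “two years” are all assumptions made up for illustration, and counting the retention period from the disconnect time is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumption: "two years" approximated as 730 days for this sketch.
RETENTION_PERIOD = timedelta(days=730)


@dataclass(frozen=True)
class SessionRecord:
    """One retained session: who held which IP address, when, and via which node.

    All field names are hypothetical, not a real provider schema.
    """
    customer_id: str
    ip_address: str
    connected_at: datetime
    disconnected_at: Optional[datetime]  # None while the modem is still online
    access_point: str  # tower, exchange or fibre node the modem connected to


def must_still_retain(record: SessionRecord, now: datetime) -> bool:
    """Return True while the record is inside the assumed retention window."""
    if record.disconnected_at is None:
        return True  # session still open, so the record is still live
    return now - record.disconnected_at <= RETENTION_PERIOD


# A ten-day session, using a documentation-range IP address.
record = SessionRecord(
    customer_id="CUST-0001",
    ip_address="203.0.113.7",
    connected_at=datetime(2015, 10, 13, 9, 0),
    disconnected_at=datetime(2015, 10, 23, 9, 0),
    access_point="EXAMPLE-EXCHANGE",
)
```

Note what isn’t in there: no websites, no destinations, no list of services. Ten days of heavy usage collapses into a single row.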

The other aspects of telecommunication data retention revolve around telephone calls, SMS messaging, and email transmission. Not much is changing with regard to phone calls and SMS; your provider will continue to keep a list of every phone number you call and how long you speak to those people, as has always been the case. The same goes for SMS: every time you send a message, the number to which you sent it is retained in a database. The only new requirement is that the data is now kept for two years. Previously it did not need to be retained, and providers only did so for billing purposes.

I will say this, though: email data retention is changing quite a lot, and is far more aggressive. The legislation hasn’t been completely clear on the matter, but it’s likely that it will be treated similarly to SMS, and every time you send an email your provider will record the transaction for two years, albeit discarding the body of the email. Please don’t use your internet provider’s email service if you have privacy concerns. Use Gmail or Outlook.com if you’re not using business class services already.

A huge part of the misconception about data retention equating to internet surveillance is the fact that the legislation requires that your telecommunications service provider retain data on “the destination of a communication”, and this is indeed one of the key data points being recorded by all service providers… except internet providers:

So, as is mentioned above, it’s worth taking a quick look at section 187A of the recently distributed Telecommunications (Interception and Access) Act 1979 where we can see that the intention has never been to perform surveillance.

The whole thing might bring to mind recent cases where end-users have downloaded copyrighted materials and the rights-holders have managed to subpoena customer information from the internet provider. How does this work? Well, rights-holders tend to hang out in public torrent swarms watching people seeding their intellectual property, and they take note of every IP address engaging in the illegal activity. Then they send annoying emails to the ISPs that own those IP addresses, insisting they forward email warnings to their customers.

Most ISPs put these emails in the trash, the logic being that if the rights-holder wants legal action they should be speaking to the police, not an internet provider. The rights-holders aren’t approaching internet providers and saying, “Tell us everyone who pirated our movie”, because the internet provider doesn’t retain data about what their customers do online; they’re saying, “We saw these people pirating our movie and we want you to tell them to stop.” As has always been the case, if you’re seen breaking the law, you’ll probably be identified. If you break the law but no one sees it happen, data retention won’t help anyone catch you (the moral grounds for pirating Game of Thrones are obviously a whole different kettle of fish).

Eventually, one rights-holder, someone to do with the movie Dallas Buyers Club, got sick of internet providers throwing their emails in the trash and took the providers to court. The court decided that, in this case, the rights-holder should be allowed to speak to the customers directly.

In the end, nothing much came of it. Things might be changing on this matter in the near future, as providers will likely soon be required to send customer details directly to the rights-holder on a three-strike system so the rights-holder can send a scary email directly to the customer. This is unrelated legislation, though. And besides, you probably use private trackers anyway, don’t you?



So what’s the point of this data that’s being retained? Does it have anything to do with terrorism? Probably not. In my experience, the data is only used in child pornography cases. Typically the process goes that the police will raid an illegal pornography server and get physical access to the machine. Inside the machine, they find a list of every IP address that has ever connected to it, thus they have a list of every IP address that committed the crime of accessing that pornography server. The police contact the internet providers that own those IP addresses, and the internet providers look in their data retention logs to see which customers were assigned those IP addresses at those times. The internet provider then hands that list of customers to the police.
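The lookup the provider performs in that process can be sketched in a few lines. Again, this is an illustration under assumptions rather than anyone’s real system: the `Session` class, the `customers_for` function and the sample rows are all hypothetical. The point is that the question being answered is “who held this IP address at this moment?”, which is why the timestamp matters as much as the address.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Session:
    """Hypothetical retained session row (names are assumptions)."""
    customer_id: str
    ip_address: str
    connected_at: datetime
    disconnected_at: Optional[datetime]  # None if the session is still open


def customers_for(sessions: List[Session], ip_address: str,
                  seen_at: datetime) -> List[str]:
    """Return the customers whose session held ip_address at seen_at."""
    matches = []
    for s in sessions:
        if s.ip_address != ip_address:
            continue
        if seen_at < s.connected_at:
            continue  # address was assigned after the sighting
        if s.disconnected_at is not None and seen_at >= s.disconnected_at:
            continue  # address had already been released
        matches.append(s.customer_id)
    return matches


# The same address handed to two different customers at different times.
sessions = [
    Session("CUST-0001", "203.0.113.7",
            datetime(2015, 10, 1, 8, 0), datetime(2015, 10, 5, 8, 0)),
    Session("CUST-0002", "203.0.113.7",
            datetime(2015, 10, 5, 8, 5), None),
    Session("CUST-0003", "203.0.113.9",
            datetime(2015, 10, 1, 8, 0), None),
]

print(customers_for(sessions, "203.0.113.7", datetime(2015, 10, 3, 12, 0)))
```

Because addresses get reassigned, an IP address on its own identifies nobody; it only becomes a customer once it is paired with a time that falls inside a retained session.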

This actually happens, and has been happening for years. Most internet providers have already been retaining this data the whole time.

You might have heard that a number of internet providers have been granted an 18-month extension on their data retention obligations. This is typically due to the bureaucratic process more than anything else. The majority of internet providers already met their data retention obligations years ago, and now we’re just seeing the government finally put a strict rule set on exactly how this is meant to be done.

It can be very exciting to imagine that the world works in a way where the government is some malevolent, all-powerful force capable of seeing and attempting to control what you do. But the internet is still primarily outside the government’s reach, despite what rival political parties will pin on each other or what the media will say to trick you into clicking on their ads. Even your provider doesn’t have the technology to control what you do with the internet. When was that internet filter coming, again? Was it six months ago, or seven years ago? There have been a few now, haven’t there? The government doesn’t understand the internet and is doing enough terrible things every day that we don’t have to make up any extra stuff.

And please stop saying “metadata”, this isn’t CSI: Cyber.

You can follow Lance E. McDonald on Twitter here.