Edit: I'm putting this up front as a lot of people are asking for it - the hearing will be live-streamed on YouTube and there's already an embedded video on the hearing page.

There's a title I never expected to write! But it's exactly what it sounds like: on Thursday next week, I'll be up in front of the US Congress on the other side of the world, testifying about the impact of data breaches. It's an amazing opportunity to influence decision makers at the highest levels of government and frankly, I don't want to stuff it up, which is why I'm asking the question - what should I say?

For a bit more context, I've been chatting with folks from the House Energy and Commerce Committee for a while now about the mechanics of data breaches. Obviously, the work I've been doing with Have I Been Pwned (HIBP) has given me a heap of insight into this specific area of infosec over the last 4 years and the folks from DC felt my views on things might be helpful. That was all great and I was happy to share my thoughts from the other side of the world. Then, a few weeks ago, they reached out and, per the title of this post, said "Hey, how about testifying in front of Congress in Washington DC?" Uh, you know I'm Australian, right? And yes, they knew of my funny accent and the fact that I was on the other side of the world but now, here we are.

The hearing is actually to look at the current challenges facing identity verification and obviously the prevalence of data breaches is having a pretty serious impact on that. For example, if your bank verifies that you are indeed who you say you are by asking you for your date of birth yet that's appeared in a data breach, how sound is it as a knowledge-based authentication (KBA) attribute? That might have worked years ago when we had these small, isolated silos of information that were relatively self-contained, but fast forward to "The Age of the Data Breach" and things are rather different.

The mechanics of the hearing essentially work like this:

1. I'll submit a written testimony that's a few thousand words long a couple of days before the hearing
2. On the day of the hearing, I'll read a 5-minute oral testimony
3. Two other expert witnesses (specialising in other areas of identity verification) will also read 5-minute testimonies
4. Congressmen and congresswomen will ask us questions relating to our testimonies for a couple of hours after that

All of this process is completely transparent and open to the public: my written testimony will be made available publicly 48 hours before the hearing (I'll re-post it on this blog too), the whole thing will be broadcast live on the day and members of the public can attend in person if they wish (ping me if you're in the area and plan on coming). It'll also all be recorded and available for viewing later on. If you're interested in what a congressional hearing looks like, check out this hearing on the opioid crisis (deep link to when expert witnesses are introduced). There's a lot of folks in suits, which is just one of many things I'm having to adapt to given my existing professional attire consists entirely of jeans and black t-shirts!

I've already drafted up both my written statement and oral testimony but since I have a few days before I need to submit anything, I really wanted to reach out to the community and share a very broad overview of what I have planned. Many of the people who read this blog have had first-hand experience with data breaches themselves either by having their personal info exposed, working for a company that's been breached, sending me data they've seen circulating or in some cases even being, well, let's call them "person 0" to have seen the breach. I'd love to know what you think is important for the folks in Washington to hear, keeping in mind this all has to be pretty high level. Let me share a broad overview of my key points (most of which you'll have seen me comment on before), then I'd love your comments:

1. Data breach vectors: There's malicious hacking which people most frequently think of, but there's also the growing prevalence of exposed DBs and backups. The former is frequently due to well-known vulnerabilities and sloppy coding, the latter is usually misconfigured environments.
2. Data breaches can take years to discover: Particularly in 2016 and 2017, we've seen incidents from many years earlier suddenly emerging. Extending that observation, we simply have no idea how many other incidents have already occurred and are yet to come to light.
3. We've created a perfect storm of data exposure: Easily accessible cloud services at a very low price point combined with an explosion in the number of online services collecting data plus the emergence of IoT thrown in as well means a rapidly increasing attack surface.
4. Data maximisation is the norm: Services want to collect as much data as they can from users for all sorts of reasons beyond what's necessary for the function of the service. It's also data that doesn't get purged; sign up to a forum now and give them your date of birth (because you know, they might want to send you a birthday message) and that data will still be there a decade from now.
5. Breaches are really extensively redistributed: It still blows me away how rampant this is. Obviously, there's distribution for commercial purposes (BTC in exchange for data), but there's also an enormously active trading scene where people (often kids) are just swapping data.
6. The immutability of exposed data attributes: The problem with KBA is the assumption that knowledge alone can be used for verification. When we're talking about "static" KBA (that is, knowledge that doesn't change), when your mother's maiden name is leaked you're going to have a hard time of it because you can't change that like you can a password.
7. The irrevocability of exposed data: This speaks to the question I so frequently hear: "Can you remove my data from the internet?" No, it's near on impossible and once that data starts spreading, the data breach genie never goes back into the bottle.
8. The power of OSINT data: I want to make a really firm point here that whilst data breaches are terrible and they're leaking a lot of our info, we've also become pretty good at doing it ourselves. Open source intelligence data is all over the place, especially due to social media, and the very attributes we so frequently share are the ones being used for KBA ("Happy birthda... ah crap").
9. Aggregation of multiple sources compounds the problem: When I look at data loaded into HIBP, I frequently see multiple different data points on the same individual exposed in different breaches. When you aggregate these together, you get a much richer data set on the individual, especially once combined with that OSINT data as well.
10. The resultant impact on KBA: All of the above culminates in the feasibility of KBA no longer being what it once was. What worked in the '90s simply doesn't translate to an era in which so much of our data is exposed so extensively.

Incidentally, I've decided not to mention specific data breaches but rather to focus on the patterns we're seeing in the industry. If I'm asked for examples then there's certainly no shortage to choose from, but I felt it was better to focus on the patterns rather than specific organisations' shortcomings.

So that's what's been going on in my world and come next week, I'll be sitting there on the other side of the world in the most formal environment I've ever been in talking to very important people and saying "pwned" a lot. It's an amazing opportunity to position infosec and data breaches in front of lawmakers and it's one I want to make the most of. Please do share your thoughts in the comments below and help me ensure the right issues get the airtime they deserve.