We still get the same content back! (You can also drop the host header to prove the point that 01rgQ3jqo7L.css definitely isn't on Atlassian's site—you'll get a 404 instead.)

So that's our definitive test for confirming that a domain can be used for domain fronting via CloudFront. We could run this curl command against a long list of domains to see which are frontable, but that would be relatively slow, especially for a big list like the Alexa top million domains...
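To make the "run it against a long list" idea concrete, here is a minimal Python sketch of the same check. It is not the curl command itself: the function names, the `path` and `expected_body` parameters, and the rule "status 200 with an identical body means frontable" are all assumptions made for illustration.

```python
import http.client


def fetch_via_front(front_domain, host_header, path, timeout=10):
    """Connect to front_domain over HTTPS, but send host_header as the Host header."""
    conn = http.client.HTTPSConnection(front_domain, 443, timeout=timeout)
    try:
        # Passing our own Host header stops http.client from adding its default one.
        conn.request("GET", path, headers={"Host": host_header})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def find_frontable(domains, host_header, path, expected_body, fetch=fetch_via_front):
    """Return the domains that serve expected_body when fronting for host_header.

    `fetch` is injectable so the loop can be exercised without touching the network.
    """
    frontable = []
    for domain in domains:
        try:
            status, body = fetch(domain, host_header, path)
        except (OSError, http.client.HTTPException):
            # DNS failures, timeouts, and TLS errors all count as "not frontable".
            continue
        if status == 200 and body == expected_body:
            frontable.append(domain)
    return frontable
```

Each domain costs a full TLS handshake plus a request, which is exactly why this approach gets slow on a million-entry list.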

Method #1

Instead of large scans, we can use some things we know about CloudFront and domain fronting to come up with a way to find frontable domains offline.

CloudFront domains ultimately resolve to a CloudFront edge server IP address. Amazon publishes the CloudFront edge server IP address ranges. We only want to front through HTTPS (otherwise the all-important Host header will be visible), and most CloudFront implementations are going to support HTTPS anyway.
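Amazon publishes its address ranges as a JSON file at https://ip-ranges.amazonaws.com/ip-ranges.json, where each entry is tagged with a service name and the CloudFront ones are marked `CLOUDFRONT`. A rough sketch of pulling those out might look like the following; the function name is made up, and the sample data uses documentation-only address blocks rather than real CloudFront ranges.

```python
import ipaddress
import json

# Tiny stand-in for ip-ranges.json. The addresses are reserved documentation
# blocks, NOT real CloudFront ranges.
SAMPLE_JSON = """
{
  "prefixes": [
    {"ip_prefix": "192.0.2.0/24", "region": "GLOBAL", "service": "CLOUDFRONT"},
    {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "EC2"}
  ],
  "ipv6_prefixes": [
    {"ipv6_prefix": "2001:db8::/32", "region": "GLOBAL", "service": "CLOUDFRONT"}
  ]
}
"""


def cloudfront_networks(ip_ranges_json):
    """Return the CLOUDFRONT prefixes from an ip-ranges.json document as network objects."""
    data = json.loads(ip_ranges_json)
    networks = []
    # IPv4 and IPv6 prefixes live in separate lists with different key names.
    for list_key, prefix_key in (("prefixes", "ip_prefix"), ("ipv6_prefixes", "ipv6_prefix")):
        for entry in data.get(list_key, []):
            if entry.get("service") == "CLOUDFRONT":
                networks.append(ipaddress.ip_network(entry[prefix_key]))
    return networks


if __name__ == "__main__":
    for net in cloudfront_networks(SAMPLE_JSON):
        print(net)
```

To use it on the real data, download the JSON file and pass its text to `cloudfront_networks` in place of `SAMPLE_JSON`.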

Now all we need is a list of domains with IPs that support HTTPS connections... Thankfully censys.io has us covered! Here is a list of domains in the Alexa top million that support TLS on port 443 (I used the alexa-results.csv.lz4 file). CloudFront will always support TLS so that works.

Now we just need to compare the IP of each domain with the CloudFront edge server ranges. I wrote a quick Python script to do just that. It's not the most optimized thing, so it took about 5 minutes to run for me.
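The comparison itself boils down to a membership test against each range. Below is a minimal sketch of that idea, not the script mentioned above: it assumes the domain list has already been reduced to `(domain, ip)` pairs (the actual column layout of the Censys file is not shown here), and the sample ranges are documentation-only blocks standing in for the real CloudFront ones.

```python
import ipaddress

# Placeholder ranges; in practice these would be the published CloudFront prefixes.
SAMPLE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("203.0.113.0/25"),
]


def cloudfront_domains(domain_ips, networks):
    """Return domains whose IP falls inside any of the given networks.

    domain_ips is an iterable of (domain, ip_string) pairs.
    """
    hits = []
    for domain, ip in domain_ips:
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            # Skip rows with a missing or malformed IP address.
            continue
        # Linear scan over the ranges: simple, but not the fastest option.
        if any(addr in net for net in networks):
            hits.append(domain)
    return hits


if __name__ == "__main__":
    rows = [
        ("a.example", "192.0.2.10"),
        ("b.example", "198.51.100.7"),
        ("c.example", "not-an-ip"),
        ("d.example", "203.0.113.100"),
    ]
    print(cloudfront_domains(rows, SAMPLE_NETWORKS))
```

The linear scan over every range for every domain is the unoptimized part; with on the order of a hundred ranges and a million domains it is still only a few minutes of work.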

Not bad, I got 4540 domains!

Note: These aren't all guaranteed to work for domain fronting. If you read on, I confirm which ones work at the end. (Spoiler, it's more than 99%...)

Method #2

Okay, so the top million gives you a lot of solid domain name choices. But what if we could have more? We're going to get more!

This time we're going to key off of a different indicator of CloudFront: CNAME records! When you set up a domain to use CloudFront, you need to add a CNAME record to redirect DNS queries to CloudFront (apparently unless you're using Amazon's Route 53 DNS, but we'll ignore this).

We're going to use a monster of a data set to find all of these—Rapid7's Forward DNS which "contains the responses to DNS requests for all forward DNS names known by Rapid7's Project Sonar". We'll only need the 'A' record version here, which is about 14 GB compressed. And 137 GB uncompressed... Fun!

Parsing is easy, but well, it takes a sec to grep through 1.47 billion lines.
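If you'd rather do the filtering in Python than with grep, a sketch along these lines would work. It assumes the data set is gzip-compressed with one JSON object per line carrying `name`, `type`, and `value` fields, and that a CNAME value ending in `.cloudfront.net` is the indicator we want; the function names are made up for illustration.

```python
import gzip
import json


def cloudfront_cnames(lines):
    """Yield names whose CNAME record points at a *.cloudfront.net host."""
    for line in lines:
        # Cheap substring check first so most lines never reach the JSON parser.
        if "cloudfront.net" not in line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(rec, dict) or rec.get("type") != "cname":
            continue
        value = rec.get("value")
        name = rec.get("name")
        if isinstance(value, str) and isinstance(name, str):
            # Tolerate a trailing dot on fully qualified names.
            if value.rstrip(".").endswith(".cloudfront.net"):
                yield name


def scan_file(path):
    """Stream a gzip-compressed file line by line without unpacking it to disk."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        yield from cloudfront_cnames(fh)
```

Streaming straight from the compressed file avoids needing the 137 GB of disk space for the uncompressed copy, though it will still take a while to get through every line.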