How does coffee shop Internet access work?

You pull out your laptop and type http://www.google.com into the URL bar on your browser. Instead of your friendly search box, you get a page where you pay money or maybe watch an advertisement, agree to some terms of service, and are only then free to browse the web.

What is going on behind the scenes to give the coffee shop that kind of control over your packets? Let's trace an example of that process from first broadcast to last redirect and find out.

Step 1: Get our network configuration

When I first sit down and turn on my laptop, it needs to get some network information and join a wireless network.

My laptop is configured to use DHCP to request network configuration information and an IP address from a DHCP server in its Layer 2 broadcast domain.

This laptop happens to use the DCHP client dhclient . /etc/dhcp3/dhclient.conf is a sample dhclient configuration file describing among other things what the client will request from a DHCP server (your network manager might frob that configuration -- on my Ubuntu laptop, NetworkManager keeps a modified config at /var/run/nm-dhclient-wlan0.conf ).

A DHCP negotiation happens in 4 parts:

Step 1: DHCP discovery. The DHCP client (us, 0.0.0.0 in the screencap) sends a message to Ethernet broadcast address ff:ff:ff:ff:ff:ff to discover DHCP servers (Wireshark shows IP addresses in the summary view, so we see broadcast IP address 255.255.255.255 ). The packet includes a parameter request list with the parameters in the dhclient config file. The parameters in my /var/run/nm-dhclient-wlan0.conf are:

subnet-mask, broadcast-address, time-offset, routers, domain-name, domain-name-servers, domain-search, host-name, netbios-name-servers, netbios-scope, interface-mtu, rfc3442-classless-static-routes, ntp-servers;

00:90:fb:17:ca:4e

192.168.5.1

DHCP servers that get the discovery broadcast allocate an IP address and respond with a DHCP broadcast containing that IP address and other lease information. This is typically a simple race -- whoever gets an offer packet to the requester first wins. In our case, only MAC address (Wireshark shows IP address) answers our discovery broadcast.

Step 3: DHCP request. The DHCP client picks an offer and sends another DHCP broadcast, informing the DHCP servers of the winner and letting the losers de-allocate their reserved IP addresses.

Step 4: DHCP acknowledgment. The winning DHCP server acknowledges completion of the DHCP exchange and reiterates the DHCP lease parameters. We now have an IP address ( 192.168.5.87 ) and know the IP address of our gateway router ( 192.168.5.1 ):

Step 2: Find our gateway

We managed to get a lot done using broadcast packets, but at this point a) nobody in our broadcast domain knows our MAC address, and b) we don't know the MAC address of our gateway, so we can't get any packets routed out to the Internet. Let's fix that:

Before offering us IP address 192.168.5.87 , the DHCP server ( Portwell_17:ca:4 ) sends an ARP request for that address, saying "Who has 192.168.5.87 . If that's you, respond with your MAC address". Since nobody answers, the server can be fairly confident that the IP address is not already in use.

After getting assigned IP address 192.168.5.87 , we ( Apple_8f:95:3f ) double-check that nobody else is using it with a few ARP requests that nobody answers. We then send a few gratuitous ARPs to let everyone know that it's ours and they should update their ARP caches.

We then make an ARP request for the MAC address corresponding to the IP address for our gateway router: 192.168.5.1 . Our DHCP server happens to also be our gateway router and responds claiming the address.

Step 3: Get past the terms of service

Now that we have an IP address and know the IP address of our gateway, we should be able to send packets through our gateway and out to the Internet.

I type http://www.google.com into my browser's URL bar. There is no period at the end, so the local DNS resolver can't tell if this is a fully-qualified domain name. This is what happens:

Looking back at the DHCP acknowledgement, as part of the lease we were given a domain name: sbx07338.cambrma.wayport.net . What our local DNS resolver decides to do with host www.google.com , since it potentially isn't fully-qualified, is append the domain name from the DHCP lease to it (eg www.google.com.sbx07338.cambrma.wayport.net in the first iteration) and try to resolve that. When the resolution fails, it tries appending decreasingly specific parts of the DHCP domain name, finds that all of them fail, and then gives up and tries to resolve plain old www.google.com . This works, and we get back IP address 173.194.35.104 . A whois after the fact confirms that this is Google:

jesstess@pretzel-logic:~$ whois 173.194.35.104 NetRange: 173.194.0.0 - 173.194.255.255 CIDR: 173.194.0.0/16 OriginAS: AS15169 NetName: GOOGLE ... OrgName: Google Inc. OrgId: GOGL Address: 1600 Amphitheatre Parkway City: Mountain View StateProv: CA

173.194.35.104

/

http://nmd.sbx07338.cambrma.wayport.net/index.adp?

MacAddr=00%3a23%3a6C%3a8F%3a95%3a3F&IpAddr=192%2e168%2e5%2e87&

vsgpId=a45946c6%2d737a%2d11dd%2d8436%2d0090fb2004bc&vsgId=93196&

UserAgent=&ProxyHost=

We complete a TCP handshake withand make an HTTP GET request for the resource we want (). Instead of getting back an HTTP 200 and the Google home page, we receive a 302 redirect to

Our MAC address and IP address are conveniently encoded in the redirect URL.

So what is going on here? Why didn't we get back the Google home page?

Our DHCP server/router, 192.168.5.1 , is capturing our HTTP traffic and redirecting it to a special landing page. We don't get to make it out to Google until we finish playing a game with the coffee shop.

Let's dwell on this for a moment, because it's kind of amazing that the way the Internet is designed, our gateway router can hijack our HTTP requests and we can't stop it. In this case, we can see that the URL has changed in our browser after the redirect, but if a malicious gateway were transparently proxying our HTTP requests to an evil malware-laden clone of www.google.com , we'd have no way to notice because there wouldn't be a redirect and the URL wouldn't change.

Worrisome? Definitely, if you're trusting a gateway with sensitive information. If you don't want to have to trust your gateway, you have to use point-to-point encryption: HTTPS, SSH, your favorite IPSec or SSL VPN, etc. And then hope there aren't bugs in your secure protocol's implementation.

Well, ain't nothing to it but to do a DNS lookup on the host name in the redirect ( nmd.sbx07338.cambrma.wayport.ne ) and make our request there:

nmd is a host in the domain from our DHCP lease, so our local resolver's rules manage to resolve it in one try, and we get IP address 98.98.48.198 . This is incidentally the IP address of the DHCP Server Identifier we received with our DHCP lease.

We try our HTTP GET request again there and get back an HTTP 200 and a landing page (still not the Google home page), which the browser renders.

The landing page has some ads and terms of service, and a button to click that we're told will grant us Internet access. That click generates an HTTP POST:

Step 4: Get to Google

98.98.48.198

Having agreed to the terms of service,communicates to our gateway router that our MAC address (which was passed in the redirect URL) should be added to a whitelist, and our traffic shouldn't be captured and redirected anymore -- our HTTP packets should be allowed out onto the Internet.

98.98.48.198 responds to our POST with a final HTTP 302 redirect, this time to starbucks.yahoo.com . We do a final DNS lookup, make our HTTP GET, and get served an HTTP 200 and a webpage with some ads enticing us to level up our coffee addiction. We now have real Internet access and can get to http://www.google.com .

And that's the story! This ability to hijack HTTP traffic at a gateway until terms are met has over the years facilitated a huge industry based around private WiFi networks at coffee shops, airports, and hotels.

It is also a reminder about just how much control your gateway, or a device pretending to be your gateway, has when you use insecure protocols. Upside-down-ternet is a playful example of exploiting the trust in your gateway, but bogus DNS resolution or transparently proxying requests to malicious sites makes use of this same principle.

~jesstess