
Every 11 minutes, Dr. John Heidemann's team "pings" 4 million networks to determine whether they are live, looking for patterns and outliers. If a nation-state shuts down its country's internet access (as Egypt did in 2011), or a hurricane takes out major utilities and communication networks, Heidemann will know what's going on.

PCMag visited him at the Analysis of Network Traffic (ANT) Lab within the USC Information Sciences Institute (ISI), where researchers study both network traffic and topology. Founded in 1972, ISI played a crucial role in the development of the early internet; it was one of the earliest nodes on the ARPANET and the organization that created the Domain Name System (DNS). Today, ISI's 350 researchers conduct research for many federal agencies (including DARPA, Homeland Security, and the Air Force Office of Scientific Research) and partner with companies such as Lockheed Martin, Raytheon, and Northrop Grumman.

"Here in the networks division, we develop network protocols and address issues of cyber security, denial of service attacks, mapping, and data analytics to make the internet more secure," Dr. Heidemann told PCMag.

Dr. Heidemann started in network simulation in the late 1990s, and his work led to the first "internet censuses" in 2003.

"We wanted to see if we could 'measure the internet'," he told PCMag. "We had noticed these anomalies, while realizing the simple act of pinging could be used to detect when networks fail. And this became important for a bunch of things: telecoms policy, networks and, as IoT comes online, embedded devices. Now we can see to the edge of the public internet—to the routers—but not into your house."

In case you're not familiar with the term, pinging has been around since the early internet as a way to verify whether a computer is online: you send a small packet of data (a "ping") to an internet address and wait for a response (sometimes referred to as a "pong"). But why did Professor Heidemann choose 11 minutes, rather than 10, as the sweep interval?
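For the curious, the core idea is simple enough to sketch in a few lines of Python. This is purely an illustration that shells out to a Linux-style ping command, not the ANT Lab's actual census tooling; the addresses and timeout are placeholders.

```python
# Minimal liveness-probe sketch, assuming a Linux-style `ping` binary is on the PATH.
import subprocess

def is_responsive(address: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo request ("ping") and report whether a reply ("pong") came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0  # exit code 0 means at least one reply arrived

if __name__ == "__main__":
    for addr in ["8.8.8.8", "192.0.2.1"]:  # 192.0.2.0/24 is reserved for documentation
        print(addr, "up" if is_responsive(addr) else "no reply")
```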

"Because we don't want to be aligned with human behavior," he explained. "For example, if your computer is scheduled to come on at the top of the hour for two minutes, we didn't want to just hit it at the same time. Having said that, we often analyze the date by hour, so things go back and forth."

If you want to see Dr. Heidemann's work in action, ANT recently went live with a world visualization map.

"We take all the data we get every 11 minutes from 4 million networks and put it into a grid," Dr. Heidemann said. "So you can see the scale of outages when they occur. For example, the size of the circle is how many networks are down in that area and the color is the percentage of all the networks in that area are down—red being 100 percent—which is what we track during disasters."

In his office, he manipulated the map to return to the day Hurricane Irma hit Florida. As the storm intensified, circles around the Florida Keys turned white (a 50 percent network failure rate).

"The neat thing about Irma is we saw the hurricane hit, people lose power temporarily and then they get back on," said Dr. Heidemann. "Which tells us a lot about the resilience of the power grid and internet networks in that area. Of course it was a different story with Puerto Rico and Hurricane Maria. In fact their networks were down for so long that, eventually, when [we] kept re-running the data analysis, suddenly the region 'disappeared' because after pinging the networks and getting no response for over a week, we interpreted that as gone away instead of temporarily down."

The world visualization map was built with OpenLayers, an open-source mapping library, with custom code on top handling the underlying data collection, analysis, and algorithm development.

"We're working really hard now to deploy real-time data," said Professor Heidemann. "The processing is currently done in batch, usually quarterly, unless there's an event, like a natural disaster, where we pull it and do the analysis immediately."

Up next, the ANT Lab will apply its census and outage data to prediction and root-cause analysis. For example: "when a hurricane hits X area, what's the likely outcome for Y servers, how long they'll be affected, the cost to affected businesses and communities, and so on."

"My big goal in 2018 is two things," said Dr. Heidemann. "I want to reach out to 'real people' because we're at the point, when we have real-time data, and the visualization data you've seen, when a network goes down, you should be able to visit—from another computer, obviously—and get more information. That's the citizen-facing side.

"On the science side there's a ton of work to be done on these years of data. We now have to work out where are the weak parts of the internet? How can we identify the dependencies? How can we improve them? We now have a vast data set—80 billion data points. There's knowledge embedded in there and it will take a lot of work, but we'll get there."

Although the ANT Lab is working toward real-time data displays, there are times when, as Dr. Heidemann explained, you don't want to share the data too widely.

"Most of this work has been supported by the Department of Homeland Security, which has a vested interest in a secure homeland. Other agencies want to support this work to get insights into long-term resilience of our communications networks. However, in some of my security work, particularly in denial of service attacks, we don't want to report it publicly so quickly that the attackers are using our work as evidence of a 'good attack.' Luckily hurricanes aren't adversarial in nature, so I don't see any reason why we don't want to make this information available quickly, as soon as we have it ready."