To build a geographical map, one first has to model the Earth's surface, for example, by assuming that it is a sphere. Similarly, we need a geometric model of the Internet space to build our map. The simplest candidate space is also a sphere, or even a circle, on which nodes are uniformly distributed and each pair of nodes is connected by an edge with probability p(d) that decreases as a function of the distance d between the nodes, conceptually similar to random geometric graphs16. However, this model fails to capture basic properties of the Internet topology, including its scale-free node degree distribution. In an earlier study17, we showed that to generate realistic network topologies in this geometric approach, we first have to assign to nodes their expected degrees κ drawn from a power-law distribution, and then connect pairs of nodes with expected degrees κ and κ′ with probability p(χ), where χ is the distance d rescaled by the product of the expected degrees, χ ∼ d/(κκ′). We thus have a hybrid model that mixes geometry and topology: geometric characteristics, the distances d used in random geometric graphs, come in tandem with topological characteristics, the expected degrees κ used in classical configuration models of random power-law graphs18. If we associate the expected degree κ of a node with its mass, then the connection probability p(d/(κκ′)), which is a measure of the interaction strength between two nodes, resembles Newton's law of gravitation. Therefore, we call this model Newtonian. However, according to Einstein, we can treat gravity in purely geometric terms if we accept that the space is no longer flat, that is, if it is non-Euclidean. Following this philosophy, we showed in an earlier study13 that the Newtonian model is isomorphic to a purely geometric network model, with node degrees transformed into a geometric coordinate, making the space hyperbolic, that is, negatively curved. We call this model Einsteinian.
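The Newtonian construction can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the functional form of p(χ), the scale constant `mu`, the exponent values and all function names are hypothetical, since this section does not specify them.

```python
import math
import random

def circle_distance(theta_a, theta_b, radius=1.0):
    """Arc length between two angular positions on a circle."""
    delta = abs(theta_a - theta_b) % (2 * math.pi)
    return radius * min(delta, 2 * math.pi - delta)

def rescaled_distance(d, kappa_a, kappa_b, mu=1.0):
    """chi ~ d / (kappa * kappa'); mu is a hypothetical scale constant."""
    return d / (mu * kappa_a * kappa_b)

def connection_probability(chi, alpha=2.0):
    """Hypothetical decreasing p(chi); the text does not give its form."""
    return 1.0 / (1.0 + chi) ** alpha

def sample_expected_degree(rng, gamma=2.1, kappa_min=1.0):
    """Draw kappa from a power law with density ~ kappa^(-gamma)."""
    u = 1.0 - rng.random()  # uniform in (0, 1]
    return kappa_min * u ** (-1.0 / (gamma - 1.0))

def newtonian_graph(n, seed=0):
    """Place n nodes on a circle, assign expected degrees, draw edges."""
    rng = random.Random(seed)
    thetas = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
    kappas = [sample_expected_degree(rng) for _ in range(n)]
    radius = n / (2 * math.pi)  # keeps the node density along the circle at 1
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d = circle_distance(thetas[i], thetas[j], radius)
            chi = rescaled_distance(d, kappas[i], kappas[j])
            if rng.random() < connection_probability(chi):
                edges.append((i, j))
    return thetas, kappas, edges
```

Because χ shrinks as the product κκ′ grows, two high-degree nodes connect with appreciable probability even when they are far apart on the circle, which is the gravitational analogy drawn above.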

The main property of hyperbolic geometry is the exponential expansion of space illustrated in Figure 1. For example, the area A(r) of a two-dimensional hyperbolic disc of radius r grows with r as A(r) ∼ e^r. Consequently, the uniform node density in a hyperbolic space appears as exponentially growing with the distance r from the origin (see Figure 2, illustrating the Einsteinian model). In the model, nodes are indeed distributed (quasi-)uniformly on a hyperbolic disc, and one can show13 that the resulting average degree of nodes exponentially decreases with r. This combination of two exponentials, node density and average degree, leads to the emergence of a scale-free degree distribution in the network. The model is described in the Methods section, and can generate synthetic scale-free networks with any power-law degree distribution exponent and any clustering. Given a real network, our network mapping method, also in the Methods section, inverts the network synthesis in the model. The method uses statistical inference techniques to identify the hyperbolic coordinates for each node in the given network, which would maximize the likelihood that the network is generated by the model. Specifically, the method attempts to find node positions such that the resulting empirical probability of node connections as a function of the hyperbolic distance between nodes would be congruent with the theoretical connection probability in the model.
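The step from two exponentials to a power law can be made explicit with a short change of variables. This is a sketch under simplifying assumptions: the growth rate α, the decay rate β and the constant k_0 are symbols introduced here rather than values taken from the Methods section, and the degree of a node is treated as a deterministic function of its radial coordinate r, ignoring fluctuations around the average.

```latex
\begin{align}
\rho(r) &\propto e^{\alpha r}, \qquad \bar{k}(r) = k_0\, e^{-\beta r}
  \quad\Longrightarrow\quad r(k) = \frac{1}{\beta}\ln\frac{k_0}{k}, \\
P(k) &= \rho\bigl(r(k)\bigr)\left|\frac{dr}{dk}\right|
  \propto \left(\frac{k_0}{k}\right)^{\alpha/\beta}\frac{1}{\beta k}
  \propto k^{-\gamma}, \qquad \gamma = 1 + \frac{\alpha}{\beta}.
\end{align}
```

Under these assumptions, the power-law exponent is set entirely by the ratio of the two exponential rates.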

Figure 1: Hyperbolic geometry at a glance. The exponentially growing number of people lying on the hyperbolic floor illustrates the exponential expansion of the hyperbolic space. All people are of the same hyperbolic size. The Poincaré tool developed by Bill Horn is used to construct the tessellation of the hyperbolic plane in the Poincaré disc model with the Schläfli symbol {9, 3}, rendering an image of the last author.

Figure 2: Synthetic network in the Einsteinian model. The modelled network illustrates the connection between hyperbolic geometry and scale-free topology of complex networks. All nodes lie within a hyperbolic disc of radius R. The radial node density grows exponentially with the distance from the origin O, whereas the average degree of nodes exponentially decreases. This combination of the exponentially increasing node density and exponentially decreasing average degree yields a power-law degree distribution in the network. The red lines show triangle Oab made of the hyperbolic geodesics (that is, shortest paths in the hyperbolic space) connecting origin O and two nodes a and b. Geodesics Oa and Ob are the solid red lines, whereas geodesic ab is the dashed curve. The thick blue links show the shortest path between nodes a and b in the network.

Mapping results

We apply our mapping method to the Internet AS topology extracted from the Archipelago project data19 in June 2009, and visualize the results in Figure 3. We observe striking similarity between this visualization and the synthetic Einsteinian network in Figure 2. To confirm that the Internet map we have obtained is indeed congruent with the Einsteinian model, we juxtapose in Figure 4 the empirical connection probability between ASs in the obtained Internet map against the theoretical one in Equation (4) of the Methods section. We observe a clear similarity between the two. Neither is the sphere a perfect model of the Earth nor is the Einsteinian model an ideal abstraction of the Internet structure. Yet, the observed similarity between the empirical and theoretical connection probabilities in Figure 4 suggests that hyperbolic metric spaces are reasonable representations of the real Internet space.

Figure 3: Hyperbolic atlas of the Internet. The Internet's hyperbolic map is similar to a synthetic Einsteinian network in Figure 2. The size of AS nodes is proportional to the logarithm of their degrees. For the sake of clarity, only ASs with a degree above 3 and only the connections with probability p(x)>0.5 given by Equation (4) of the Methods section are shown. The font size of the country names is proportional to the logarithm of the number of ASs that the country has. Only the names of countries with more than 10 ASs are included. The methods used to map ASs to their countries are described in Supplementary Methods.

Figure 4: Empirical versus theoretical connection probability. Hyperbolic mapping of the Internet is successful, as the empirical connection probability between ASs of degree larger than 2 in the map closely follows the Einsteinian model prediction. The whole range of hyperbolic distances is binned, and for each bin the ratio of the number of connected AS pairs to the total number of AS pairs falling within this bin is shown. The distances between AS pairs are computed using Equation (3). The blue dashed line is the connection probability given by Equation (4) with R=27 and T=0.69, which are the values used by the mapping method.
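The binning procedure behind Figure 4 can be sketched as follows. The function name, its parameters and the fixed-width binning are illustrative assumptions; the caller is expected to supply the distance function (Equation (3) in the paper) and a node list already restricted to the ASs of interest, for instance those of degree larger than 2.

```python
from itertools import combinations

def empirical_connection_probability(distance, edges, nodes, bin_width):
    """Return {bin_index: connected_pairs / all_pairs} over all node pairs.

    distance:  function (u, v) -> float, e.g. a hyperbolic distance
    edges:     iterable of undirected links (u, v)
    nodes:     iterable of node identifiers to consider
    bin_width: width of each distance bin (hypothetical parameter)
    """
    edge_set = {frozenset(e) for e in edges}
    totals, connected = {}, {}
    for u, v in combinations(list(nodes), 2):
        b = int(distance(u, v) // bin_width)
        totals[b] = totals.get(b, 0) + 1
        if frozenset((u, v)) in edge_set:
            connected[b] = connected.get(b, 0) + 1
    # Bins that contain no pairs are simply absent from the result.
    return {b: connected.get(b, 0) / totals[b] for b in sorted(totals)}
```

Plotting the returned ratios against the bin centres gives the empirical curve that the caption compares with the theoretical connection probability.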

To investigate further the connections between the obtained map and Internet reality, we show in Figure 3 the average angular position of all ASs belonging to the same country, whereas in Figure 5 we draw the angular distributions of those ASs. Surprisingly, we find that even though our mapping method is completely geography agnostic, it discovers meaningful groups or communities of ASs belonging to the same country. Furthermore, in Figure 3, we find many cases of geographically or politically close countries placed close to each other in our hyperbolic map. The explanation of these surprising effects is rooted in the peculiar nature of our mapping method. If ASs belonging to the same country, geographic region or geo-political or economic group are connected more densely to each other than to the rest of the world, then this higher connection density translates to a higher attractive force that tries to place all such ASs close to each other in our map. Indeed, the term $p(x_{ij})^{a_{ij}}$ in Equation (7) of the Methods section corresponds to the attractive force between connected nodes, whereas the term $[1 - p(x_{ij})]^{1 - a_{ij}}$ is the repulsive force between disconnected ones. This interplay between attraction within densely connected regions and repulsion across sparsely connected zones effectively places ASs that belong to densely connected groups close to one another. These observations build our confidence that our mapping method provides meaningful results reflecting peculiarities of the real Internet structure, and suggest that the method can be adapted to discover the community structure20,21,22 in other complex networks.
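Read as factors of a likelihood, the attractive and repulsive terms combine as shown below. This is a sketch of the form that Equation (7) appears to take, inferred only from the two terms quoted in this section and assuming that a_ij is the adjacency indicator (1 for a connected pair, 0 otherwise); the exact expression is given in the Methods section.

```latex
\begin{align}
\mathcal{L} &= \prod_{i<j} p(x_{ij})^{a_{ij}}\,\bigl[1 - p(x_{ij})\bigr]^{1 - a_{ij}}, \\
\ln \mathcal{L} &= \sum_{i<j} \Bigl( a_{ij}\,\ln p(x_{ij})
  + (1 - a_{ij})\,\ln\bigl[1 - p(x_{ij})\bigr] \Bigr).
\end{align}
```

For a connected pair only the first term contributes, and it grows as the pair moves closer, because p decreases with distance. For a disconnected pair only the second term contributes, and it grows as the pair moves apart.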

Figure 5: Angular positions of ASs belonging to the same country. Hyperbolic mapping of the Internet yields meaningful results, as ASs belonging to the same country are mapped close to each other. The angular distributions of ASs in the 30 largest countries in the world are shown. The 'size' of the country is the number of ASs it has. Column a corresponds to the first 15 countries and column b to the next 15. The graph shows the percentage of ASs per bin of size 3.6°. For the majority of countries, their ASs are localized in narrow regions. Exceptions are the United States, the European Union and the United Kingdom. The first two exceptions are because of the significant geographic spread of ASs belonging to the United States or the European Union, the latter actually representing not one country but a collection of countries.

Routing results

The obtained Internet map is ready for greedy forwarding. An AS holding a packet reads its destination AS coordinates, computes the hyperbolic distances between this destination and each of its AS neighbours using Equation (3) of the Methods section and forwards the packet to the neighbour closest to the destination. To evaluate the performance of this process, we perform greedy forwarding from each source to each destination AS, and compute several performance metrics.
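The forwarding rule just described can be sketched in a few lines. Equation (3) is not reproduced in this section, so the distance function below uses the standard hyperbolic law of cosines for curvature −1, which is an assumption about its form. The function names, the `(r, theta)` coordinate tuples and the adjacency-dict representation are likewise illustrative.

```python
import math

def hyperbolic_distance(p, q):
    """Distance between polar points (r, theta) on the hyperbolic plane
    of curvature -1, via the hyperbolic law of cosines."""
    (r1, t1), (r2, t2) = p, q
    # Angular separation folded into [0, pi].
    dtheta = math.pi - abs(math.pi - abs(t1 - t2) % (2 * math.pi))
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(dtheta))
    return math.acosh(max(arg, 1.0))  # clamp guards against rounding below 1

def greedy_forward(adj, coords, src, dst):
    """Forward from src towards dst, always to the neighbour closest to dst.

    Returns (path, delivered). The walk stops as unsuccessful as soon as
    the packet would be handed to an AS that has already seen it.
    """
    path, visited, current = [src], {src}, src
    while current != dst:
        nxt = min(adj[current],
                  key=lambda n: hyperbolic_distance(coords[n], coords[dst]),
                  default=None)
        if nxt is None or nxt in visited:
            return path, False
        path.append(nxt)
        visited.add(nxt)
        current = nxt
    return path, True
```

Each step uses only the coordinates of the current node's neighbours and of the destination, so no node needs a global view of the topology.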

The first metric is success ratio, which is the percentage of greedy paths that successfully reach their destinations. Not all paths are expected to be successful, as some might run into local minima. For example, an AS might forward a packet to its neighbour who sends the packet back to the same AS, in which case the packet will never reach the destination. We declare a path unsuccessful if the packet is sent to the same AS twice. The average success ratio of simple greedy forwarding in our Internet map is remarkably high, 97%, and more sophisticated greedy forwarding techniques, such as those described in Cvetkovski and Crovella study23, can boost it to 100%. Given the discussed connections between our Internet map and geography, one may conjecture that greedy forwarding simply mimics geographical routing following the geographically shortest paths. However, this conjecture is not true. Geography is reflected in our map only along the angular coordinate, whereas the radial coordinate is a function of the AS degree, making the space hyperbolic (see the Methods section). The geographical space is not hyperbolic, and if we use it for greedy forwarding, we obtain a much lower success ratio of approximately 14%. We also tested modified geographic routing that tries to intelligently use AS degrees, in the spirit of our Einsteinian model. Nevertheless, this modification, although improving the success ratio to 30%, still falls short compared with the results obtained using our hyperbolic map. The details of these experiments with geographical routing can be found in Supplementary Methods.

The second metric is stretch, which tells us how much longer the greedy paths are compared with the shortest paths in the Internet topology. The average stretch is low, 1.1. The average hop-wise length of the shortest paths between selected sources and destinations is 3.49, so that the average length of greedy paths is 3.86. The low value of stretch indicates that greedy paths are close to optimal, that is, they are the shortest paths. The shortest path between nodes a and b in Figure 2, for example, is also the path found by greedy forwarding. Somewhat unexpectedly, the greedy stretch is asymptotically optimal, that is, equal to 1, in scale-free, strongly clustered networks, regardless of what underlying space is used for greedy forwarding12. Low stretch also implies that greedy forwarding causes approximately the same traffic load on nodes as shortest-path forwarding. Given that shortest-path forwarding does not lead to high traffic load in scale-free networks24, this finding allays concerns that hyperbolic forwarding may cause traffic congestion abnormalities25 (see Supplementary Methods).
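The two metrics can be computed from greedy path lengths as in the sketch below. The names are hypothetical, and the averaging convention (mean of per-pair ratios over successful paths only) is an assumption, since the text does not spell it out.

```python
from collections import deque

def shortest_hops(adj, src):
    """Breadth-first search: hop distance from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def success_ratio_and_stretch(adj, greedy_hops):
    """Compute (success_ratio, average_stretch).

    greedy_hops maps (src, dst), with src != dst, to the hop count of the
    greedy path, or to None when greedy forwarding failed for that pair.
    """
    ratios = []
    for (src, dst), hops in greedy_hops.items():
        if hops is not None:
            # Stretch of one pair: greedy length over shortest-path length.
            ratios.append(hops / shortest_hops(adj, src)[dst])
    success = len(ratios) / len(greedy_hops)
    stretch = sum(ratios) / len(ratios) if ratios else float("nan")
    return success, stretch
```

For clarity, the sketch reruns the breadth-first search for every pair; caching one search per source would be the obvious optimization for a topology of Internet size.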

The two metrics above characterize the performance of greedy forwarding in the static Internet topology. More important is how greedy forwarding performs in the dynamic topology, in which links and nodes can fail. We randomly select a percentage of links and nodes, remove them from the mapped Internet, recompute the success ratio and stretch after the removal and present the result in the top plots of Figure 6. Even on simultaneous failures of up to 10% of AS links or nodes, catastrophic events that have never happened in Internet history, we observe only minor degradation of the performance of greedy forwarding. That is, even catastrophic levels of damage to the Internet do not significantly affect the performance of greedy forwarding, even though no AS changes its position on the hyperbolic map. A widely popularized feature of complex networks is their robustness with respect to random failures, and the lethality of failures of highest-degree hubs26,27. As expected, we observe in the bottom plots of Figure 6 that removals of such hubs have a more detrimental effect on greedy forwarding as well. However, targeted removal of highest-degree ASs in the Internet is a rather unrealistic scenario, as these large ASs consist of thousands of routers, the simultaneous failure of which is a very unlikely event. The explanation for the surprising efficiency of greedy forwarding with respect to random failures lies in the unique combination of the following two properties exhibited by scale-free, strongly clustered networks: high path diversity24, and congruency between hyperbolic geodesics and topologically shortest paths13,15. The latter is illustrated by the similar path patterns of the hyperbolic geodesic and topologically shortest path between nodes a and b in Figure 2: they both first go to the high-degree core of the network, and then exit it in the appropriate direction to the destination. Owing to high path diversity, there are many disjoint shortest paths between the same source and destination, and thanks to the congruency, they all stay close to the corresponding hyperbolic geodesics. Link and node failures affect some shortest paths, but others remain, and greedy forwarding can still find them using the same hyperbolic map.
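The random link-failure experiment amounts to deleting a fraction of links while leaving every coordinate untouched. A minimal sketch of the removal step follows; the function name, the adjacency-dict representation and the seeding are assumptions made for illustration.

```python
import random

def remove_random_links(adj, fraction, seed=0):
    """Return a copy of an undirected adjacency dict with a random
    fraction of its links removed. Nodes, and hence their map
    coordinates, are left untouched."""
    # Each undirected link appears once, as a sorted pair of endpoints.
    links = sorted({tuple(sorted((u, v))) for u in adj for v in adj[u]})
    rng = random.Random(seed)
    removed = set(rng.sample(links, round(fraction * len(links))))
    return {
        u: [v for v in adj[u] if tuple(sorted((u, v))) not in removed]
        for u in adj
    }
```

Recomputing the success ratio and stretch on the returned topology, with the original coordinates, mirrors the experiment summarized in the top plots of Figure 6.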

Figure 6: Greedy forwarding in the mapped Internet. Greedy forwarding performs almost optimally in the mapped Internet, as indicated by the success ratio, $p_s$, and the average stretch after removal of a given fraction of AS nodes (panel a) or links (panel b). Bottom plots show these two metrics after removing a number of the highest-degree nodes (panel c), and a fraction of links among highest-degree nodes (panel d). The links are first ranked by the product of the degrees of the nodes that they connect, and then a fraction of top-ranked links are removed. The giant connected component is still present after all removals, but it drops to 85% of the original graph after the removal of 10 hubs.

Another form of Internet dynamics is its rapid growth over years4,5,28,29. We map the Internet of January 2007 to its hyperbolic space using the same mapping method, and then replay the historical growth of the Internet up to June 2009 with an interval of 3 months. During this two and a half year replay, we keep the AS coordinates fixed as soon as they are computed, whereas the ASs joining the Internet anew, after January 2007, compute their coordinates using a variation of the mapping method that requires only local topological information (see Supplementary Methods). In Figure 7a, we show the performance of greedy forwarding in the resulting maps at each time step, and observe only minor performance degradation, even over long time scales. In a nutshell, the existing AS coordinates are essentially static, as once computed they can stay the same for years.

Figure 7: Performance of greedy forwarding during the replayed historical growth of the Internet (a), and success ratio as a function of the fraction of missing links (b). The initial map quality degrades very slowly with time (a). The Internet is fully mapped only once, in January 2007. The ASs that appear after that date compute their coordinates using only local topological information. Once the coordinates of an AS are computed, they are fixed forever. The average success ratio $p_s$ and stretch for greedy forwarding in the resulting collection of maps are shown for each snapshot at 3-month intervals, starting from January 2007 and ending in June 2009. See Supplementary Methods for further details. The success ratio also degrades slowly with the number of missing links (b), and if these missing links are added back, the success ratio increases; the larger the number of missing links, the larger the increase. Scenario 1 (blue squares): (i) a fraction of random links among nodes of degree above 5 in the Internet are removed (30% of removed links in this subgraph correspond to 14% of the total number of links in the Internet); (ii) the resulting graph with emulated missing links is hyperbolically mapped using the same mapping method; (iii) the success ratio in the resulting map is computed. Scenario 2 (red circles): (i) and (ii) are the same as in Scenario 1; (iii) the removed links are added back; (iv) the success ratio is computed. See Supplementary Methods for further details.

Existing Internet topology measurements including the Archipelago data19 are known to be incomplete and miss some AS links28,29. Therefore, a natural question is how this missing information affects the quality of the constructed map, and the performance of greedy forwarding in it. Intuitively, as the performance of greedy forwarding is robust with respect to link removals, we might expect it to be robust with respect to missing links as well. Moreover, if the constructed map is used in practice, then greedy forwarding will see and use those links that topology measurements do not see. We might thus also intuitively expect greedy forwarding to perform better in practice than we report in this section, simply because those missing links, when used by greedy forwarding, would provide additional shortcuts between potentially remote ASs. We confirm this intuition in Figure 7b with experiments emulating the missing link issue. The success ratio degrades only slowly as a function of the fraction of missing links, whereas if we add the emulated missing links back, then the success ratio increases as expected. Therefore, the routing results reported here should actually be considered as lower bounds for greedy routing performance that can be achieved in practice using the constructed hyperbolic Internet map.