Basic features of the motion

The position of each player in the universe, namely the ID number of the sector where the player is currently situated, is logged once a day. In this way the motion of each player becomes a time series of 1,000 sector positions. A jump occurs when a player's sector position changes from one day to the next. The associated length d of a jump is measured in terms of graph distance, an integer value between 1 and d_max = 27. The probability distribution of jump distances, computed for all players over the whole observation period, is reported in Fig. 2 (a). For d ≤ 15, the distribution is well fitted by an exponential:

$$P(d) \sim e^{-d/\lambda},$$

with a characteristic jump length λ ≈ 3. The existence of a typical travel distance, as also recently found in other mobility data [5,19], is related to the use of a single transportation mode in Pardus [31]. This allows us to disentangle the intrinsic heterogeneity of the players from the effects due to the presence of different means of transportation [9], which might be the cause of the scale-free distributions found in mobile phone or other mobility data sets [14,16]. It has in fact been suggested that power laws in distance distributions of movement data may emerge from the coexistence of different scales [1,32].
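The jump-length measurement described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names, the precomputed all-pairs distance table `dist`, and the use of a least-squares line on log P(d) to estimate λ are all assumptions made here for clarity.

```python
import math
from collections import Counter

def jump_lengths(positions, dist):
    """Graph distances of all day-to-day sector changes in one daily series."""
    return [dist[a][b] for a, b in zip(positions, positions[1:]) if a != b]

def jump_distribution(lengths):
    """Empirical probability of each jump length."""
    counts = Counter(lengths)
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())}

def fit_characteristic_length(pdf, d_cut=15):
    """Estimate lambda from a straight-line fit of log P(d) against d.

    Assumption: an ordinary least-squares fit restricted to d <= d_cut;
    the source does not state which fitting procedure was used.
    """
    pts = [(d, math.log(p)) for d, p in pdf.items() if d <= d_cut and p > 0]
    n = len(pts)
    mean_d = sum(d for d, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    slope = (sum((d - mean_d) * (y - mean_y) for d, y in pts)
             / sum((d - mean_d) ** 2 for d, _ in pts))
    return -1.0 / slope  # P(d) ~ exp(-d / lambda) implies slope = -1 / lambda
```

Pooling the output of `jump_lengths` over all players before calling `jump_distribution` corresponds to the "computed for all players" distribution in the text.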

Figure 2 Distribution of jump distances and of waiting times. To each player a time series consisting of the sector positions over 1,000 days is associated. A jump is said to occur when the sector position in the time series changes from one day to the next. The length d of a jump is measured in terms of graph distance and can take an integer value between 1 and d_max = 27, the diameter of the network. (a) The probability distribution of jump distances is reported in a semi-log plot. For d ≤ 15, the distribution follows an exponential with a characteristic length λ ≈ 3. Players can also remain in the same sector for several days, without moving to other sectors. We define the waiting time Δt as the number of consecutive days a player spends in a single sector. (b) The probability distribution of waiting times Δt in a log-log plot, which is well fitted by a power law $P(\Delta t) \sim \Delta t^{-\beta}$, with β ≈ 2.2.

In some cases, players stay in the same sector for a number of consecutive days. For instance, 11 of the 1458 considered players, although active in the game, never jump within the entire observation period. On average, a player does not change sector on approximately 75% of the days. To better characterize the motion, we computed the waiting times Δt (measured in number of days) between all pairs of consecutive jumps, over all players. The distribution of these waiting times, shown in Fig. 2 (b), follows a power law:

$$P(\Delta t) \sim \Delta t^{-\beta},$$

with an exponent β ≈ 2.2, in agreement with other recent measurements on human dynamics [33]. In addition, we found that the average waiting times of individual players are distributed as a power law (see Supplementary Fig. 2). This implies a strong heterogeneity in the motion of different players, which is related to the heterogeneity in their general activity (see Supplementary Section S1 and Supplementary Fig. 1).
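As a rough sketch of how waiting times could be extracted from a daily position series, consider the following. The function names are hypothetical, and the exponent estimator is the standard continuous-approximation maximum-likelihood formula for discrete power-law data; the source does not say how β was actually fitted, so that part is an assumption.

```python
import math

def waiting_times(positions):
    """Days elapsed between consecutive jumps in one daily sector series."""
    jump_days = [t for t in range(1, len(positions))
                 if positions[t] != positions[t - 1]]
    return [b - a for a, b in zip(jump_days, jump_days[1:])]

def stationary_fraction(positions):
    """Fraction of day-to-day transitions in which the sector does not change."""
    stays = sum(1 for a, b in zip(positions, positions[1:]) if a == b)
    return stays / (len(positions) - 1)

def powerlaw_exponent_mle(samples, x_min=1):
    """Approximate maximum-likelihood exponent for P(x) ~ x^(-beta), x >= x_min.

    Assumption: uses the continuous approximation
    beta = 1 + n / sum(ln(x / (x_min - 0.5))), not the authors' method.
    """
    xs = [x for x in samples if x >= x_min]
    return 1.0 + len(xs) / sum(math.log(x / (x_min - 0.5)) for x in xs)
```

A waiting time of 1 here means jumps on two consecutive days; a player who never jumps contributes no waiting times at all.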

Mobility reveals socio-economic clusters

Mobility patterns are influenced by the presence of the socio-economic regions in the network, highlighted in colours in Fig. 1. The typical situation is illustrated in Fig. 3 (a), with jumps within the same cluster being preferred to jumps between sectors in different clusters. To quantify this effect, we report in Fig. 3 (b) (blue circles) the observed number of jumps of length d within the same cluster, divided by the total number of jumps of length d. This ratio is a decreasing function of the distance d and reaches zero at d = 12, since no sectors at such a distance belong to the same cluster. As a null model we report the fraction of sector pairs at distance d which belong to the same cluster (red squares in the same figure). The significant discrepancy between the two curves indicates that players indeed tend to avoid crossing the borders between clusters. For example, a jump of length d = 8 from one sector to another sector in the same cluster is expected in only 3% of the cases, while it is observed in about 20% of the cases.

Now, the propensity of a player to spend long time periods within the same cluster might be simply related to the topology of the network, as in the case of random walkers whose motions are constrained on graphs with strong community structures [34]. Nodes belonging to the same cluster are in fact either directly connected or at a short distance from one another. This proximity is reflected in the block-diagonal structure of the adjacency matrix A and of the distance matrix D, shown in Fig. 4 (a) and (b) respectively. We have therefore checked whether the presence of the socio-economic clusters originally introduced by the developers of the game can be derived solely from the structure of the network. To this end we adopted standard community detection methods based on the adjacency and on the distance matrix [35,36]. The results, reported in Fig. 4 (d) and (e) respectively, show that the detected communities deviate significantly from the clusters, implying that in our online world the socio-economic regions cannot be recovered merely from topological features. For comparison we considered the player transition count matrix M, shown in Fig. 4 (c), which displays a block-diagonal structure similar to that of A and D, but with the qualitative difference that it contains dynamic information on the system. Figure 4 (f) shows that community detection methods applied to the transition count matrix M reveal almost perfectly all the socio-economic areas of the universe. This finding demonstrates that mobility patterns contain fundamental information on the socio-economic constraints present in a social system. Therefore, a community detection algorithm applied to raw mobility information, such as the one proposed here, is able to extract the underlying socio-economic features, which are instead invisible to methods based solely on topology. For a detailed treatment of the adopted community detection methods and measures see Supplementary Section S4, Supplementary Table II and Supplementary Figs. 4 and 5.
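The comparison between the observed same-cluster fraction and its null model can be illustrated as follows. This is only a sketch under stated assumptions: `jumps` is a hypothetical list of (origin, destination) sector pairs pooled over all players, `dist` is a precomputed table of graph distances, and `cluster` maps each sector to its predefined cluster label.

```python
from collections import defaultdict

def same_cluster_fraction_observed(jumps, dist, cluster):
    """Per distance d: share of observed jumps that stay inside one cluster."""
    same, total = defaultdict(int), defaultdict(int)
    for i, j in jumps:
        d = dist[i][j]
        total[d] += 1
        same[d] += int(cluster[i] == cluster[j])
    return {d: same[d] / total[d] for d in sorted(total)}

def same_cluster_fraction_null(dist, cluster):
    """Per distance d: share of all sector pairs at distance d in one cluster."""
    same, total = defaultdict(int), defaultdict(int)
    n = len(dist)
    for i in range(n):
        for j in range(i + 1, n):
            d = dist[i][j]
            total[d] += 1
            same[d] += int(cluster[i] == cluster[j])
    return {d: same[d] / total[d] for d in sorted(total)}
```

If clusters had no effect on mobility, the two dictionaries would agree at every distance, up to sampling noise; an observed value well above the null value at a given d indicates a preference for staying within the cluster.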

Figure 3 Influence of socio-economic clusters on mobility. (a) Sketch of jump patterns from a sector i to sectors within the same cluster, j and l, and to sectors in a different cluster, j′ and l′. Although sectors j′ and l′ have the same graph distance from sector i as sectors j and l respectively, transitions across cluster borders have smaller probabilities. (b) Quantitative evidence of the tendency of players to avoid crossing borders. Red squares show the null model, i.e. the fraction of all pairs of sectors at a given distance d that are in the same cluster. Blue circles show the fraction of measured jumps leading into the same cluster, per distance. Coincidence of the two curves would indicate that clusters have no effect on mobility. Clearly this is not the case: there is a strong tendency of players to avoid crossing the borders between clusters.

Figure 4 Extracting communities from network topology and from mobility patterns. (a) The adjacency matrix A of the universe network, (b) the matrix D of shortest path distances and (c) the matrix M of transition counts of player jumps. Each of the three matrices contains 400 × 400 entries, whose values are colour-coded. Sector IDs are ordered by cluster, resulting in the block-diagonal form of the three matrices. We have used modularity-optimization algorithms to extract community structures from the information encoded in the three matrices. Different node colours represent the different communities found, while the 20 different colour-shaded areas indicate the predefined socio-economic clusters as in Fig. 1. The displayed Fowlkes and Mallows index quantifies the overlap of the detected communities with the predefined clusters. The closer the index is to 1, the better the match, see Supplementary Section S4. (d) Although the information contained in the adjacency matrix A allows 18 communities to be found, a number close to the real number of clusters, the extracted communities do not correspond to the underlying colour-shaded areas. (e) Extracting communities from the distance matrix D results in only 6 different groups. (f) The 23 communities detected using the transition count matrix M reproduce almost perfectly the real socio-economic clusters, with only a few mismatched nodes detected as additional clusters. For more measures quantifying the match of communities, see Supplementary Table II.
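For readers unfamiliar with the Fowlkes and Mallows index mentioned in the caption, the following sketch computes it from two label assignments using its standard pair-counting definition. The function name and the convention of returning 0 when either partition contains no co-clustered pair are choices made here, not taken from the source.

```python
from itertools import combinations
from math import sqrt

def fowlkes_mallows(labels_a, labels_b):
    """Pair-counting overlap between two partitions of the same nodes.

    FM = TP / sqrt(pairs_a * pairs_b), where TP counts node pairs grouped
    together in both partitions, and pairs_a / pairs_b count pairs grouped
    together in each partition separately.
    """
    tp = pairs_a = pairs_b = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        tp += int(same_a and same_b)
        pairs_a += int(same_a)
        pairs_b += int(same_b)
    if pairs_a == 0 or pairs_b == 0:
        return 0.0  # convention chosen here for degenerate partitions
    return tp / sqrt(pairs_a * pairs_b)
```

Here `labels_a` would hold the predefined cluster of each sector and `labels_b` the community assigned by the detection algorithm; label names themselves do not matter, only which sectors are grouped together.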

A long-term memory model

To characterize the diffusion of players over the network, we have computed the mean square displacement (MSD) of their positions, σ²(t), as a function of time. The results reported in Fig. 5 (a) indicate that, for long times, the MSD increases as a power law:

$$\sigma^2(t) \sim t^{\upsilon},$$

with an exponent υ ≈ 0.26. This anomalous subdiffusive behaviour is not a simple effect of the topology of the Pardus universe. In fact, as shown in Fig. 5 (b), gray stars, the simulation of plain random walks on the same network produces a standard diffusion with an exponent υ ≈ 1 up to t ≈ 100 days and then a rapid saturation effect which is not present in the case of the human players.
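A minimal sketch of an MSD computation on a network, together with a plain random walker for comparison, is given below. It assumes that displacement is measured as the graph distance from each walker's first recorded position and that the average runs over walkers only; the source does not spell out these details, and the function names are hypothetical.

```python
import random

def mean_square_displacement(trajectories, dist, max_lag):
    """MSD at t = 1..max_lag, using graph distance from the starting sector.

    Assumption: every trajectory is longer than max_lag.
    """
    msd = []
    for t in range(1, max_lag + 1):
        squared = [dist[traj[0]][traj[t]] ** 2 for traj in trajectories]
        msd.append(sum(squared) / len(squared))
    return msd

def random_walk(neighbors, start, steps, rng):
    """Plain random walk: each day move to a uniformly chosen neighbour."""
    path = [start]
    for _ in range(steps):
        path.append(rng.choice(neighbors[path[-1]]))
    return path
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps simulated walks reproducible. On a finite network the simulated MSD cannot exceed the squared diameter, which is the saturation effect described in the text.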

Figure 5 Diffusion scaling in empirical data and simulated models. (a) The mean square displacement (MSD) of the positions of players follows a power relation $\sigma^2(t) \sim t^{\upsilon}$ with a subdiffusive exponent υ ≈ 0.26. The inset shows the average probability for a player to return after τ jumps to a sector previously visited. The curve follows a power law with an exponent of α ≈ 1.3 and an exponential cutoff. (b) For comparison, the MSD for various models of mobility. For random walkers and in the case of a Markov model with transition probability $\pi_{ij} = m_{ij}/\sum_l m_{il}$ we observe an initial diffusion with an exponent υ ≈ 1 and then a rapid saturation of σ²(t), due to the finite size of the network. A preferential return model also shows saturation and does not fit the empirically observed scaling exponent υ. Conversely, a model with long-time memory (Time Order Memory) reproduces the exponent almost perfectly. Such a model makes use of the empirically observed return time distribution, while the Markov model and the preferential return model over-emphasize preferences for locations visited long ago and do not recreate the empirical curve well. Curves are shifted vertically for visual clarity.

Insights from the previous section suggest that the anomalous diffusion behaviour might be related to the tendency of players to avoid crossing borders. We have therefore considered a Markov model in which each walker moves from a current node i to a node j with a transition probability $\pi_{ij} = m_{ij}/\sum_l m_{il}$, where $m_{ij}$ is the number of jumps between sector i and sector j, as expressed by the transition count matrix M of Fig. 4 (c). The probabilities $\pi_{ij}$ are the entries of the transition probability matrix Π, which contains all the information on the day-to-day movement of real players, such as the preference to move within clusters, the length distribution of jumps, as well as the tendency to remain in the same sector. Despite the large amount of detailed information used (the matrix Π has 160,000 elements), the Markov model fails to reproduce the asymptotic behaviour of the MSD, see magenta diamonds in Fig. 5 (b). Since the model considers only the position of the individual at the current time to determine its position at the following time, the deviations from empirical data are presumably due to the presence of higher-order memory effects [37]. For this reason we have considered the recently proposed preferential return model [21], which incorporates a strong memory feature. The model is based on a reinforcement mechanism which takes into account the propensity of individuals to return to locations they visited frequently before. This mechanism is able to reproduce the observed tendency of individuals to spend most of their time in a small number of locations, a tendency which is also prevalent in the mobility behaviour of Pardus players (see Supplementary Fig. 3). However, the implementation of the preferential return model on the Pardus universe network is not able to capture the scaling patterns of the MSD, as shown in Fig. 5 (b).
The reason is that in the model the probability for an individual to move to a given location does not depend on the current location, nor on the order of previously visited locations. Instead, we observe that in reality individuals tend to return with higher probability to sectors they have visited recently and with lower probability to sectors visited a long time before. Consequently a sector that has been visited many times but with the most recent visit dating back one year has a lower probability to be visited again than a sector that has been visited just a few times but with the last visit dating back only one week.
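The memoryless Markov model discussed above amounts to row-normalizing the transition count matrix and sampling from the resulting rows. The sketch below illustrates this under two assumptions not stated in the source: day-to-day stays are counted on the diagonal of M, and a walker in a sector with no recorded outgoing transitions simply stays put. All function names are hypothetical.

```python
import random

def transition_counts(trajectories, n_sectors):
    """Count day-to-day transitions i -> j (stays land on the diagonal)."""
    M = [[0] * n_sectors for _ in range(n_sectors)]
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            M[a][b] += 1
    return M

def transition_probabilities(M):
    """Row-normalize counts: pi_ij = m_ij / sum_l m_il."""
    P = []
    for i, row in enumerate(M):
        total = sum(row)
        if total == 0:
            # Assumption: with no observed departures, the walker stays put.
            P.append([1.0 if j == i else 0.0 for j in range(len(row))])
        else:
            P.append([m / total for m in row])
    return P

def simulate_markov(P, start, steps, rng):
    """Sample a walk in which each step depends only on the current sector."""
    sectors = list(range(len(P)))
    path = [start]
    for _ in range(steps):
        path.append(rng.choices(sectors, weights=P[path[-1]])[0])
    return path
```

Because each step is drawn from the row of the current sector alone, the walk carries no information about the order of earlier visits, which is exactly the limitation the text identifies.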

To highlight this mechanism we measured the return time distribution in the jump-time series (see Methods). In particular, we extracted the probability for an individual to return (for the first time) to the currently occupied sector after τ jumps. As shown in the inset of Fig. 5 (a), we found that the return time distribution reads

$$P(\tau) \sim \tau^{-\alpha},$$

with an exponent α ≈ 1.3. We used this information to construct a model which takes into account the higher re-visiting probability of recently explored locations. In this way we can capture the long-term scaling properties of movements. These asymptotic properties are precisely the ones that matter for issues such as epidemic spreading or traffic management.
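One way to extract first-return times from a daily position series is sketched below. It assumes that the jump-time series is the daily series with consecutive repeats collapsed, so that τ counts jumps rather than days; the exact procedure is given in the Methods, which are not reproduced here, and the function names are hypothetical.

```python
from collections import Counter

def jump_series(positions):
    """Collapse consecutive repeats so that each entry corresponds to one jump.

    Assumption: `positions` is non-empty.
    """
    series = [positions[0]]
    for p in positions[1:]:
        if p != series[-1]:
            series.append(p)
    return series

def first_return_times(series):
    """Number of jumps between each visit to a sector and the next visit to it."""
    last_seen = {}
    taus = []
    for k, sector in enumerate(series):
        if sector in last_seen:
            taus.append(k - last_seen[sector])
        last_seen[sector] = k
    return taus

def return_time_distribution(taus):
    """Empirical probability of each first-return time."""
    counts = Counter(taus)
    total = sum(counts.values())
    return {tau: c / total for tau, c in sorted(counts.items())}
```

Since two consecutive entries of a jump series always differ, the smallest possible return time under this definition is τ = 2.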