We first present results for a dataset corresponding to civil protests in Spain (15 M movement) that resonated on Twitter, in the period April–May 2011 (refs. 23, 24). The dataset was obtained from a predefined set of keywords relevant to the movement (section A and Appendix of the Supplementary Materials (SM) describe in depth all datasets used in the work). These data, taken in w-wide sliding windows, contain all the necessary information to build time-resolved bipartite networks –who said what, and when– suitably encoded as a rectangular, time-dependent matrix. Specifically, the Twitter stream is parsed and bipartite graphs –see Fig. 1a and 1b– are built up as follows: first, time windows are set to a fixed, arbitrary width w = t_2 – t_1. We then choose the n most active users and the m memes (hashtags) that those users produced within that time interval. This bipartite network is encoded in an n × m rectangular binary matrix, M_t, where t indicates the origin of the time window w and M_{u,h} = 1 if user u mentioned the hashtag h within the period spanning from t_1 to t_2 and zero otherwise. This procedure generates bipartite networks as time goes on by using a rolling-window scheme to evaluate the evolution of the system, such that each window overlaps the preceding one by φw (φ = 0.5 in the results reported here; for φ closer to 1.0 results come at higher resolution, whereas φ = 0.0 implies non-overlapping windows).
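The window and matrix construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the record layout `(time, user, hashtag)` and the function names `window_starts` and `bipartite_matrix` are hypothetical, and consecutive windows are assumed to be shifted by (1 − φ)w so that they overlap by φw.

```python
from collections import Counter

import numpy as np


def window_starts(t_start, t_end, w, phi=0.5):
    """Origins of w-wide windows; consecutive windows overlap by phi * w."""
    if not 0.0 <= phi < 1.0:
        raise ValueError("phi must lie in [0, 1)")
    step = w * (1.0 - phi)  # assumed shift between consecutive windows
    starts = []
    t = t_start
    while t + w <= t_end:
        starts.append(t)
        t += step
    return starts


def bipartite_matrix(records, t1, t2, n):
    """Binary user-by-hashtag matrix for records with t1 <= time < t2.

    records: iterable of (time, user, hashtag) tuples (hypothetical layout).
    Keeps the n most active users and every hashtag they used in the window.
    """
    in_window = [(u, h) for (t, u, h) in records if t1 <= t < t2]
    activity = Counter(u for u, _ in in_window)
    users = [u for u, _ in activity.most_common(n)]
    user_index = {u: i for i, u in enumerate(users)}
    hashtags = sorted({h for u, h in in_window if u in user_index})
    tag_index = {h: j for j, h in enumerate(hashtags)}
    M = np.zeros((len(users), len(hashtags)), dtype=int)
    for u, h in in_window:
        if u in user_index:
            M[user_index[u], tag_index[h]] = 1  # user u mentioned hashtag h
    return M, users, hashtags
```

With w = 24 and φ = 0.5 (times in hours), `window_starts(0, 48, 24)` yields windows starting every 12 hours, each sharing half of its span with the previous one.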

Once the networks associated with the 15 M social movement at different times are assembled, we proceed to analyze their structure focusing on two topological characteristics. As the interest is in inspecting whether groups of individuals using the same memes build up, we first look for the optimal modular partition of the nodes through a community detection analysis25,26, applying a simulated annealing heuristic to maximize Barber’s26 modularity Q.
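For reference, Barber's bipartite modularity of a given partition is commonly written as Q = (1/L) Σ_{u,h} (M_{u,h} − k_u d_h / L) δ(c_u, c_h), where L is the number of links, k_u and d_h are the user and hashtag degrees, and c denotes module labels. The sketch below only evaluates Q for a partition supplied by the caller; it does not perform the simulated annealing search, and the function name `barber_modularity` is an assumption made for illustration.

```python
import numpy as np


def barber_modularity(M, user_labels, tag_labels):
    """Barber's bipartite modularity Q for a given module assignment."""
    M = np.asarray(M, dtype=float)
    L = M.sum()  # total number of links
    if L == 0:
        raise ValueError("the network has no links")
    k = M.sum(axis=1)  # user degrees
    d = M.sum(axis=0)  # hashtag degrees
    # same[u, h] is True when user u and hashtag h share a module
    same = np.asarray(user_labels)[:, None] == np.asarray(tag_labels)[None, :]
    return float(((M - np.outer(k, d) / L) * same).sum() / L)
```

An optimizer would then search over label assignments to maximize this value.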

Next, we study whether nested patterns arise in the system. Here we evaluate nestedness following the findings by Bell et al.27,28 and further developed in Staniczenko et al.29, who showed that it is given by the maximum eigenvalue of the (n + m) × (n + m) adjacency matrix of the network, i.e. the square matrix counterpart of M_t. As shown in Fig. S2, our results are robust against other existing measures of nestedness (i.e., NODF30). For details on both Q and nestedness, see Materials and Methods, and Sections B and C in the SM.
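As an illustration of the spectral measure just described, the following sketch assembles the square (n + m) × (n + m) adjacency matrix from the rectangular matrix and returns its largest eigenvalue. The function name `spectral_nestedness` is hypothetical, and any normalization or null-model comparison described in the SM is left out.

```python
import numpy as np


def spectral_nestedness(M):
    """Largest eigenvalue of the square adjacency matrix built from M."""
    M = np.asarray(M, dtype=float)
    n, m = M.shape
    A = np.zeros((n + m, n + m))
    A[:n, n:] = M    # user -> hashtag block
    A[n:, :n] = M.T  # hashtag -> user block (symmetric counterpart)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    return float(np.linalg.eigvalsh(A)[-1])
```

Because the square matrix has this block structure, its largest eigenvalue coincides with the largest singular value of M, which offers a cheaper route for large networks.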

Figure 2 shows the results of the application of these structural analyses for the 15 M dataset and a window width of w = 1 day. If we focus on the days around which the main demonstrations happened (May 15th and onwards), we see that the network presents a highly nested profile. This alone is quite an interesting result, as it implies that when the activity around a certain topic peaks, the user-meme system is highly nested. Note that this scenario is more favorable for information diffusion than a predominantly modular topology, as in the latter architecture information flow can get stuck within modules and never reach the whole system. Thus, our findings contribute yet another example of commonalities between ecological, human15,31 and proto-cultural32 systems –for which we typically have static perspectives (but see refs. 33, 34, 35).

Figure 2: Modularity and nestedness bifurcate at the onset of system-wide attention. The central panel shows the evolution of modularity and nestedness, as standardised z-values. Remarkably, both metrics evolve in a coupled way up to the onset of the main protests (around May 15). At this point, modularity collapses, whereas nestedness continues growing towards its peak value coinciding with the political movement’s central dates –that of the largest demonstrations across the country (May 17–20th). Top panels represent four snapshots of the data –encoded as bipartite networks–, rows and columns are sorted in decreasing connectivity order (for an optimal visualization of nested patterns, if they exist). Similarly, lower panels represent the exact same matrices, where rows and columns are sorted module-wise (for an optimal visualization of the community structure).

Importantly, we can trace back in time the emergence of the final nested state by inspecting the structure of the matrix M_t at different times t. With few exceptions, from the very beginning of the observation time (April 25, 2011) the network exhibits significant modularity (z_Q > 1.96) and nestedness (z_λ > 1.96) values. This means that before the general onset of collective attention around the 15 M activity, the (proto-) topic is composed of a set of modules (Fig. 2, bottom left) which hardly interact with the rest of the system. At the same time, the structure of the network is nested (Fig. 2, top left). Both patterns exhibit a coupled growing trend (r = 0.7997) for some time, suggesting that discussion communities become clearer and more internally organized. This picture changes, however, as the movement gains momentum and consensus arises. Indeed, around the climax of the event (May 15–17) we observe an abrupt transition, i.e., nestedness keeps increasing as modularity collapses in a marked anti-correlated pattern (r = −0.7819). After this transition, the architecture of the network is radically different.
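The two statistics quoted above, standardised z-values and the correlation r between the modularity and nestedness series, can be computed as in the sketch below. The null ensemble against which z is evaluated is described in the SM and is not reproduced here, so `null_values` is simply assumed to hold the metric measured on randomized counterparts of a snapshot; the function names are hypothetical.

```python
import numpy as np


def z_score(observed, null_values):
    """Standardise an observed metric against values from a null ensemble."""
    null_values = np.asarray(null_values, dtype=float)
    sd = null_values.std(ddof=1)  # sample standard deviation of the ensemble
    if sd == 0:
        raise ValueError("null ensemble has zero variance")
    return float((observed - null_values.mean()) / sd)


def pearson_r(x, y):
    """Pearson correlation between two equally long time series."""
    return float(np.corrcoef(x, y)[0, 1])
```

A snapshot would count as significantly modular (or nested) at the level used in the text when the corresponding z-value exceeds 1.96.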

The compelling evidence of nested patterns provides a parsimonious explanation of how large amounts of activity can coexist with natural constraints to attention and memory. The user-meme network self-organizes towards a nested structure minimizing competition and facilitating the coexistence of individual participants36. Even when the network is predominantly modular, nestedness appears to have significant values well beyond random counterparts, which already indicates the existence of an incipient consensus around sub-topics. Moreover, the unraveled structural change to a highly nested-only architecture allows interpreting the evolution of the Spanish mobilization episodes as a build-up effort from segregation (scattered activists acting locally) to coordination (a global movement with a well-defined and shared main message).

Such interpretation in sociological terms can be quantitatively supported if we actually explore the survival conditions under which the topic can persist. To do so, we build a set of synthetic networks that purposefully present an almost perfect modular architecture, and an almost perfect nested structure (see section E.1 in the SM for details), mimicking the initial and “climax” state of the real system, April 25–30 and May 15–20 respectively. To each pair (equal size and equal link density) of these networks, we apply the mutualistic dynamics proposed by Bastolla et al.36 exploring a wide range of model parameters’ values (see Materials and Methods and Section E.2 in SM). The aim is to compare the persistence of these two distinct topologies when equilibrium is reached. The first noticeable finding shows that the nested architecture presents large areas in the parameter space for which the system largely survives, whereas the modular structure does not (Fig. 3a). In all the cases (see additional results in Figs S7 to S11) it is possible (and actually very frequent) to observe high persistence for the nested architecture whereas it is low for the modular one, but never the other way around. In this context, the persistence is defined as the survival of a hashtag or user once the system has become stable, while the survival rate represents the final diversity (i.e., number of users and hashtags in the steady state) relative to the initial collection. Then, the survival area represents the region with a survival rate greater than a given value (see Section E.2 of the SM). We systematically compare the survival areas for pairs of systems with different sizes and densities (Fig. 3b) and two remarkable facts stand out: first, nested architectures consistently out-survive modular ones. Second, the difference in survival areas increases with network size, being narrower for small system sizes. 
This latter finding suggests the reason why topic-centered bipartite networks in information systems exhibit a modular structure while they remain small-sized: the pressure for an architecture shift remains low, as the transition towards a nested topology does not yet present a critical advantage in terms of the survival of the topic. In other words, when a topic is emerging, and thus its user-meme network is small, it needs to reach a critical mass (here the size) and self-adapt to a nested architecture to increase the likelihood of topic’s survival.

Figure 3: Modular and nested architectures under mutualistic dynamics. Two synthetic networks with the same size (N = 1000) and density (ρ = 0.25), but different architecture (modular, nested), exhibit radically different outcomes when the mutualistic dynamical framework is applied to them via extensive numerical simulations. Left: Persistence as a function of the competition β and mutualism γ terms. For a wide range of parameters the modular network shows poor survival; conversely, the nested architecture performs as well as or better than its modular counterpart in any given region. Right: differences in the “survival areas” increase with size, which indicates that the pressure for an architectural shift (modular to nested) grows as new nodes (users and hashtags) join the system. Note that the x-axis in the right panels (“Persistence”) corresponds to the z-axis (color code) in the left panels. All results are averaged over 1000 realizations. Additional results for other sizes and densities can be found in Figs S7 to S11.

It is possible to get further insights into the microscopic mechanisms behind the modular-to-nested topological transition. As seen from Fig. 2, once the nested patterns begin to dominate the network structure –around the day when the movement fully develops–, nestedness remains at high levels for some time. This makes it possible to consistently track the set of users and memes that accumulate many interactions (generalists) and inspect whether these sets are time-independent. To this end, we identify which users and which memes form the core37,38 of the network at different times. The core can be thought of as the set of most generalist nodes (users and memes) in the network (see section F of the SM for further details). Figure 4 shows the resemblance to the “reference core”, D_RC, i.e. the similarity between a snapshot’s core (C_t) and the one extracted when the nestedness is maximal (C_max) (see section F in the SM for a definition). Notably, for both w = 12 h (top panel) and w = 3 days (bottom) there is a high turnover in users who occupy the core: in most snapshots t, only 0–10% of the users in C_max are also present in C_t, even when the network’s architecture has reached the nested stage. Instead, hashtags have a much more stable core –around 20% of C_max is shared during the entire observation window, and values above 50% are reached after the movement onset and beyond. These results suggest that it is the set of generalist memes, rather than the existence of generalist individuals, that carries the burden of the topic’s persistence in time. Indeed, it is less costly for a topic to rely on a set of hashtags –the passive elements of the system39–, as they are not subject to users’ limitations (sleep, attention focus, etc.), whereas users are highly volatile, entering and leaving the core rather intermittently. As shown in Fig. S4, these results are robust to different window widths.
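The percentages quoted above can be read as the fraction of the reference core that reappears in a snapshot's core. The sketch below computes that fraction; it is an assumption based on the prose description and may differ from the exact definition of D_RC given in section F of the SM. The function name `core_overlap` is hypothetical, and the cores are assumed to have been extracted beforehand.

```python
def core_overlap(core_t, core_max):
    """Fraction of the reference core C_max that is also present in C_t."""
    core_max = set(core_max)
    if not core_max:
        raise ValueError("reference core is empty")
    return len(set(core_t) & core_max) / len(core_max)
```

Applied separately to the user and hashtag sets of each snapshot, this gives two time series that can be compared in the way Figure 4 does.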

Figure 4: Topical consistence over time despite user turnover. We track the similarity D_RC of the generalist cores of the network in time (C_t) with respect to a fixed reference (the core of the network when the maximum of the nestedness is observed, C_max). Results for different w (12 h in the top panel; 3 days in the lower) show that only hashtags build a stable core, guaranteeing the semantic coherence of the topic across time; whereas the core of users suffers a high rate of turnover, indicating that users are frequently pushed to and from the periphery of the network.

Finally, to rule out the possibility that our results are specific to socio-political phenomena of the kind of the 15 M movement, we have analyzed an unfiltered dataset of Twitter traffic corresponding to tweets in the United Kingdom. As before, bipartite user-hashtag networks are built, but now we choose the subset comprising the top 1,024 most-active users and, independently, the subset of 1,024 most-used hashtags. Note that such independent sampling implies that the corresponding adjacency matrix could be empty –the most active users might not use the most popular hashtags. Results for this dataset show strongly fluctuating patterns for both modularity and nestedness when measured at large window widths (w > 3 h) –not resembling the more persistent, smoothly developed 15 M movement. This is not surprising, as most online topics that succeed in getting collective attention do not need days to brew and emerge; instead, they arise and decay at very fast time scales4,40. Figure 5 thus shows the results obtained for the UK dataset over a much shorter time scale (w = 1 h), revealing that collective attention around certain topics is reached when the network is maximally nested and minimally modular (with overall r = –0.7126). Here we do not observe coupled modularity-nestedness regimes (r > 0), as the incipient stages of a forming topic go unnoticed in the unfiltered stream. For example, a post hoc inspection of the unfiltered stream revealed the consolidation (but not the incipient stages) of the XLVIII Super Bowl topic, which started on February 3rd, 2013 at 12:30AM CET, showing the highest peak (lowest valley) in the nestedness (modularity) values in the studied period.