Here we outline a number of empirical findings that motivate both our question and the main assumptions behind our model. We then describe the proposed agent-based toy model of meme diffusion and compare its predictions with the empirical data. Finally we show that the social network structure and our finite attention are both key ingredients of the diffusion model, as their removal leads to results inconsistent with the empirical data.

We validate our model with data from Twitter, a micro-blogging platform that allows many millions of people to broadcast short messages through social connections. Users can “follow” interesting people, by which a directed social network is formed. Posts (“tweets”) appear on the screen of followers. People can forward (“retweet”) selected posts from their screen to their followers. Furthermore, users often mark their posts with topic labels (“hashtags”). Let us use these tags as operational proxies to identify memes. A retweet carries a meme from user to user. As a meme spreads in this way, it forms a cascade or diffusion network such as those illustrated in Fig. 1. We collected a sample of retweets that include one or more hashtags, produced by Twitter users over a specific period of time (see details in Methods section). This provides us with a quantitative framework to study the competition for attention in the wild.

Figure 1 Visualizations of meme diffusion networks for different topics. Nodes represent Twitter users and directed edges represent retweeted posts that carry the meme. The brightness of a node indicates the activity (number of retweets) of a user and the weight of an edge reflects the number of retweets between two users. (a) The #Japan meme shows how news about the March 2011 earthquake propagated. (b) The #GOP tag stands for the US Republican Party and, like many political memes, displays a strong polarization between people with opposing views. Memes related to the “Arab Spring” and in particular the 2011 uprisings in (c) #Egypt and (d) #Syria display characteristic hub users and strong connections, respectively.

Limited attention

We first explore the competition among memes. In particular, we test the hypothesis that the attention of a user is somewhat independent from the overall diversity of information discussed in a given period. Let us quantify the breadth of attention of a user through the Shannon entropy S = −Σ_i f(i) log f(i), where f(i) is the proportion of tweets generated by the user about meme i. Given a user who has posted n messages, her entropy can be as small as 0, if all of her posts are about the same meme; or as large as log n, if she has posted a message about each of n different memes. We can measure the diversity of the information available in the system analogously, defining f(i) as the proportion of tweets about meme i across all users. Note that these entropy-based measures are subject to the limits of our operational definition of a meme; finer or coarser definitions would yield different values.
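The entropy defined above can be computed directly from meme counts. The following is a minimal sketch, not the code used for the analysis: it assumes each post carries exactly one meme label, and the function names and the `posts_by_user` mapping are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(memes):
    """Return S = -sum_i f(i) log f(i) for an iterable of meme labels.

    f(i) is the fraction of posts about meme i; natural log is used.
    """
    counts = Counter(memes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log(1/p) equals -p * log(p) and avoids returning -0.0.
    return sum((c / total) * math.log(total / c) for c in counts.values())


def average_user_entropy(posts_by_user):
    """Mean breadth of attention over users (hypothetical input format:
    a dict mapping each user to the list of memes she posted)."""
    entropies = [shannon_entropy(m) for m in posts_by_user.values()]
    return sum(entropies) / len(entropies) if entropies else 0.0


def system_entropy(posts_by_user):
    """Entropy of the pooled meme distribution across all users."""
    pooled = [m for memes in posts_by_user.values() for m in memes]
    return shannon_entropy(pooled)
```

With this sketch, a user who posts only about one meme has entropy 0 and a user who posts once about each of n memes has entropy log n, matching the bounds stated above.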

In Fig. 2 we compare the daily values of the system entropy to the corresponding average user entropy. The key observation here is that a user's breadth of attention remains essentially constant irrespective of system diversity. This is a clear indication that the diversity of memes to which a user can pay attention is bounded. With the continuous injection of new memes, this indirectly suggests that memes survive at the expense of others. We explicitly assume this in the information diffusion model presented later.

Figure 2 Plot of daily system entropy (solid red line) and average user breadth of attention (dashed blue line). Days in our observation period are ranked from low to high system entropy, therefore the latter is monotonically increasing.

User interests

It has been suggested that topical interests affect user behavior in social media29,30. This is a potentially important ingredient in a model of meme diffusion, as an interesting meme may have a competitive advantage. Therefore we wish to explore whether user interests, as inferred from past behavior, are predictive of future behavior.

Let us consider every user in our dataset and any retweets they produce. When a user u emits a new retweet, we define her interests I_u as the set of all memes about which she has tweeted up to that moment. We also collect the set M_0 of memes associated with the new retweet. The n most recent posts across all users prior to the new retweet are considered as a set of potential candidates that might have been retweeted, but were not. The corresponding sets of memes M_1, M_2, …, M_n are recorded (n = 10). We compute the similarity sim(M_0, I_u), sim(M_1, I_u), …, sim(M_n, I_u) between the user interests and the actual and candidate posts and recover the conditional probability P(retweet(u, M) | sim(M, I_u)) that u retweets a post with memes M given the similarity between the memes and her user interests. We turn to the Maximum Information Path similarity measure31,32 that considers shared memes but discounts the more common ones:

sim(M, I_u) = 2 log[min_{x ∈ M ∩ I_u} f(x)] / ( log[min_{x ∈ M} f(x)] + log[min_{x ∈ I_u} f(x)] )

where x is a meme and f(x) the proportion of messages about x.
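To make the measure concrete, here is a hedged sketch of a Maximum Information Path style similarity. It assumes the measure takes the form 2 log[min f over shared memes] divided by the sum of log[min f over M] and log[min f over I_u]; the function name, the `freq` dictionary and the handling of the two edge cases are illustrative choices rather than part of the original analysis.

```python
import math


def mip_similarity(memes, interests, freq):
    """Similarity between a post's meme set and a user's interest set.

    Assumed form:
        2 * log(min f over shared) /
        (log(min f over memes) + log(min f over interests))
    `freq` maps each meme to its proportion of messages, 0 < f(x) <= 1.
    """
    shared = memes & interests
    if not shared:
        return 0.0  # convention chosen here: no shared meme, no similarity
    numerator = 2.0 * math.log(min(freq[x] for x in shared))
    denominator = (math.log(min(freq[x] for x in memes))
                   + math.log(min(freq[x] for x in interests)))
    if denominator == 0.0:
        # Degenerate case (every meme has f = 1); treated as 0 by convention.
        return 0.0
    return numerator / denominator
```

Because rare memes have small f(x) and therefore a large |log f(x)|, sharing a rare meme contributes more to the similarity than sharing a common one, which is the discounting described above.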

Fig. 3 shows that users are more likely to retweet memes about which they posted in the past (Pearson correlation coefficient ρ = 0.98). This suggests that memory is an important ingredient for a model of meme competition and we explicitly take this aspect into account in the model presented below.

Figure 3 Relationship between the probability of retweeting a message and its similarity to the user interests, inferred from prior posting behavior.

Empirical regularities

In Fig. 4 we observe several regularities in the empirical data. We first consider meme lifetime, defined as the maximum number of consecutive time units in which posts about the meme are observed; meme popularity, defined as the number of users per day who tweet about a meme, measured over a given time period; and user activity, defined as the number of messages per day posted by a user, measured over a time period. These three quantities all display long-tailed distributions (Fig. 4(a,b,c)). The excellent collapse of the curves demonstrates that the distributions are robust even if measured over different time units or observed over different periods of time. We further measure the breadth of user attention, defined earlier through the meme entropy. Although the entropy distribution is peaked, some users have broad attention while others are very focused (Fig. 4(d)). This distribution is also robust with respect to different periods of time.
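The lifetime definition above (the longest run of consecutive time units in which a meme is observed) can be sketched as follows. This is an illustration under stated assumptions: timestamps are given in seconds, time units are aligned to multiples of `unit_seconds`, and the function name is hypothetical.

```python
def meme_lifetime(timestamps, unit_seconds):
    """Longest run of consecutive time units containing at least one post.

    `timestamps` are post times in seconds (assumed); `unit_seconds` is the
    length of the time unit, e.g. 3600 for hours or 86400 for days.
    """
    # Map every post to the index of the time unit it falls in.
    bins = sorted({int(t // unit_seconds) for t in timestamps})
    if not bins:
        return 0
    best = run = 1
    for prev, cur in zip(bins, bins[1:]):
        # Extend the run if the next occupied unit is adjacent, else restart.
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best
```

The same posts can yield different lifetimes under different units: posts at hours 0, 1, 2 and 5 give a lifetime of 3 hours but a lifetime of 1 day, which is why the distributions are compared across time units.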

Figure 4 Empirical regularities in Twitter data. (a) Probability distribution of the lifetime of a meme using hours (red circles), days (blue squares) and weeks (green triangles) as time units. In the plot, units are converted into hours. Since the distributions are well approximated by a power law, we can align the curves by rescaling the y-axis by λ^(−α), where λ is the ratio of the time units (e.g., λ = 24 for rescaling days into hours) and α ≈ 2.5 is the exponent of the power law (via maximum likelihood estimation33). This demonstrates that the shape of the lifetime distribution is not an artifact of the time unit chosen to define the lifetime. (b) Complementary cumulative probability distribution of the popularity of a meme, measured by the total number of users per day who have used that meme. This and the following measures were performed daily (filled red circles), weekly (filled blue squares) and monthly (filled green triangles). (c) Complementary cumulative probability distribution of user activity, measured by the number of messages per day posted by a user. (d) Probability distribution of breadth of user attention (entropy), based on the memes tweeted by a user. Note that the larger the number of posts produced, the smaller the non-zero entropy values recorded for users who focus on a small set of memes. This explains why the distributions for longer periods of time extend further to the left.

All of these empirical findings point to extremely heterogeneous behaviors; some memes are extremely successful (popular and persistent), while the great majority die quickly. A small fraction of memes therefore account for the great majority of all posts. Likewise, a small fraction of users account for most of the traffic. These heterogeneities can in principle be attributed to a variety of causes. The broad distributions of meme popularity could result from a diversity in some intrinsic meme value, with “important” memes attracting more attention. Long-lived memes might be sustained exogenously by traditional media and real-world events. User activity and breadth of attention distributions could be a reflection of innate behavioral differences. What is, then, a minimal set of assumptions necessary to interpret this empirical data? One way to tackle this question is to start from a minimalist model of information spreading that assumes none of the above externalities. In particular we will explore to what extent the statistical features of memes and users can be accounted for by the limited attention capacity of the users coupled with the heterogeneity of their social connections.

Model description

Our basic model assumes a frozen network of agents. An agent maintains a time-ordered list of posts, each about a specific meme. Multiple posts may be about the same meme. Users pay attention to these memes only. Asynchronously and with uniform probability, each agent can generate a post about a new meme or forward some of the posts from the list, transmitting the corresponding memes to neighboring agents. Neighbors in turn pay attention to a newly received meme by placing it at the top of their lists. To account for the empirical observation that past behavior affects what memes the user will spread in the future, we include a memory mechanism that allows agents to develop endogenous interests and focus. Finally, we model limited attention by allowing posts to survive in an agent's list or memory only for a finite amount of time. When a post is forgotten, its associated meme becomes less represented. A meme is forgotten when the last post carrying that meme disappears from the user's list or memory. Note that list and memory work like first-in-first-out queues, rather than like the priority queues proposed in models of bursty human activity34. In the context of single-agent behavior, our memory mechanism is reminiscent of the classic Yule-Simon model43,44.

The retweet model we propose is illustrated in Fig. 5. Agents interact on a directed social network of friends/followers. Each user node is equipped with a screen where received memes are recorded and a memory with records of posted memes. An edge from a friend to a follower indicates that the friend's memes can be read on the follower's screen (#x and #y in Fig. 5(a) appear on the screen in Fig. 5(b)). At each step, an agent is selected randomly to post memes to neighbors. The agent may post about a new meme with probability p_n (#z in Fig. 5(b)). The posted meme immediately appears at the top of the memory. Otherwise, the agent reads posts about existing memes from the screen. Each post may attract the user's attention with probability p_r (the user pays attention to #x, #y in Fig. 5(c)). Then the agent either retweets the post (#x in Fig. 5(c)) with probability 1 − p_m, or tweets about a meme chosen from memory (#v triggered by #y in Fig. 5(c)) with probability p_m. Any post in memory has an equal chance of being selected, therefore memes that appear more frequently in memory are more likely to be propagated (the memory has two posts about #v in Fig. 5(d)). To model limited user attention, both screen and memory have a finite capacity, defined as the time for which a post remains in an agent's screen or memory. For all agents, posts are removed after one time unit, which simulates a unit of real time and corresponds to N_u steps, where N_u is the number of agents. If people use the system once weekly on average, the time unit corresponds to a week.
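The update rule described above can be sketched in a few lines. This is an illustrative reading of the model, not the authors' implementation. Several details are assumptions: the `followers` dictionary format (every agent must appear as a key), falling back to a retweet when memory is empty, letting posts made during a scan enter memory immediately, and expressing the retention window as `t_w` time units of N_u steps each.

```python
import random
from collections import deque
from itertools import count


def simulate(followers, steps, p_n, p_r, p_m, t_w=1.0, seed=0):
    """Run the meme diffusion sketch; return a log of (step, agent, meme)."""
    rng = random.Random(seed)
    agents = list(followers)
    window = t_w * len(agents)            # post retention, in steps
    screen = {a: deque() for a in agents}  # received posts: (step, meme)
    memory = {a: deque() for a in agents}  # own posts: (step, meme)
    new_meme = count()                     # fresh integer ids for new memes
    log = []

    def expire(queue, now):
        # First-in-first-out forgetting: drop posts older than the window.
        while queue and now - queue[0][0] >= window:
            queue.popleft()

    def post(agent, meme, now):
        memory[agent].append((now, meme))
        for follower in followers[agent]:
            screen[follower].append((now, meme))
        log.append((now, agent, meme))

    for now in range(steps):
        agent = rng.choice(agents)         # uniform asynchronous update
        expire(screen[agent], now)
        expire(memory[agent], now)
        if rng.random() < p_n:             # inject a brand new meme
            post(agent, next(new_meme), now)
            continue
        for _, meme in list(screen[agent]):  # otherwise scan the screen
            if rng.random() >= p_r:
                continue                   # this post attracted no attention
            if memory[agent] and rng.random() < p_m:
                # A post drawn uniformly from memory is tweeted instead.
                _, meme = rng.choice(memory[agent])
            post(agent, meme, now)
    return log
```

Expiry is applied lazily when an agent is selected; since a screen or memory is only read at that moment, this is equivalent to removing posts as soon as they exceed the window. Drawing uniformly over posts in memory, rather than over distinct memes, is what makes frequently posted memes more likely to be propagated.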

Figure 5 Illustration of the meme diffusion model. Each user has a memory and a screen, both with limited size. (a) Memes are propagated along follower links. (b) The memes received by a user appear on the screen. With probability p_n, the user posts a new meme, which is stored in memory. (c) Otherwise, with probability 1 − p_n, the user scans the screen. Each meme x in the screen catches the user's attention with probability p_r. Then with probability p_m a random meme from memory is triggered, or x is retweeted with probability 1 − p_m. (d) All memes posted by the user are also stored in memory.

Simulation results

The model has three parameters: p_n regulates the amount of novelty that enters the system (number of cascades), p_r determines the overall retweet activity (size of cascades) and p_m accounts for individual focus (diversity of user interests). We estimated all three directly from the empirical data (see Methods).

The social network underlying the meme diffusion process is a critical component of the model. To obtain a network of manageable size while preserving the structure of the actual social network, we sampled a directed graph with 10^5 nodes from the Twitter follower network (details in Methods). The nodes correspond to a subset of the users who generated the posts in our empirical data. To evaluate the predictions of our model, we compare them with empirical data that includes only the retweets of the same subset of users. To study the role played by the network structure in the meme diffusion process, we also simulated the model on a random Erdős-Rényi (ER) network with the same number of nodes and edges. As shown in Fig. 6, the model captures the main features of the empirical distributions of meme lifetime and popularity, user activity and breadth of user attention. The comparison with the corresponding distributions generated using the ER network shows that in general, the heterogeneity of the observed quantities is greatly reduced when memes spread on a random network. This is not unexpected. Consider for example meme popularity (Fig. 6(b)); the real social network has a broad (scale-free, not shown) distribution of degree, with a considerable number of hub users who have a large number of followers. Memes spread by these users are likely to achieve greater popularity. This does not happen in the ER network, where the degree distribution is narrow (Poissonian). The difference observed in the distribution of breadth of user attention, for both low and high entropy values (Fig. 6(d)), may be explained by the heterogeneity in the number of friends. Users with few friends may have low breadth of attention, while those with many friends are exposed to many memes and thus may exhibit greater entropy.
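A random baseline with a matching number of nodes and edges can be generated as sketched below. This is a minimal illustration that assumes the fixed-edge-count variant of the ER model (each directed edge drawn uniformly without replacement, no self-loops); the function name and the follower-list output format are hypothetical and not taken from the Methods.

```python
import random


def random_directed_graph(n_nodes, n_edges, seed=0):
    """Return {node: [followers]} for a uniform random directed graph.

    Exactly `n_edges` distinct directed edges are placed among `n_nodes`
    nodes, excluding self-loops. Rejection sampling is used, which is
    efficient only when the graph is sparse.
    """
    if n_edges > n_nodes * (n_nodes - 1):
        raise ValueError("more edges requested than distinct node pairs")
    rng = random.Random(seed)
    edges = set()
    while len(edges) < n_edges:
        u = rng.randrange(n_nodes)
        v = rng.randrange(n_nodes)
        if u != v:
            edges.add((u, v))   # duplicates are absorbed by the set
    followers = {u: [] for u in range(n_nodes)}
    for u, v in sorted(edges):
        followers[u].append(v)
    return followers
```

In such a graph every node has roughly the same expected number of followers, so there are no hubs, which is the structural contrast with the sampled follower network discussed above.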

Figure 6 Evaluation of model by comparison of simulations with empirical data (same panels and symbols as in Fig. 4). To study the role played by the network structure in the meme diffusion process, we simulate the model on the sampled follower network (solid black line) and a random network (dashed red line). Both networks have 10^5 nodes and about 3 × 10^6 edges. (a) The definition of lifetime uses the week as time unit. (b,c,d) Meme popularity, user activity and user entropy data are based on weekly measures.

The second key ingredient of our model is the competition among memes for limited user attention. To evaluate the role of this competition in the meme diffusion process, we simulated variations of the model with stronger or weaker competition. This was accomplished by tuning the length t_w of the time window in which posts are retained in an agent's screen or memory. A shorter time window (t_w < 1) leads to less attention and thus increased competition, while a longer time window (t_w > 1) allows for attention to more memes and thus less competition. As we can observe in Fig. 7, stronger competition (t_w = 0.1) fails to reproduce the large observed number of long-lived memes (Fig. 7(a)). Weaker competition (t_w = 5), on the other hand, cannot generate extremely popular memes (Fig. 7(b)) nor extremely active users (Fig. 7(c)).

Figure 7 Evaluation of model by comparison of simulations with empirical data (same panels and symbols as in Fig. 4). To study the role of meme competition, we simulate the model on the sampled follower network with different levels of competition; posts are removed from screen and memory after t_w time units. We compare the standard model (t_w = 1, solid black line) against versions with less competition (t_w = 5, dot-dashed magenta line) and more competition (t_w = 0.1, dashed red line). (a) The definition of lifetime uses the week as time unit. (b,c,d) Meme popularity, user activity and user entropy data are based on weekly measures.