Pokémon Red is a fairly complex game, with puzzles, quests and a turn-based combat system underlying the whole experience. Moreover, the introduction of the anarchy/democracy modes completely reshaped the dynamics of the game. As such, it is not possible to properly study the behavior of the crowd over the whole game without taking into account the particular challenge it faced at each moment. For this reason, in the following we focus on two particular aspects of the game. In the first subsection, we study the event known as “the ledge”. This is the name the crowd gave to a small area that is extremely easy for a single player to finish but that represented a hard challenge for the crowd, which needed 15 hours to get through it. Furthermore, in this area the optimal strategy is quite clear. These two factors, namely a clear strategy and the long time the crowd was stuck there, added to the fact that this took place before the introduction of the anarchy/democracy modes, make this area a great case study of the collective behavior of the crowd at short timescales. Then, in the second subsection, we analyze the political movements that evolved within the game once the anarchy/democracy modes were introduced, which gives us information about the behavior of the crowd at longer timescales.

The ledge

On the third day of the game, the character arrived at the area depicted in Fig. 3 (note that the democracy/anarchy system had not been introduced yet). Each node of the graph represents a tile of the game. The character starts on the yellow node on the left part of the network and has to exit through the right part, an event that we define as reaching one of the yellow nodes on the right. The path is simple for an average player, but it represented a challenge for the crowd due to the presence of the red nodes. These nodes represent ledges, which can only be traversed going downwards, so they effectively work as a filter that allows flux only downwards. Thus, one good step will not cancel a bad step, as the character will be trapped below the ledge and will have to find a different path to go up again. For this reason, this particular region is highly vulnerable to actions deviating from the norm, either caused by mistake or performed intentionally by griefers, i.e., individuals whose only purpose is to annoy other players and who do so by using the mechanisms provided by the game itself [38, 39] (note that in social contexts these individuals are usually called trolls [40]). Indeed, there are paths (see blue nodes in Fig. 3) where only the command right is needed and which run next to a ledge, so that the command down, which is not needed at all, forces the crowd to go back and start the path again. Additionally, the lag described in Sect. 2 made this task even more difficult.

Figure 3 Network representation of the ledge area. It is possible to go from a node to the ones surrounding it using the commands up, right, down and left. The only exceptions are the red nodes, which are ledges. If the character tries to step on one of those nodes it is automatically sent to the node right below it, a characteristic that is represented by the curved links connecting nodes above and below ledges. Yellow nodes mark the entrance and exit of the area and blue nodes highlight the most difficult part of the path. Note that, as the original map was composed of square tiles, this network representation is not an approximation but the exact shape of the area
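The one-way behavior of the ledges can be illustrated with a minimal sketch. The grid below is a hypothetical toy layout, not the real map, and the rule that a ledge is only entered with the command down from the tile above it (and blocks the character otherwise) is our assumption about how the curved links of Fig. 3 work.

```python
# Toy grid, NOT the real map: '.' is a walkable tile, 'v' is a ledge tile.
GRID = [
    "....",
    ".vv.",
    "....",
]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(pos, command):
    """Return the tile reached from pos = (row, col) after one command."""
    row, col = pos
    d_row, d_col = MOVES[command]
    new_row, new_col = row + d_row, col + d_col
    # Leaving the grid is treated as hitting a wall: the character stays put.
    if not (0 <= new_row < len(GRID) and 0 <= new_col < len(GRID[0])):
        return pos
    if GRID[new_row][new_col] == "v":
        # Assumption: a ledge is only crossed downwards, and the character
        # lands on the tile right below it; any other approach is blocked.
        if command != "down":
            return pos
        return (new_row + 1, new_col)
    return (new_row, new_col)
```

In this sketch a single down next to a ledge moves the character two rows at once, and no sequence of up commands through the same column can undo it, which is the asymmetry that makes the area fragile.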

In Fig. 4(a) we show the time evolution of the number of messages containing each command (the values have been normalized to the total number of commands sent each minute) from the beginning of this part until the crowd finally exited. First, we notice that it took the crowd over 15 hours to finish an area that can be completed by an optimal walk in less than 2 minutes. Then, we can clearly see a pattern from 2d18h30m to the first time they were able to reach the nodes located right after the blue ones, approximately 3d01h10m: when the number of rights is high the number of lefts is low. This is a signature of the character trying to go through the blue nodes by going right, falling down the ledge, and going left to start over. Once they finally reached the nodes after the blue path (first arrival), they had to fight a trainer controlled by the game. They lost this combat and, as a consequence, the character was transported outside of the area, so they had to enter and start again from the beginning. Again, we can see a similar left-right pattern until they got over the blue path for the second time, which in this case was definitive.

Figure 4 Study of the Ledge event. (a) Time evolution of the fraction of commands sent each minute. Note that a single player should be able to finish this area in a few minutes, but the crowd needed 15 hours. The time series has been smoothed using moving averages. (b) Hierarchical clustering of the time series of each group of users (see main text for details). (c) Left: Mean time needed to exit the area according to our simulations as a function of the fraction of griefers in the system and the noise in it. Right: 1% quantile of the time needed to exit the area, note that the y axis is given in minutes instead of hours
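The preprocessing behind panel (a) can be sketched as follows. The input layout (one dictionary of command counts per minute), the function names and the trailing window are assumptions made for illustration; the window length used for the figure is not stated here.

```python
def command_fractions(counts_per_minute):
    """Normalize each minute's command counts by that minute's total."""
    fractions = []
    for counts in counts_per_minute:
        total = sum(counts.values())
        if total:
            fractions.append({cmd: c / total for cmd, c in counts.items()})
        else:
            fractions.append({})  # no commands were sent in this minute
    return fractions


def moving_average(series, window):
    """Trailing moving average; early points use the samples available."""
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

Applying `moving_average` to the per-minute fraction of, say, right and left separately would give curves of the kind shown in Fig. 4(a).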

The ledge is a great case study of the behavior of the crowd because the mechanics needed to complete it are very simple (just moving from one point to another), which facilitates the analysis. At the same time, it took the players much longer to finish this area than would be expected for a single player. To address these features, we propose a model aimed at mimicking the behavior of the crowd. Specifically, we consider an nth-order Markov chain in which the probability of going from state \(x_{m}\) to \(x_{m+1}\) depends only on the state \(x_{m-n}\), thus accounting for the effect of the lag on the dynamics. Furthermore, the probabilities of going from one state to another are set according to the behavior of the players in the crowd.
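The effect of conditioning on a stale state can be seen in a deliberately simplified sketch: a one-dimensional walk in which the move at step m is chosen from the position at step m - n. The corridor, the function name `lagged_walk` and the deterministic rule are assumptions for illustration and are not the model used in this work.

```python
def lagged_walk(target, lag, steps):
    """1D walk where the move at step m is chosen from the position at m - lag."""
    positions = [0]
    for m in range(steps):
        # The players react to what they saw `lag` steps ago, not to the
        # current position (before that, they only know the start).
        observed = positions[max(0, m - lag)]
        if observed < target:
            move = 1
        elif observed > target:
            move = -1
        else:
            move = 0
        positions.append(positions[-1] + move)
    return positions
```

With `lag=0` the walker stops exactly at the target, whereas with `lag=2` and `target=3` it overshoots to position 5 before turning back, because the last two "go right" decisions were taken on outdated information.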

To define these probabilities, we first classify the players in groups according to the total number of commands they sent in this period: G1, users with 1 or 2 commands (46% of the users); G2, 3 or 4 commands (18%); G3, between 5 and 7 commands (13%); G4, between 8 and 14 commands (12%); G5, between 15 and 25 commands (6%); and G6, more than 25 commands (5%). These groups were defined so that the total number of messages sent by the first three is close to 50,000 and that sent by the other three is close to 100,000. Interestingly, the time series of the inputs of each of these groups are very similar (see Additional files 1–7). Actually, if we remove the labels of the 42 time series and cluster them using the Euclidean distance, we obtain 7 clusters, one for each command. Moreover, the time series of each of the commands are clustered together, Fig. 4(b). In other words, the behavior of users with medium and large activity is not only similar across these groups, but also equivalent to the one obtained by aggregating the users who only sent 1 or 2 commands. This allows us to infer the behavior of the whole crowd by looking at the behavior of the most active players, group 6.
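The grouping by activity can be written down directly from the boundaries listed above. The function names and the input layout (a mapping from a user identifier to the number of commands sent) are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper bounds of G1..G5 as listed in the text; G6 is above 25.
UPPER_BOUNDS = [2, 4, 7, 14, 25]


def activity_group(n_commands):
    """Map a user's number of commands in the period to its group label."""
    if n_commands < 1:
        raise ValueError("a user in the log has sent at least one command")
    return "G" + str(bisect_left(UPPER_BOUNDS, n_commands) + 1)


def group_shares(commands_per_user):
    """Fraction of users in each group, given {user: number of commands}."""
    counts = Counter(activity_group(n) for n in commands_per_user.values())
    n_users = len(commands_per_user)
    return {group: c / n_users for group, c in counts.items()}
```

For example, `activity_group(7)` gives `"G3"` and `activity_group(26)` gives `"G6"`.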

In our Markovian model, if we set the probabilities so that the next state in the transition is always the one that gets the character closer to the exit but with 25 seconds of delay (that is, the probability of going from state \(x_{m}\) to \(x_{m+1}\) is the probability of going from \(x_{m-n}\) to the state which would follow the optimal path at \(x_{m-n+1}\)), the system gets stuck in a loop and is never able to reach the exit. However, that would require all players to be sending exactly the same command at the same time, something that is neither seen in the data nor expected in an (uncontrolled) crowd. Thus, we consider that at each time step there are 100 users with different behaviors introducing commands. In particular, we consider variable quantities of noisy users, who play completely at random; griefers, who only press down to annoy the rest of the crowd; and the herd, who always send the optimal command to get to the exit. The results, Fig. 4(c), show that the addition of noise to the herd breaks the loops and allows the crowd to get to the exit (similar results are obtained for either 20 or 30 seconds of lag, see Additional file 8). In particular, for the case with no griefers we find that with 1 percent of users adding noise to the input the mean time needed to finish this part is almost 3,000 hours. However, as we increase the noise, this time is quickly reduced, with an optimal noise level of around 40% of the crowd. Conversely, the introduction of griefers in the model, as expected, increases the time needed to finish this part in most cases. Interestingly, though, for low values of the noise, the addition of griefers can actually be beneficial for the crowd. Indeed, by breaking the herding effect, these players unintentionally help the crowd to reach its goal.
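A toy version of this mixture of behaviors is sketched below. Everything about the geometry is an assumption made for illustration: a corridor of `length` tiles running along a ledge, where right at the last tile or down anywhere sends the character back to the start, and up at the last tile is the exit. We also assume that each step executes the command of one user picked uniformly at random. The sketch is not the simulation behind Fig. 4(c) and is not expected to reproduce its values; it only shows that a purely lagged herd can loop forever while some noise lets the crowd out.

```python
import random


def simulate(n_noise, n_griefers, n_users=100, length=5, lag=3,
             max_steps=100_000, seed=0):
    """Return the number of steps needed to exit, or None if never reached."""
    rng = random.Random(seed)
    history = [0]  # positions x_0, x_1, ... along the toy corridor
    for m in range(max_steps):
        pos = history[-1]
        observed = history[max(0, m - lag)]  # stale state the herd reacts to
        user = rng.randrange(n_users)  # assumption: one random user per step
        if user < n_noise:
            command = rng.choice(("up", "down", "left", "right"))
        elif user < n_noise + n_griefers:
            command = "down"  # griefers only press down
        else:
            # Herd: optimal command for the (outdated) observed position.
            command = "up" if observed == length else "right"
        if command == "up":
            if pos == length:
                return m + 1  # exit reached
        elif command == "right":
            pos = pos + 1 if pos < length else 0  # overshooting falls off
        elif command == "left":
            pos = max(0, pos - 1)
        else:
            pos = 0  # down: fall off the ledge and restart
        history.append(pos)
    return None
```

With the default parameters, `simulate(0, 0)` never exits: the herd keeps pressing right on outdated information, falls, and sends its up command when the character is no longer at the end of the corridor. Adding noisy users, for instance `simulate(40, 0)`, breaks this cycle.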

Whether the individuals categorized as “noise” were producing it unintentionally or doing it on purpose to disentangle the crowd (an unknown fraction of users were aware of the effects of the lag and tried to disentangle the system [41]) is something we cannot analyze because, unfortunately, the resolution of the chat log in this area is in minutes and not in seconds. We can, however, approximate the fraction of griefers in the system thanks to the special characteristics of this area. Indeed, as most of the time the command down is not needed (on the contrary, it would destroy all progress), we can categorize those players with an abnormal number of downs as griefers. To do so, we take the users that belong to G6 (the most active ones) and compare the fraction of their inputs that corresponds to down across users. We find that 7% have a behavior that could be categorized as an outlier (the fraction of their input corresponding to down is higher than 1.5 times the interquartile range). More restrictively, for 1% of the players the command down represents more than half of their inputs. Both values are compatible with the observed time according to our model, even more so if we take into account that the model is more restrictive, as we consider that griefers continuously press down (not only near the blue nodes). Thus, we conclude that users deviating from the norm, regardless of being griefers, noise or even very smart individuals, were the ones that made finishing this part possible.
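The two criteria can be sketched as follows. We assume here that the outlier rule is the usual box-plot fence (upper quartile plus 1.5 times the interquartile range) and that quartiles are computed with the default method of Python's `statistics.quantiles`; both are assumptions, as are the function names and the input layout.

```python
from statistics import quantiles


def down_fractions(commands_by_user):
    """Fraction of each user's commands that are 'down'."""
    return {
        user: cmds.count("down") / len(cmds)
        for user, cmds in commands_by_user.items()
        if cmds
    }


def flag_griefers(fractions):
    """Return (outliers, majority_down) as two sets of user identifiers."""
    q1, _, q3 = quantiles(list(fractions.values()), n=4)
    # Assumption: box-plot rule, i.e. above the upper quartile by more than
    # 1.5 times the interquartile range.
    fence = q3 + 1.5 * (q3 - q1)
    outliers = {user for user, f in fractions.items() if f > fence}
    # Stricter criterion: down is more than half of the user's inputs.
    majority_down = {user for user, f in fractions.items() if f > 0.5}
    return outliers, majority_down
```

Both functions would be applied to the users of G6 only, since a fraction computed from one or two commands carries little information.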

Anarchy vs. democracy

As already described, on the sixth day of the game the input system was modified. This resulted in the start9 riot that led to the introduction of the anarchy/democracy system. From this time on, if the fraction of users sending democracy, out of the total number of players sending the commands anarchy or democracy, went over 0.75 (later modified to 0.80), the game would enter democracy mode and commands would be tallied up for 5 seconds. Then, the meter had to go below 0.25 (later modified to 0.50) for the game to enter anarchy mode again. Note that these thresholds were set by the creator of the experiment.
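Because the two thresholds differ, the mode depends on the history of the meter and not only on its current value. A minimal sketch of this switching rule, using the initial thresholds of 0.75 and 0.25, is given below; the function names and the choice of anarchy as the starting mode are assumptions.

```python
def meter(n_democracy, n_anarchy):
    """Fraction of democracy votes among all anarchy/democracy votes."""
    return n_democracy / (n_democracy + n_anarchy)


def mode_sequence(meter_values, enter_democracy=0.75, enter_anarchy=0.25,
                  start="anarchy"):
    """Game mode after each meter reading, with two-threshold switching."""
    mode = start
    modes = []
    for value in meter_values:
        if mode == "anarchy" and value > enter_democracy:
            mode = "democracy"
        elif mode == "democracy" and value < enter_anarchy:
            mode = "anarchy"
        # Between the two thresholds the current mode is simply kept.
        modes.append(mode)
    return modes
```

For the readings 0.5, 0.8, 0.6, 0.3, 0.2, 0.5 the game enters democracy at 0.8, stays there at 0.6 and 0.3, and only returns to anarchy at 0.2.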

The introduction of the voting system was mainly motivated by a puzzle where the crowd had been stuck for over 20 hours with no progress. Nonetheless, even in democracy mode, progress was difficult, as it was necessary both to retain control of the game mode and to take the lag into account when deciding which action to take. In fact, the tug-of-war system was introduced in the middle of day 5, yet the puzzle was not fully completed until the beginning of day 6, over 40 hours after the crowd had originally arrived at the puzzle. One of the reasons why it took so long to finish it even after the introduction of the voting system is that it was very difficult to enter democracy mode. Democracy was only “allowed” by the crowd when they were right in front of the puzzle, and they would go back to anarchy mode quickly after finishing it. Similarly, the rest of the game was mainly played under anarchy mode. Interestingly, though, we find that there were more “democrats” in the crowd (players who only voted for democracy) than “anarchists” (players who only voted for anarchy). Out of nearly 400,000 players who participated in the tug-of-war throughout the game, 54% were democrats, 28% were anarchists and 18% voted at least once for each of them. Therefore, the introduction of this new system not only split the crowd into two polarized groups with, as we shall see, their own norms and behaviors, but also created nontrivial dynamics between them.

To explore the dynamics of these two groups, we next compare two different days: day 6 and day 8. Day 6 was the second day after the introduction of the anarchy/democracy dynamics, and there were no extremely difficult puzzles or similar areas where democracy might have been needed. On the other hand, day 8 was the day when the crowd arrived at the safari zone, which certainly needed democracy mode since the available number of steps in this area is limited (see description of Additional file 9). We must note that, contrary to what we observed in Sect. 3.1, in this case commands coming from low activity users are not equivalent to the ones coming from high activity users. In particular, low activity users tend to vote much more for democracy (see Additional files 9 and 10). As such, it would not be adequate to remove them from the analysis. Our results are summarized in Fig. 5.

Figure 5 Politics of the crowd. Days 6 (top) and 8 (bottom). In every plot the gray color represents when the game was played under anarchy rules and the blue color when it was played under democracy rules. The polar plots represent the evolution of the fraction of votes corresponding to anarchy/democracy while distinguishing if the user previously voted for anarchy or democracy: first quadrant, votes for anarchy coming from users who previously voted for anarchy (\({A} \rightarrow {A}\)); second quadrant, votes for democracy coming from anarchy (\({A}\rightarrow {D}\)); third quadrant, votes for democracy coming from democracy (\({D}\rightarrow {D}\)); fourth quadrant, votes for anarchy coming from democracy (\({D}\rightarrow {A}\)). In the other plots we show the evolution of the total number of votes for anarchy or democracy as a function of time normalized by its maximum value (pink) as well as the position of the tug-of-war meter (black). When the meter goes above 0.75 the system enters into democracy mode (blue) until it reaches 0.25 (these thresholds were later changed to 0.80 and 0.50 respectively) when it enters into anarchy mode (gray) again. The gap in the pink curve of panel (D) is due to the lack of data in that period (see Availability of data and materials)

One of the most characteristic features of groups is their polarization [42, 43]. The problem in the case we are studying is that, as players were leaving the game while others were constantly coming in, it is not straightforward to measure polarization. The fact that the number of votes for democracy could increase at a given moment did not mean that anarchists changed their opinion; it could be that new users were voting for democracy or simply that players who voted for anarchy stopped voting. Then, to properly measure polarization we consider four possible states for each user. They are defined by both the current vote of the player and the immediately previous one (note that we have removed players who only voted once, but this does not affect the measure of the position of the meter, see Additional file 9): \(A\rightarrow A\), first anarchy then anarchy; \(A \rightarrow D\), first anarchy then democracy; \(D \rightarrow D\), first democracy then democracy; \(D \rightarrow A\), first democracy then anarchy. As we can see in Figs. 5(A) and 5(C), the communities are very polarized, with very few individuals changing their votes. The fraction of users changing from anarchy to democracy is always lower than 5%, which indicates that anarchists form a very closed group. Similarly, the fraction of users changing from democracy to anarchy is also very low, although there are clear bursts when the crowd exits democracy mode. This reflects that those who changed their vote from anarchy to democracy did so to achieve a particular goal, such as going through a maze, and once they achieved the target they instantly lost interest in democracy.
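The four states can be extracted from a chronologically ordered vote log as sketched below. The input layout, a sequence of (user, vote) pairs with votes written as "A" or "D", and the function name are assumptions; the counts would then be normalized within each time window to obtain the fractions shown in the polar plots.

```python
from collections import Counter


def transition_counts(votes):
    """Count A->A, A->D, D->D and D->A transitions in an ordered vote log.

    votes: iterable of (user, vote) pairs, vote being "A" or "D".
    """
    last_vote = {}
    counts = Counter()
    for user, vote in votes:
        if user in last_vote:
            counts[last_vote[user] + "->" + vote] += 1
        # A user's first vote has no previous state, so users who voted
        # only once never contribute a transition.
        last_vote[user] = vote
    return counts
```

For instance, a user voting A, A, D contributes one `A->A` and one `A->D`, while a user who voted D a single time contributes nothing.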

With such a degree of polarization, the next question is how it was possible for the crowd to change from one mode to the other. To answer it, we shift our attention to the number of votes. In Fig. 5(B) we can see that every time the meter gets above the democracy threshold it is preceded by an increase in the total number of votes. Then, once under democracy mode, the total number of votes decays very fast. Finally, there is another increment before entering anarchy mode again. Thus, it is clear that every time democrats were able to enter their mode they stopped voting and started playing. This let anarchists regain control even though they were fewer users, leading to a sharp decay of the tug-of-war meter. Once the game exited democracy mode, democrats started to vote again to try to set the game back into democracy mode. In Fig. 5(D) we can initially see a similar behavior in the short periods when democracy was installed. However, there is a wider area where the crowd accepted democracy, which marks the safari zone mentioned previously. Interestingly, we can see how democrats learned how to keep their mode active. Initially there was the same drop in users voting and in the position of the meter seen in the other attempts. This forced democrats to keep voting instead of playing, which allowed them to retain control for longer. A few minutes later the number of votes decays again, but in this case the position of the meter is barely modified, probably due to anarchists finally accepting that they needed democracy mode to finish this part. Even though they might have implicitly accepted democracy, it is worth noting that the transitions \(A \rightarrow D\) are minimal (Fig. 5(C)). Finally, once the mission for which democracy mode was needed was finished, there is a sharp increment in the fraction of transitions \(D \rightarrow A\).