Once we have a reliable measure of controversiality, not only can we find and rank controversial issues in WPs, but we can also begin to see important phenomena and common characteristics of wars and disputes. Here we report our findings on the temporal characteristics of edits on high and low controversiality pages. We make use of the fact that in the WP dump a timestamp with one-second precision is assigned to each edit. One month of activity (the time-line of all edits, irrespective of who performed them) on two sample articles is depicted in Figure 4.

West et al. [48] and Adler et al. [26] have developed vandalism detection methods based on temporal patterns of edits. In both studies the main assumption is that offensive edits are reverted much faster than normal edits, and therefore, by considering the time interval between an arbitrary edit and its subsequent reverts, one can classify vandalized versions with high precision.

Figure 14. Evolution of controversy measure with number of edits of Iran. The insets depict focuses of some of the local war periods. The measure is normalized to its final value. Cycles of peace and war appear consecutively, activated by internal and external causes.

Histogram of the number of edits between two successive war periods for a selected sample of 44 articles which are not driven by external events. The average value is 1300 edits.

Most of the articles are frequently edited. Figure 5 shows the empirical probability density function of the average time between two successive edits. As already noted in [35], edit frequency also depends on the controversiality of a page, and one expects higher edit frequency for more controversial pages. However, as Figure 6 makes clear, the correlation is quite weak.

Burstiness.

It is clear that edits are clustered: many edits are made in a rather short period, followed by a rather long period of silence. This feature is known in the literature as burstiness [49], [50], and is quantified, based on the coefficient of variation, by the simple formula

$$B = \frac{\sigma - \mu}{\sigma + \mu} \qquad (3)$$

where $\mu$ and $\sigma$ denote respectively the mean and standard deviation of the interval between successive edits. We have calculated $B$ for all the articles in the sample, considering all the edits made on them by any user. As can be seen from Figure 7, the overall burstiness of edits correlates rather weakly with controversiality.
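As an illustration of formula (3), the following Python sketch computes the burstiness of a single edit history from its timestamps. The function name `burstiness` and the toy timestamps are assumptions made for this example and are not part of the original analysis. The population standard deviation is used here, a choice that the text does not specify.

```python
from statistics import mean, pstdev


def burstiness(timestamps):
    """Return (sigma - mu) / (sigma + mu) for the gaps between sorted timestamps.

    mu and sigma are the mean and (population) standard deviation of the
    intervals between successive edits. Hypothetical helper for illustration.
    """
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    mu = mean(gaps)
    sigma = pstdev(gaps)
    if mu + sigma == 0:
        raise ValueError("all edits share a single timestamp")
    return (sigma - mu) / (sigma + mu)


# Perfectly regular edits: sigma is zero, so the value is -1.
regular = [0, 10, 20, 30, 40]
# Three quick edits, a long silence, then three more quick edits.
bursty = [0, 1, 2, 1000, 1001, 1002]

print(burstiness(regular))
print(burstiness(bursty))
```

The value ranges from -1 for perfectly periodic activity to values approaching 1 for strongly clustered activity, so the clustered toy history yields a clearly higher value than the regular one.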

Figure 16. Evolution of controversy measure with number of edits of Anarchism and Barack Obama. The measure is normalized to its final value. There is no consensus even for a short period and editorial wars continue nonstop. https://doi.org/10.1371/journal.pone.0038869.g016

To see the impact of controversiality on burstiness, we calculated $B$ for different groups of articles separately: Disputed articles, Listed articles coming from the List of controversial articles in WP [51], Randomly selected articles, and Featured articles (assumed to be least controversial given WP's stringent selection criteria for featuring an article). The histograms in Figure 8(A) show the PDF of $B$ in these four classes. As can be seen, the peaks are shifted to the right (higher $B$) for more controversial articles, but not strongly enough to base the detection of controversy on burstiness of editorial activity alone. Reverting is a useful tool to restore vandalized articles, but it is also a popular weapon in heated debates. Figure 8(B) shows the distribution of $B$ calculated not for all edits, but for reverts alone: the shift is now more marked. Finally, we considered an even stronger form of warfare: mutual reverts. It is evident that the temporal pattern of mutual reverts provides a better characterization of controversiality than that of all edits or all reverts, and the very visible shift observed in Figure 8(C) constitutes another, albeit less direct, justification of our decision to make mutual reverts the central element in our measure of controversiality.

Figure 17. Relative share of each category at different M. Blue: category (a), consensus. Red: category (b), multi-consensus. Yellow: category (c), never-ending war. For the precise definition of each category see the main text. https://doi.org/10.1371/journal.pone.0038869.g017

Figure 18. Length distribution of articles and talk pages with log-normal fits. The distribution of article length is better described by a log-normal distribution than the talk length distribution, which tends to be more like a power law. https://doi.org/10.1371/journal.pone.0038869.g018

To gain a better understanding of the microdynamics of edit wars, we selected two samples of 20 articles each, extracted from a pool of articles with an average interval between successive edits of 10 hours. One sample contains the most controversial articles in the pool, whereas the other contains the most peaceful ones. The probability distribution of the time between edits for these samples is shown in Figure 9. Both samples have a rather fat-tailed distribution with a shoulder (as observed both in the empirical data and the model calculation), indicating that a characteristic time of one day is present in the system. However, only the sample consisting of controversial articles displays a clear power-law distribution. All exponents were calculated by applying the Gnuplot implementation [52] of the nonlinear least-squares Marquardt-Levenberg algorithm [53] to the log-binned data, with an upper cut-off to avoid system size effects.
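The fitting procedure can be illustrated with a minimal Python sketch that log-bins a set of inter-edit intervals and estimates the slope on a log-log scale below an upper cut-off. It is not the procedure used above: ordinary linear least squares on the logarithms is swapped in for the Gnuplot Marquardt-Levenberg fit. The function names, the number of bins per decade, the cut-off value and the synthetic power-law sample (density proportional to the inverse square of the interval) are all assumptions made for this example.

```python
import math
import random


def log_binned_density(samples, bins_per_decade=4):
    """Histogram positive samples in logarithmically spaced bins.

    Returns (geometric bin centre, probability density) pairs for the
    non-empty bins, in increasing order of the centre.
    """
    lo = math.log10(min(samples))
    counts = {}
    for x in samples:
        k = int((math.log10(x) - lo) * bins_per_decade)
        counts[k] = counts.get(k, 0) + 1
    points = []
    for k in sorted(counts):
        left = 10 ** (lo + k / bins_per_decade)
        right = 10 ** (lo + (k + 1) / bins_per_decade)
        density = counts[k] / (len(samples) * (right - left))
        points.append((math.sqrt(left * right), density))
    return points


def loglog_slope(points, upper_cutoff):
    """Least-squares slope of log10(density) against log10(centre),
    using only bins whose centre does not exceed upper_cutoff."""
    xs, ys = [], []
    for centre, density in points:
        if centre <= upper_cutoff:
            xs.append(math.log10(centre))
            ys.append(math.log10(density))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic intervals drawn by inverse-transform sampling from a density
# proportional to x**-2 on [1, infinity); assumed data, not WP data.
rng = random.Random(0)
intervals = [1.0 / (1.0 - rng.random()) for _ in range(20000)]
slope = loglog_slope(log_binned_density(intervals), upper_cutoff=100.0)
print(round(slope, 2))
```

The fitted slope is the negative of the power-law exponent. The cut-off discards the sparsely populated bins in the tail, where finite sample size distorts the estimate.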

Figure 19. Scatter plot of talk page vs. article length. Color coding is according to the logarithm of the density of points. The correlation between the length of the article and the corresponding talk page is weak. https://doi.org/10.1371/journal.pone.0038869.g019

Figure 20. Scatter plot of talk page length vs. M. Color coding is according to the logarithm of the density of points. There is a rather clear correlation between the length of the talk page and the controversiality of the article. https://doi.org/10.1371/journal.pone.0038869.g020

To fit the data depicted in Figure 9, we used a model based on a queuing mechanism introduced in [50] and further developed in [54]. Here we briefly explain its basis and how we use it to model our empirical findings. Let us assume that there is a list of articles of fixed length and only one editor (mean-field approximation), who makes one edit at each step. With a fixed probability, the editor selects the article to edit according to a priority which is assigned randomly to each article after every edit on it; otherwise, the editor selects the article from the list at random, with no preference among choices. The key parameters are the list length, the probability of priority-based selection, and the real time associated with the model time step. Controversial articles are fitted well by a priority-selection probability close to 1 and a short list. Uncontroversial articles are fitted by a long list and a smaller priority-selection probability, in nice agreement with the real situation, where editors tend to edit a few controversial articles more intentionally and many peaceful articles in a more or less uncorrelated manner, with no bias and no memory. To check the validity of the model, we calculated the ratio of the number of controversial articles to the rest of the articles, which is in nice agreement with the fitting model parameters, 20/500 = 0.04.
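A minimal Python sketch of such a priority-queue mechanism is given below, under stated assumptions. The names `simulate_edit_queue`, `list_length` and `p_priority` are hypothetical. The rule that priority-based selection picks the article with the highest priority follows the usual formulation of the queuing model in [50]. Reading the values 20 and 500 as list lengths, and the two probability values, are assumptions chosen only for illustration and are not the fitted values.

```python
import random


def simulate_edit_queue(list_length, p_priority, steps, seed=0):
    """Simulate one editor working on a fixed list of articles.

    At each step the editor edits the highest-priority article with
    probability p_priority, and a uniformly random article otherwise.
    The edited article then receives a fresh random priority.
    Returns the intervals (in model steps) between successive edits
    of the same article.
    """
    rng = random.Random(seed)
    priority = [rng.random() for _ in range(list_length)]
    last_edit = [None] * list_length
    gaps = []
    for step in range(steps):
        if rng.random() < p_priority:
            article = max(range(list_length), key=priority.__getitem__)
        else:
            article = rng.randrange(list_length)
        if last_edit[article] is not None:
            gaps.append(step - last_edit[article])
        last_edit[article] = step
        priority[article] = rng.random()
    return gaps


# Illustrative (assumed) parameter choices: a short list edited almost
# always by priority, and a long list edited mostly at random.
gaps_short_list = simulate_edit_queue(list_length=20, p_priority=0.99, steps=20000)
gaps_long_list = simulate_edit_queue(list_length=500, p_priority=0.5, steps=20000)
print(len(gaps_short_list), max(gaps_short_list))
print(len(gaps_long_list), max(gaps_long_list))
```

The returned intervals are in model steps; comparing them with the empirical distribution requires multiplying by the real time associated with one step, which is the third fitting parameter mentioned in the text.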

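Another important characteristic quantity is the autocorrelation function $A(\tau)$. To calculate it, we first produce a binary 0/1 series $X(t)$ similar to the one in Figure 4. Then $A(\tau)$ is computed simply as

$$A(\tau) = \frac{\langle X(t)\,X(t+\tau)\rangle - \langle X(t)\rangle^{2}}{\langle X(t)^{2}\rangle - \langle X(t)\rangle^{2}} \qquad (4)$$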

Figure 21. Network representation of editors’ interactions in the discussion page of Safavid dynasty. Each circle is an editor, red arrows represent comments opposing the target editor, T-end green lines represent positive comments (agreeing with the other editor), and yellow lines with round end represent neutral comments. Line thickness is proportional to the number of times that the same interaction occurs. Data based entirely on subjective assessments (manual review). https://doi.org/10.1371/journal.pone.0038869.g021

where $\langle \cdot \rangle$ stands for the time average over the whole series. $A(\tau)$ for the same samples of controversial and peaceful articles is shown in Figure 10. We calculate the same quantity for a shuffled sequence of events as a reference. The shuffled sequence has the same time interval distribution as the original sequence, but with a randomized order in the occurrence of events. In both cases, a power law describes $A(\tau)$ very well. Usually it is assumed that a slow (power-law) decay of the autocorrelation function is an indicator of long-time memory processes. However, if independent random intervals taken from a power-law distribution separate the events, the resulting autocorrelation will also show power-law time dependence [55], [56]. Assuming that the exponent of the independent inter-event time distribution is $\gamma$ and the exponent of the decay of the correlation function is $\alpha$, we have the relationship $\alpha + \gamma = 2$. Deviations from this scaling law reflect intrinsic correlations in the events.
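The calculation of the autocorrelation function and of the shuffled reference can be sketched in Python as follows. This is a minimal illustration, assuming that event times have already been converted to integer time bins; the function names and the toy series are hypothetical and do not come from the original analysis.

```python
import random


def binary_series(event_times, length):
    """Return a 0/1 list with ones at the (integer) event times."""
    x = [0] * length
    for t in event_times:
        x[t] = 1
    return x


def autocorrelation(x, lag):
    """Normalized autocorrelation of the series x at the given lag.

    The mean and variance are time averages over the whole series; the
    lagged product is averaged over all pairs that fit in the series.
    """
    mean = sum(x) / len(x)
    variance = sum(v * v for v in x) / len(x) - mean ** 2
    if variance == 0:
        raise ValueError("series is constant")
    pairs = len(x) - lag
    lagged = sum(x[t] * x[t + lag] for t in range(pairs)) / pairs
    return (lagged - mean ** 2) / variance


def shuffled_events(event_times, seed=0):
    """Randomize the order of the inter-event intervals.

    The interval distribution is preserved; correlations between
    successive intervals are destroyed.
    """
    ts = sorted(event_times)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    random.Random(seed).shuffle(gaps)
    shuffled = [ts[0]]
    for gap in gaps:
        shuffled.append(shuffled[-1] + gap)
    return shuffled


events = [0, 1, 2, 3, 40, 41, 42, 90, 150, 151, 152, 153, 199]
series = binary_series(events, 200)
reference = binary_series(shuffled_events(events), 200)
for lag in (1, 2, 5, 10):
    print(lag, autocorrelation(series, lag), autocorrelation(reference, lag))
```

Because shuffling only permutes the intervals, the first and last events stay in place and the reference series has the same length and the same number of events as the original one.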

Figure 22. Average relative contribution of the top 5 most reverting pairs of editors, color coded for different M’s and n’s. For a wide range of articles and over a long period of their lives, this contribution is very close to 1, making clear the important role of the top 5 pairs of fighting editors. https://doi.org/10.1371/journal.pone.0038869.g022

There is another measure which indicates long-time correlations between the events even more sensitively. Take a period to be bursty if the time interval between each pair of successive edits is not larger than a window $\Delta t$, and define $E$ as the number of events in a bursty period. If events in the time series are independent and there is no memory in the system (i.e. in a Poisson process), one can easily show that the distribution $P(E)$ should have an exponential decay, whereas in the presence of long-range memory the decay takes a power-law form [56]. In Figure 11, $P(E)$ is shown for samples of highly controversial and peaceful articles. In the high controversy sample a well-defined slope of -2.83 is observed, while in the low controversy sample edits are more independent and $P(E)$ is very close to the one obtained for the shuffled sequence. Note that by shuffling the sequence of time intervals, all the correlations are eliminated and the resulting sequence should mimic the features of an uncorrelated occurrence of the intervals.
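The counting of events in bursty periods can be sketched in Python as follows. The function names, the window value and the toy timestamps are assumptions made for this example; an isolated edit is counted as a bursty period containing a single event.

```python
from collections import Counter


def burst_sizes(timestamps, window):
    """Group edits into bursty periods and return the size of each period.

    Two successive edits belong to the same period when the interval
    between them is not larger than `window`.
    """
    ts = sorted(timestamps)
    sizes = []
    current = 1
    for earlier, later in zip(ts, ts[1:]):
        if later - earlier <= window:
            current += 1
        else:
            sizes.append(current)
            current = 1
    if ts:
        sizes.append(current)
    return sizes


def size_distribution(sizes):
    """Empirical probability of each bursty-period size."""
    counts = Counter(sizes)
    total = len(sizes)
    return {size: count / total for size, count in sorted(counts.items())}


edits = [0, 1, 2, 50, 51, 200]
sizes = burst_sizes(edits, window=5)
print(sizes)
print(size_distribution(sizes))
```

Applying the same two functions to a shuffled sequence of intervals gives the uncorrelated reference against which the empirical distribution is compared.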

The same measurements are performed for a sample of users; see Figure S1 in Supporting Information. Table 1 reports a summary of the scaling exponents for both article samples and for users.

The simplest explanation of these results is that conflicts induce correlations in the editing history of articles. This can already be seen in Figure 10, where shuffling influences the decay of the autocorrelation functions much more for high-$M$ articles than for low-$M$ ones. For the more sensitive measure $P(E)$, the original and the shuffled data are again quite close to each other in the low-$M$ case, while a power-law type decay can be observed in the empirical data for high-$M$ articles.