Identifying individuals who are influential in diffusing information, ideas or products in a population remains a challenging problem. Most extant work can be abstracted as a process in which researchers first decide which features describe an influencer and then identify influencers as the individuals with the highest values of these features. This makes the identification dependent on the relevance of the selected features, and it remains uncertain whether triggering the identified influencers leads to a behavioural change in others. Furthermore, most work was developed for cross-sectional or time-aggregated datasets, in which the time evolution of influence processes cannot be observed. We show that mapping influencer identification to a wisdom of crowds problem overcomes these limitations. We present a framework in which the individuals in a social group repeatedly evaluate the contributions of other members according to what they perceive as valuable, not according to predefined features. We propose a method to aggregate the behavioural reactions of the members of the social group into a collective judgement that takes the temporal variation of influence processes into account. Using data from three large news providers, we show that the members of the group agree to a surprising extent on who the influential individuals are. The aggregation method addresses different sources of heterogeneity encountered in social systems and leads to results that are easily interpretable and comparable within and across systems. The approach we propose is computationally scalable and can be applied to any social system where behavioural reactions are observable.

The aggregation method we propose: (1) takes the temporal variation of the influence process into account; (2) addresses different sources of heterogeneity specific to social systems; (3) leads to results that are interpretable and comparable within and across systems. To illustrate our approach, we collected data on online news discussions from three large, independent news providers: CNN, The Atlantic and The Telegraph. We show that, following our approach, it is straightforward to reveal the users who are consistently the most influential. The method is computationally scalable and can be applied to any social system where such behavioural reactions can be observed. Our results show that under this mapping, the temporality of the data in fact simplifies influencer identification. This supports a recent study [25] which shows that temporal complexity may in fact simplify certain problems if seen from the right perspective.

We present a framework in which the individuals in a social group repeatedly evaluate the contributions of other members according to what they perceive as valuable, and we propose a method to aggregate the individual evaluations into a collective judgement. In doing so, we incorporate the behavioural reactions of the social group into the influencer identification. This allows us to make no assumptions about which features of influencers are relevant, and instead to let each individual decide on their own, based on the preferences and beliefs held at that point in time.

In this article, we show that mapping the influencer identification problem to a wisdom of crowds problem overcomes these limitations. The wisdom of crowds phenomenon [27] was first described by Galton [28], who observed that social groups can make more accurate collective judgements than expert individuals [27, 28]. Since then, this phenomenon has raised great interest among both researchers and practitioners. People have been shown to make surprisingly accurate judgements when their opinions are aggregated, and this concept has been applied to a large variety of problems, from prediction markets to informed policy making [27, 29]. The idea has also made its way into mainstream applications, being an important mechanism behind content creation on social information sites such as Wikipedia, Quora or Stack Overflow.

In general, influencers can be described by a combination of three factors: personification of values (who one is), competence (what one knows) and strategic network location [2, 6]. Most existing identification methods are constructed by selecting one or several features belonging to these factors and identifying the individuals with the highest values of these features. Such features range from psychological traits [2] to expertise [21] or position in the social network (e.g. betweenness centrality [14], eigenvector centrality [13], node accessibility [17], k-shell [16], dynamical influence [18], expected force [19] or collective influence [20]). An important limitation of this kind of approach is that the selection of relevant features is made a priori by the researcher or practitioner, according to their own subjective preferences; the identification of influencers therefore relies strongly on the assumed relevance of the selected features. Furthermore, most proposed features do not incorporate the behavioural reactions of the members of the social group to the actions of others, which makes it even more difficult to identify a best set of features. Hinz et al. [9] have shown that influencers identified as individuals with either high degree or high betweenness centrality are better spreaders of information than individuals with low degree. On the other hand, Watts et al. [5] have shown that, except in a few rather uncommon cases, influencers identified as central nodes in the social network are not significantly more influential than peripheral ones. This evidence against a universal set of features describing influencers can be explained by the complexity of the influence process. Personal influence has been shown to operate through several latent mechanisms (e.g. contact, socialisation, status competition, social norms) which have a different impact across the five stages of the decision process (knowledge, persuasion, decision, implementation, confirmation) [22, 23]. Under these circumstances, selecting a set of features that describe influencers is difficult without detailed knowledge of the context in which the influence process takes place. Furthermore, most methods can only be applied to a time-aggregated dataset [13, 14, 16–20], which neglects the inherent temporal nature of influence relationships. There exist several attempts to extend these methods to the temporal case [24], but the problem is far from solved. The widespread belief is that adding a temporal layer leads to much more complex objects, whose study requires the development of sophisticated tools [25, 26].

Firms, political parties and organisations increasingly rely on engineering social contagion to spread products, ideas or behaviours. For more than half a century, researchers have realised that a relatively small number of people can have a great impact on the opinions and behaviour of many others. The concept of opinion leaders (influencers or influentials [1, 2]) was first introduced by Katz [3] in the study of the two-step model of communication flow between the mass media and the public, and since then it has been revisited in a plethora of studies across many academic disciplines [4–12]. Extensive research has shown that influencers drive new product adoption [4, 6, 8, 9], public health policies [12] and voting behaviour [2]. In consequence, a large body of literature has been devoted to the identification of influencers [7, 13–20], which is still considered today one of the most important and challenging problems [7, 12, 20].

The IP is at its root a statistical aggregation and, like any statistical measure, is susceptible to bias arising from small samples. This bias can be induced in two ways: (1) if an event has few participants; (2) if an individual takes part in a low number of events. In events with few participants (and thus implicitly few judges), the IP scores might be biased, as they violate one of the critical assumptions behind the wisdom of crowds: a large number of evaluations. To address this, we penalise event ranks obtained in small events by introducing the constant term c in the event rank normalisation. By changing c, one can emphasise or diminish the role of the event size in computing the IP. S10 Fig illustrates the impact c has on the event ranks. For large c, high IP values can only be obtained in the limit of large events, while for small c the effect of the event size on the event rank is negligible. This has practical implications for studying dynamical processes such as information propagation, where the size of the susceptible population plays an important role. The second source of small-sample bias is a small number of events attended by an individual. In this case the IP might not be informative of the latent potential to influence, as results obtained by aggregating few data points are subject to randomness. We address this by setting a threshold on the minimum number of events attended and removing from the analysis those individuals who attended fewer. The threshold can be seen as a measure of confidence in the results: the higher the threshold, the higher the minimum number of events attended by each individual and thus the lower the likelihood that high vote scores are obtained in most events by chance.
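The size penalty can be made concrete with a small sketch (the event sizes and values of c below are illustrative, not taken from the datasets): the best-ranked participant in an event of size n_t receives event rank n_t/(n_t + c), so larger c pushes top ranks in small events further from one.

```python
# Effect of the additive constant c on the event rank of the top-ranked
# participant: rank(i) = n_t for the best participant, so
# R_t(i) = n_t / (n_t + c).  Illustrative sizes, not from the datasets.
def top_event_rank(n_t: int, c: float) -> float:
    """Event rank of the highest-voted participant in an event of size n_t."""
    return n_t / (n_t + c)

for c in (1, 50):
    for n_t in (5, 500):
        print(f"c={c:>2}, n_t={n_t:>3}: top rank = {top_event_rank(n_t, c):.3f}")
```

With c = 1 the top rank is close to one regardless of event size, while with c = 50 a high event rank is effectively attainable only in large events, matching the behaviour described for S10 Fig.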

The variance term is introduced to penalise a lack of consistency in the ranks obtained. The IP reflects the extent to which most participants in an event consistently appreciated the contribution of the individual each time he was active. Notice that we do not impose a criterion on how the contribution should be evaluated. The IP is bounded in the interval [0, 1] (see S3 Appendix). A value close to zero is obtained for individuals who either consistently rank low in the vote distribution or have a high variation in the vote scores across the events they participated in. Such individuals have a low potential to influence: either their contribution is rarely appreciated, or it is appreciated with a high level of uncertainty, questioning their inherent abilities. On the other hand, a value close to one can only be obtained by individuals who always collect the most votes in the events they participate in. Such individuals have a high potential to influence, as, due to some construct we do not directly observe, they always attract the highest evaluation. An implicit assumption made in Eq 1 is that the activity of an individual (defined as the number of events attended) is not by itself informative of the latent potential to influence, but is rather an opportunity for that latent potential to be manifested.
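A toy numerical example (the event-rank sequences below are invented for illustration) shows how the variance term separates two individuals with the same mean event rank:

```python
# Two hypothetical individuals with the same mean event rank (0.8) over
# ten events; only the consistent one keeps an IP close to its mean.
from statistics import mean, pvariance

consistent = [0.8] * 10       # same event rank every time
volatile = [1.0, 0.6] * 5     # alternates between top and middling ranks

def ip(ranks):
    """IP = mean event rank minus its (population) variance, as in Eq 1."""
    return mean(ranks) - pvariance(ranks)

print(ip(consistent))  # ≈ 0.8 (zero variance, no penalty)
print(ip(volatile))    # ≈ 0.8 - 0.04 = 0.76 (penalised for inconsistency)
```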

We consider that individuals become influential due to a latent construct they possess, which reflects their knowledge and skills, as well as their preferences and beliefs. We call this unobserved construct the latent potential to influence. This potential is revealed during social interactions and can be evaluated by other participants through a voting system (e.g. up-votes on discussion platforms). While traditional methods use features set a priori by the practitioner or the scientist, our method uses the crowd's judgement, expressed through votes. In this way we incorporate the behavioural reactions of others into the influencer identification. Operationalising influence in terms of votes reflects both the heterogeneity in skills and knowledge among contributors and the heterogeneity in preferences and beliefs among evaluators. The latent potential to influence is uncovered by aggregating the individual evaluations. Commonly used aggregation methods include the total number of positive evaluations (variations of which are commonly used on social information sites), the mean and the median [28]. However, when applied to systems characterised by heavy-tailed distributions of the variables describing the system (like many social media platforms), such methods might be biased, as the quantities they aggregate are not directly comparable. To address these shortcomings, we develop a new aggregation method, the influence potential (IP). In the remainder of the article we use the term event to describe a time window capturing social interactions between individuals. Without loss of generality, the events take place at different points in time, which implies there is a sense of temporality in the data. However, this assumption is not restrictive, and the events can also be concurrent.
For every event we rank all participants in increasing order of the votes received and compute the event rank of an individual as the rank normalised by the total number of participants in the event plus a constant. Formally, the event rank of individual i in event t is defined as R_t(i) = rank(i)/(n_t + c), where n_t is the number of participants in event t (the event size) and c is an additive constant which controls for the event size. Further, let T_i be the set of events in which i participated. The influence potential (IP) of an individual is the mean normalised rank over the events in which he participated, minus the respective variance. That is, the influence potential of individual i is

IP(i) = (1/|T_i|) Σ_{t ∈ T_i} R_t(i) − Var_{t ∈ T_i}[R_t(i)]    (1)
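A minimal sketch of this computation, under two assumptions of ours not specified in the text: each event is represented as a mapping from user id to votes received, and ties in votes are broken by sort order.

```python
# Sketch of the influence potential (IP) computation.  Each event is a
# dict mapping user id -> votes received in that event; the constant c
# and the minimum-events threshold follow the article.
from collections import defaultdict
from statistics import mean, pvariance

def event_ranks(events, c=1):
    """Normalised event ranks R_t(i) = rank(i) / (n_t + c), per user."""
    ranks = defaultdict(list)
    for votes in events:
        n_t = len(votes)
        # Rank participants in increasing order of votes (1 = fewest votes).
        ordered = sorted(votes, key=votes.get)
        for rank, user in enumerate(ordered, start=1):
            ranks[user].append(rank / (n_t + c))
    return ranks

def influence_potential(events, c=1, min_events=10):
    """IP(i) = mean of R_t(i) over i's events minus the (population) variance,
    keeping only users who attended at least min_events events."""
    ranks = event_ranks(events, c)
    return {user: mean(rs) - pvariance(rs)
            for user, rs in ranks.items() if len(rs) >= min_events}
```

For example, an individual who is top-ranked in every one of ten three-participant events obtains, with c = 1, an event rank of 3/4 each time and zero variance, hence an IP of 0.75.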

Results

Data collection

We collected the complete history of online discussions over a long period of time from three large news providers: CNN, The Atlantic and The Telegraph. Such platforms offer an interactive environment in which users can express their views, engage in discussions and possibly shape others' views on the topic. Registered users can post comments in discussion threads and, at the same time, react to and evaluate the quality of the posts through a voting tool provided by the platform. The default ordering of the posts on the platform is determined by the number of votes received. The discussions cover a broad range of topics, with each thread belonging to one topic category which defines the overall topic of the discussion (e.g. politics, business, etc.). The categories are defined by the news providers and are directly available on the website. Discussion threads for which it was not possible to identify the category have been omitted from the analysis. All platforms are comparable in terms of user experience, as they are based on the same technology, provided by Disqus. An overview of the three datasets can be found in Table 1 (approximate figures) and a detailed description of the categories in S1–S3 Tables. In our terminology, the discussion threads represent the events, the contribution of an individual in an event is defined by the total number of posts made in the thread, and the evaluation of the contribution is defined by the number of up-votes received by all posts made in the thread.


Table 1. Overview of data analyzed. https://doi.org/10.1371/journal.pone.0200109.t001

Identification of influencers

We investigate whether, for each topic, there are individuals who consistently receive the most votes each time they are active. In the remainder of the article we use c = 1 and consider only individuals who participate in at least 10 events. In S1 Fig we show that the results are robust to the choice of c, and later in the article we show that the IP is robust to the number of events observed per individual. In Fig 1, upper panels, we show the relationship between the mean event rank (x axis) and the corresponding variance (y axis). There are several individuals with a high mean event rank and a low variance (illustrated with a dark blue colour code). In consequence, in the lower panels of Fig 1 we observe a heavy-tailed distribution of the IP, with several individuals having high values. This is consistent with existing literature, which states that influencers are few compared to the entire population [5]. This result is rather surprising, as we would expect a high level of disagreement between the participants in an event, because what constitutes a valuable contribution is decided by each individual on his own, based on his own preferences and beliefs. We later consider two parsimonious mechanisms and show that neither can completely explain the results.


Fig 1. Identification of influencers. Data is pooled from all categories. An individual can be described by multiple data points, each related to his performance in one category. Number of observations: CNN (115,186), Atlantic (20,136), Telegraph (102,795). The colour of the points is given by the IP. Upper panels: relationship between the mean event rank (x axis) and the corresponding variance (y axis); there is an inverted U-shape relationship between the mean and the variance of the event ranks, and the individuals with a high mean and a low variance have the highest IP. Lower panels: distribution of the IP. https://doi.org/10.1371/journal.pone.0200109.g001

Zooming in on topic categories

We now investigate how the nomination of influencers varies across topic categories. By doing so, we are able to identify category influencers. Fig 2 contains a boxplot of the highest 100 IP scores within each category. To ease the representation, for each dataset we selected the ten categories with the highest number of users. We re-labelled each category according to its ranking in terms of number of users among the categories within the same dataset, C1 representing the largest. The list of abbreviations, together with the number of users in each category, can be found in S1–S3 Tables. Fig 2 shows there is a considerable difference between the highest influencer scores across the different categories (p < 10^−16, n = 1000, ANOVA test). For example, on the CNN platform, in the first five categories (C1–C5: world, us, opinion, politics, justice) the influencer scores of the top 100 individuals are considerably higher than in the following five (C6–C10: showbiz, tech, health, travel, living). This has practical implications for designing intervention campaigns based on targeting influencers, as it shows that within the same system, the extent to which individuals agree on who and what is influential might change depending on the context. Very often in the literature it is assumed that the extent to which an individual is influential is determined by one or several features he possesses [2, 6], neglecting how, for the same individual, the impact of these features on his perceived influence can vary across different settings. In Fig 3 we selected the five individuals with the highest IP in each of the five largest categories and plotted their IP scores across the categories. If an individual did not participate in a category, it is represented by a blank cell in the figure. It can be seen that: (1) most individuals participate in very few categories and (2) individuals who participate in more categories have high IP scores in only a few.
To explore this effect further, we considered the highest-ranked 100 individuals in each of the five categories (N = 500) and computed the pairwise Pearson correlation between the scores. S15 Fig shows a high variation in the pairwise correlation between the categories in all three datasets. Taken together, the results suggest that individuals who are influential across topics are hard to find, possibly because an important component of the latent potential to influence is topic expertise [2]. This finding is in line with early studies which showed that opinion leadership is topic dependent, with different degrees of overlap between topics [30]. However, in recent studies this is very often neglected, as influencers are mostly identified based on only one (often structural) feature [13, 14, 16–20]. Targeting, for example, a well-connected individual who is an expert in politics to promote a healthy behaviour has a high risk of failing.


Fig 2. Influencers within topic categories. The x axis represents the topic category. The y axis represents the IP scores of the 100 individuals with the highest IP within the category. The categories are ordered by the number of users. https://doi.org/10.1371/journal.pone.0200109.g002


Fig 3. Influencers across categories. The x axis represents the IP scores of the top five individuals with the highest IP from the five largest topic categories on each platform. Several individuals appear in the top five in more than one category, thus the number of observations differs across datasets (CNN: 18, Atlantic: 22, Telegraph: 25). The y axis represents the topic category. If an individual did not participate in a category, it is represented by a blank cell. Most individuals participate in few categories. Individuals who participate in more categories have high IP scores in only a few. https://doi.org/10.1371/journal.pone.0200109.g003

Different aggregation methods

We compare our aggregation method against three alternatives often encountered in research and practice: the total number of votes (regularly used on social information sites to rank users), the mean and the median [28] number of votes. For every topic, we rank all users according to the four methods and calculate the degree of overlap between the highest-ranked users. Fig 4 shows the results. The total number of votes leads to considerably different results, with the lowest overlap with the other methods. One reason is that this method controls neither for the difference in activity between individuals nor for the difference in size between events. On the other hand, the highest similarity can be observed between the mean and the median. Both methods control for the difference in activity between individuals, but not for the event size. In addition, the median is not sensitive to extreme evaluations, which can explain the larger difference observed in the CNN dataset. The IP is closest to the median, with a significant but not high overlap between the two. Compared to existing aggregation methods, the IP has several appealing features.


Fig 4. Comparison of results under different aggregation methods. We compare the overlap between the highest-ranked individuals by different methods. The x axis represents the number of highest-ranked individuals. The y axis represents the overlap between the highest-ranked individuals under two methods. Data is pooled from all topic categories. The mean and the median are the most similar. The IP is closest to the median. https://doi.org/10.1371/journal.pone.0200109.g004

First, penalising the mean event rank through the variance term in Eq 1 allows us to identify individuals who consistently rank high and thus consistently outperform others in obtaining the crowd's votes. Second, it addresses different sources of heterogeneity often encountered in social systems. A predominant characteristic of most social systems (including news platforms) is a heavy-tailed distribution of the variables describing the system. S2–S4 Figs show there is a large difference in the number of participants across events. As the total number of votes in an event is proportional to the event size (S5 Fig), we cannot directly compare the number of votes received in events of different sizes. If we use the mean or the median to aggregate the votes, we make the implicit assumption that the votes obtained are comparable, and thus that one vote in an event is worth the same as one vote in any other event. However, this assumption is questionable, as the events are heterogeneous in the total number of votes cast, even for the same event size (see S5 Fig). Thus, in some events it could be easier to obtain votes than in others. In consequence, an aggregation method like the mean or the median could be biased towards participation in large events.
This is illustrated in S14 Fig, where we can see that for two individuals with the same mean number of up-votes, the ranking in terms of votes obtained within a thread is very different. By using normalised rankings, the IP always considers the vote scores relative to the event, and thus controls for the total number of votes cast in the event. In doing so, one vote in an event where few people receive votes is worth more than one vote in an event where many people receive votes. Furthermore, the number of events attended by an individual is also described by a heavy-tailed distribution (S6–S8 Figs). This implies that comparing users in terms of the total number of votes received, as is done on most social information sites, favours individuals who are very active. To infer the latent potential to influence, our approach does not take into account the number of events attended (once a minimum number has been reached). In this way we are, to some extent, separating the tendency of individuals to be active from their latent potential to influence, making the influencer scores comparable across individuals with different levels of activity. Third, the aggregation method we propose provides normalised results that are easy to interpret and to compare within and across systems. Individuals who are influential have IP scores close to one, while non-influential individuals have scores close to zero. Because of this, the extent to which somebody is influential can be inferred directly from his IP score, without the need for additional information about the system, unlike with the other aggregation methods. For example, in order to understand whether a certain mean value is high or low, one needs information about the distribution of votes in the events.
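The overlap comparison between aggregation methods can be sketched as follows (the per-user vote lists are invented toy data, and defining overlap as the shared fraction of the two top-k sets is our assumption):

```python
# Rank users under two baseline aggregations (total and mean votes) and
# measure the overlap of their top-k sets.  Toy data contrasts a highly
# active user collecting many small vote counts with a rarely active one
# collecting a single large count.
from statistics import mean

def top_k(scores, k):
    """The k users with the highest aggregated score."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def overlap(scores_a, scores_b, k):
    """Fraction of the top-k users shared by two aggregation methods."""
    return len(top_k(scores_a, k) & top_k(scores_b, k)) / k

votes_by_user = {
    "active": [5] * 20,   # 20 events, 5 up-votes each: total 100, mean 5
    "expert": [50],       # 1 event, 50 up-votes: total 50, mean 50
    "casual": [8, 8],     # 2 events: total 16, mean 8
}
total = {u: sum(v) for u, v in votes_by_user.items()}
means = {u: mean(v) for u, v in votes_by_user.items()}

print(overlap(total, means, k=2))  # 0.5: the two methods disagree on one of two users
```

The total-votes ranking rewards sheer activity ("active" tops it), while the mean ranking rewards per-event performance ("expert" tops it), illustrating why the two methods produce the lowest overlap in Fig 4.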