1. Surowicki, J. The Wisdom of Crowds. Why the Many Are Smarter Than the Few (Doubleday Books, New York, NY, 2004).

2. Page, S. E. The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies (Princeton Univ. Press, Princeton, NJ, 2007).

3. Clemen, R. T. Combining forecasts: a review and annotated bibliography. Int. J. Forecast. 5, 559–583 (1989).

4. Armstrong, J. S. in Principles of Forecasting: A Handbook for Researchers and Practitioners (ed. Armstrong, J. S.) 417–439 (Kluwer Academic, Norwell, MA, 2001).

5. Timmermann, A. in Handbook of Economic Forecasting Vol. 1 (eds Elliot, G. et al.) 135–196 (Elsevier, Amsterdam, 2006).

6. Kurvers, R. H. J. M., Krause, J., Argenziano, G., Zalaudek, I. & Wolf, M. Detection accuracy of collective intelligence assessments for skin cancer diagnosis. JAMA Dermatol. 151, 1346–1353 (2015).

7. Wolf, M., Krause, J., Carney, P. A., Bogart, A. & Kurvers, R. H. J. M. Collective intelligence meets medical decision-making: the collective outperforms the best radiologist. PLoS ONE 10, e0134269 (2015).

8. Kurvers, R. H. J. M. et al. Boosting medical diagnostics by pooling independent judgments. Proc. Natl Acad. Sci. USA 113, 8777–8782 (2016).

9. Kämmer, J. E., Hautz, W. E., Herzog, S. M., Kunina-Habenicht, O. & Kurvers, R. H. J. M. The potential of collective intelligence in emergency medicine: pooling medical students’ independent decisions improves diagnostic performance. Med. Decis. Making 37, 715–724 (2017).

10. Sanders, F. On subjective probability forecasting. J. Appl. Meteorol. 2, 191–201 (1963).

11. Staël von Holstein, C.-A. An experiment in probabilistic weather forecasting. J. Appl. Meteorol. 10, 635–645 (1971).

12. Vislocky, R. L. & Fritsch, J. M. Improved model output statistics forecasts through model consensus. Bull. Am. Meteorol. Soc. 76, 1157–1164 (1995).

13. Baars, J. A. & Mass, C. F. Performance of national weather service forecasts compared to operational, consensus, and weighted model output statistics. Weather Forecast. 20, 1034–1047 (2005).

14. Vul, E. & Pashler, H. Measuring the crowd within: probabilistic representations within individuals. Psychol. Sci. 19, 645–647 (2008).

15. Kelley, T. L. The applicability of the Spearman–Brown formula for the measurement of reliability. J. Educ. Psychol. 16, 300–303 (1925).

16. Stroop, J. R. Is the judgment of the group better than that of the average member of the group? J. Exp. Psychol. 15, 550–562 (1932).

17. Preston, M. G. Note on the reliability and the validity of the group judgment. J. Exp. Psychol. 22, 462–471 (1938).

18. Eysenck, H. J. The validity of judgments as a function of the number of judges. J. Exp. Psychol. 25, 650–654 (1939).

19. Hogarth, R. M. A note on aggregating opinions. Organ. Behav. Hum. Perform. 21, 40–46 (1978).

20. Galton, F. Vox populi. Nature 75, 450–451 (1907).

21. Galton, F. The ballot-box. Nature 75, 509–510 (1907).

22. Galton, F. Memories of My Life (Methuen & Co, London, 1908).

23. Gordon, K. Group judgments in the field of lifted weights. J. Exp. Psychol. 7, 398–400 (1924).

24. Jenness, A. The role of discussion in changing opinion regarding a matter of fact. J. Abnorm. Soc. Psychol. 27, 279–296 (1932).

25. Gordon, K. Further observations on group judgments of lifted weights. J. Psychol. 1, 105–115 (1935).

26. Klugman, S. F. Group judgments for familiar and unfamiliar materials. J. Gen. Psychol. 32, 103–110 (1945).

27. Treynor, J. L. Market efficiency and the bean jar experiment. Financ. Anal. J. 43, 50–53 (1987).

28. Blackwell, C. & Pickford, R. The wisdom of the few or the wisdom of the many? An indirect test of the marginal trader hypothesis. J. Econ. Finan. 35, 164–180 (2011).

29. Lorenz, J., Rauhut, H., Schweitzer, F. & Helbing, D. How social influence can undermine the wisdom of crowd effect. Proc. Natl Acad. Sci. USA 108, 9020–9025 (2011).

30. Ariely, D. et al. The effects of averaging subjective probability estimates between and within judges. J. Exp. Psychol. Appl. 6, 130–147 (2000).

31. Herzog, S. M. & Hertwig, R. The wisdom of many in one mind: Improving individual judgments with dialectical bootstrapping. Psychol. Sci. 20, 231–237 (2009).

32. Müller-Trede, J. Repeated judgment sampling: boundaries. Judgm. Decis. Mak. 6, 283–294 (2011).

33. Rauhut, H. & Lorenz, J. The wisdom of crowds in one mind: how individuals can simulate the knowledge of diverse societies to reach better decisions. J. Math. Psychol. 55, 191–197 (2011).

34. Herzog, S. M. & Hertwig, R. Think twice and then: combining or choosing in dialectical bootstrapping? J. Exp. Psychol. Learn. Mem. Cogn. 40, 218–232 (2014).

35. Krueger, J. I. & Chen, L. J. The first cut is the deepest: effects of social projection and dialectical bootstrapping on judgmental accuracy. Soc. Cogn. 32, 315–336 (2014).

36. Herzog, S. M. & Hertwig, R. Harnessing the wisdom of the inner crowd. Trends Cogn. Sci. 18, 504–506 (2014).

37. Dehaene, S., Izard, V., Spelke, E. & Pica, P. Log or linear? Distinct intuitions of the number scale in Western and Amazonian indigene cultures. Science 320, 1217–1220 (2008).

38. Dehaene, S. Number Sense. How the Mind Creates Mathematics (Oxford Univ. Press, Oxford, 1997).

39. Nieder, A. Counting on neurons: the neurobiology of numerical competence. Nat. Rev. Neurosci. 6, 177–190 (2005).

40. Siegler, R. S. & Opfer, J. E. The development of numerical estimation: evidence for multiple representations of numerical quantity. Psychol. Sci. 14, 237–243 (2003).

41. Siegler, R. S. & Booth, J. L. Development of numerical estimation in young children. Child Dev. 75, 428–444 (2004).

42. Booth, J. L. & Siegler, R. S. Developmental and individual differences in pure numerical estimation. Dev. Psychol. 42, 189–201 (2006).

43. Bertelli, I., Lucangeli, D., Piazza, M., Dehaene, S. & Zorzi, M. Numerical estimation in preschoolers. Dev. Psychol. 46, 545–551 (2010).

44. Hooker, R. Mean or median. Nature 75, 487–488 (1907).

45. Genest, C. & Zidek, J. V. Combining probability distributions: a critique and an annotated bibliography. Stat. Sci. 1, 114–135 (1986).

46. Dawid, A. P. et al. Coherent combination of experts’ opinions. Test 4, 263–313 (1995).

47. Genre, V., Kenny, G., Meyler, A. & Timmermann, A. Combining expert forecasts: can anything beat the simple average? Int. J. Forecast. 29, 108–121 (2013).

48. Baron, J., Mellers, B. A., Tetlock, P. E., Stone, E. & Ungar, L. H. Two reasons to make aggregated probability forecasts more extreme. Decis. Anal. 11, 133–145 (2014).

49. Satopää, V. A. et al. Combining multiple probability predictions using a simple logit model. Int. J. Forecast. 30, 344–356 (2014).

50. Larrick, R. P. & Soll, J. B. Intuitions about combining opinions: misappreciation of the averaging principle. Manage. Sci. 52, 111–127 (2006).

51. Mannes, A. E. Are we wise about the wisdom of crowds? The use of group judgments in belief revision. Manage. Sci. 55, 1267–1279 (2009).

52. Fraundorf, S. H. & Benjamin, A. S. Knowing the crowd within: metacognitive limits on combining multiple judgments. J. Mem. Lang. 71, 17–38 (2014).

53. Hourihan, K. L. & Benjamin, A. S. Smaller is better (when sampling from the crowd within): low memory-span individuals benefit more from multiple opportunities for estimation. J. Exp. Psychol. Learn. Mem. Cogn. 36, 1068–1074 (2010).

54. Steegen, S., Dewitte, L., Tuerlinckx, F. & Vanpaemel, W. Measuring the crowd within again: a pre-registered replication study. Front. Psychol. 5, 786 (2014).