Significance The secular rise in intelligence across birth cohorts is one of the most widely documented facts in psychology. This finding is important because intelligence is a key predictor of many outcomes such as education, occupation, and income. Although noncognitive skills may be equally important, there is little evidence on the long-term trends in noncognitive skills due to lack of data on consistently measured noncognitive skills of representative populations of successive cohorts. Using test score data based on an unchanged test taken by the population of Finnish military conscripts, we find steady positive trends in personality traits that are associated with high income. These trends are similar in magnitude and economic importance to the simultaneous rise in intelligence.

Abstract Although trends in many physical characteristics and cognitive capabilities of modern humans are well-documented, less is known about how personality traits have evolved over time. We analyze data from a standardized personality test administered to 79% of Finnish men born between 1962 and 1976 (n = 419,523) and find steady increases in personality traits that predict higher income in later life. The magnitudes of these trends are similar to the simultaneous increase in cognitive abilities, at 0.2–0.6 SD during the 15-y window. When anchored to earnings, the change in personality traits amounts to a 12% increase. Both personality and cognitive ability have consistent associations with family background, but the trends are similar across groups defined by parental income, parental education, number of siblings, and rural/urban status. Nevertheless, much of the trends in test scores can be attributed to changes in the family background composition, namely 33% for personality and 64% for cognitive ability. These composition effects are mostly due to improvements in parents’ education. We conclude that there is a “Flynn effect” for personality that mirrors the original Flynn effect for cognitive ability in magnitude and practical significance but is less driven by compositional changes in family background.

There are many well-documented trends in average physical characteristics and cognitive capabilities of modern humans. Average height and body mass index have been on the rise around the world (1–4). Average IQ scores have increased at a rate of 0.2 SD per decade since the 1950s (5). In this study, we document similar trends in economically valuable personality traits of young adult males, as measured by a standardized test.

Recent findings in economics and psychology show that personality traits, especially conscientiousness and neuroticism, are important predictors of outcomes such as education and income in various populations. The predictive power of personality tests can be higher or lower than that of IQ depending on the measures used (6–8). Although most studies have reported contemporaneous correlations, there is evidence that traits measured at adolescence predict educational attainment and adult income (9–13). Recent studies also show that employment growth has been strong in occupations that require high levels of social skills (14, 15).

Previous evidence on trends in personality traits has been constrained by a lack of high-quality data on representative samples of successive cohorts of the same source population. Comparisons of cross-sectional studies of US college students have shown positive trends over time in traits such as extraversion and narcissism (16–18). However, students who participate in surveys are known to differ systematically from those who do not participate in characteristics such as academic achievement and vocational interests (19–21). Moreover, the selectivity of college admissions has changed over time, which has changed the composition of college student populations by socioeconomic backgrounds (22, 23). There are some studies where the same personality test was given to different cohorts of the same source population at the same age (24–27), but generalizing their findings to wider populations is problematic due to self-selection of survey respondents (19, 20). On the other hand, researchers have used large and representative data on high school seniors in the United States. However, most items in this dataset measure social attitudes and personal values, and researchers have had to construct proxy measures for personality from a small number of items. Results have been mixed; some argue that personality traits have remained stable (28), whereas others claim to find increases in individualistic traits (29, 30).

Our data come from the Finnish Defense Forces (FDF), which has tested all military conscripts since 1982. Finnish men are drafted to military service in the year they turn 18, and most start their service at age 19 or 20. Both cognitive ability and personality tests are taken in the second week of military service in standardized group-administered conditions. Due to the comprehensive conscription system that grants relatively few exceptions, these data cover 79% of the population of Finnish men born between 1962 and 1976 (n = 419,523). We also have test data for three additional cohorts born between 1977 and 1979 who took the personality test at the local draft board. However, these test results may not be directly comparable with earlier cohorts due to differences in the testing environment. The test score data have been linked with information on later life income and demographic background variables derived from administrative registers and population censuses. We present the data in more detail in Materials and Methods.

In comparison with earlier work, our test score data have both strengths and weaknesses. The main strength is that we observe a large and stable fraction of Finnish men over birth cohorts and that the test items remained unchanged during the period we examine. This facilitates the interpretation of changes in test results across cohorts. The most serious weakness of our data is that they do not include women. We also do not have test results for men who chose civilian service or were exempted from service for medical reasons.

Another important limitation of the FDF personality test is that its scales do not directly correspond to standard personality scales and it has not been validated in a peer-reviewed journal. The FDF test measures eight traits (see legend of Fig. 1A). We conducted an online test using a short version of the test to see how these scales relate to the widely used Five-Factor Model (FFM) (see SI Appendix for details). The results from our convenience sample (n = 231) suggest that the FDF scales capture three of the FFM scales (extraversion, conscientiousness, and neuroticism) but not agreeableness and openness.

Fig. 1. Average scores for measures of (A) personality traits and (B) cognitive ability by birth year for native-born military conscripts in Finland. All scores are depicted in base year SDs, with base year means normalized at zero. The break in personality test scores reflects a change in test administration.
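The normalization described in the legend of Fig. 1 can be illustrated with a minimal sketch. It assumes that the base year is the earliest cohort (1962) and that the sample SD is used; neither choice is stated in the legend. The function name and the toy scores are hypothetical and are not taken from the FDF data.

```python
from statistics import mean, stdev


def standardize_to_base_year(scores_by_cohort, base_year):
    """Return each cohort's mean score in base-year SD units.

    scores_by_cohort: {birth year: list of raw test scores}
    The base-year mean maps to zero by construction.
    """
    base_scores = scores_by_cohort[base_year]
    base_mean = mean(base_scores)
    base_sd = stdev(base_scores)  # assumption: sample SD of the base cohort
    return {
        year: (mean(scores) - base_mean) / base_sd
        for year, scores in sorted(scores_by_cohort.items())
    }


# Toy raw scores (hypothetical, for illustration only)
toy_scores = {
    1962: [10, 12, 14, 16, 18],
    1969: [11, 13, 15, 17, 19],
    1976: [12, 14, 16, 18, 20],
}

trend = standardize_to_base_year(toy_scores, base_year=1962)
for year, value in trend.items():
    print(year, round(value, 2))
```

Because every cohort is scaled by the same base-year SD, differences between cohorts in this metric can be read directly as changes in SD units over the observation window.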

Conclusions We find a Flynn effect for personality, that is, a secular rise in personality traits that are associated with higher earnings. The fact that the trend is positive is clear from the way distributions of test scores shift up across birth cohorts. Various methods of quantifying the economic importance of these changes all point toward the trend in personality being similar in magnitude and economic importance to the rise in cognitive abilities. The trends in personality are also similar across levels of cognitive ability and across demographic subgroups.

Our results on traits related to extraversion (i.e., sociability and activity–energy) are consistent with studies reporting increasing levels of extraversion (16, 24–26). Our findings for conscientiousness-related traits are in agreement with findings from freshman psychology students at the University of Amsterdam between 1982 and 2007 (25) and from the Baltimore Longitudinal Study of Aging between 1989 and 2004 (43). We also found increasing levels of self-confidence. This trend is in contrast to findings from the Monitoring the Future study (28) but is in agreement with cross-temporal meta-analysis of US college students (17). A positive trend has been reported for narcissism at least in the United States (18). We cannot distinguish self-confidence associated with narcissism from self-esteem; we can only see that this measure of self-confidence predicts high earnings for the person himself.

Growing evidence suggests that the Flynn effect has ended and may have reversed in Western Europe (32, 33, 44–46). The last three birth cohorts in our data coincide with the peak in cognitive test scores in Finland (31). There is no clear trend for personality scores between these cohorts, which suggests that the end of the Flynn effect could also be reflected in personality traits. However, the data on these three birth cohorts are not fully comparable with our main data, and thus, it is not possible to draw strong conclusions from them.

The causes of the Flynn effect are still unclear (5), and our data do not reveal the ultimate cause of the cohort trends in personality either. Of course, we cannot distinguish between birth year and year of test as causal factors behind the trends. However, we can rule out trends in personality traits being mere reflections of changes in broadly defined socioeconomic backgrounds. Nevertheless, trends in background variables are indeed favorable and explain about two-thirds of the rise in cognitive ability and one-third of the trends in personality.

Materials and Methods Psychological Testing in the FDF. The FDF has tested all conscripts with a battery of psychological tests since 1955. Initially the test consisted of only a cognitive test that measured reading skills, mathematical skills, and logical reasoning skills. In 1982, the FDF introduced a personality test that measures eight personality traits. Test results are one of the criteria used in selecting conscripts for officer training. The validity of the test and its predictive power for successful military service have been evaluated in several internal reports of the FDF. The results of these (mainly unpublished) studies have been summarized and the test procedure described in detail in ref. 47. Only those who enter service take the tests; those who are exempted (e.g., on prior health grounds) and those who choose to do nonmilitary service do not take the test. Test results of professional military officers were retracted by the FDF.

Administration of the Test. Both the cognitive test and the personality test are administered in the second week of military service. The tests are organized in standardized group-administered conditions at all FDF units. Between 1995 and 2000, the personality test was instead administered at the call-up, on average 18 mo before entry into service. The purpose was to use the test scores in the placement of conscripts before they started their service. However, the results were not widely used for this purpose, and the FDF was concerned that test conditions at local draft boards were not sufficiently standardized. In 2001, the FDF reverted to testing conscripts at the start of service (47). The cognitive test has always been administered during military service. The test is a 2-h paper-and-pencil test where conscripts are asked to choose a correct alternative from a list (cognitive ability test) or to indicate whether they agree or disagree with statements (personality test).
Completed answer sheets are sent to the Finnish Defense Research Agency for optical scanning. The test leaflets were unchanged from 1982 to 2000 but have not been released by the FDF. In 2001, the personality test was revised, and both the content and the results of the new test remain classified. SI Appendix, Table S1 reports means and SDs for each test score by cohort, and SI Appendix, Figs. S2–S5 show the full distributions of the raw scores for both the personality and cognitive tests. Observed scores vary over the entire range of possible values. The distributions of cognitive test scores are roughly normal but those of personality test scores less so. Ceiling effects may cause attenuation of trends for measures of self-confidence and sociability.

Content of the Personality Test. The test contains between 18 and 33 items for each of the eight personality traits. Altogether there are 218 statements with a response scale of yes/no. The scores are formed by summing up the number of statements with which a person agrees (or, in the case of reverse-coded statements, disagrees). We observe the raw scores but not individual items. Internal reliability varies between 0.6 and 0.9 by trait; the average Cronbach's alpha is 0.75 (47). Self-confidence measures the person’s self-esteem and beliefs about his abilities (32 items; e.g., whether the person feels as good and able as others and can meet other people’s expectations). Sociability measures the person’s level of gregariousness and preference for socializing with others (33 items; e.g., whether the person likes to host parties and not withdraw from social events). Leadership motivation measures how much the person prefers to take charge in groups and influence other people; it includes 30 items.
Activity–energy measures how much the person exerts physical effort in everyday activities and how quickly the person prefers to execute activities (28 items; e.g., whether the person tends to work fast and vigorously and prefers fast-paced work). Achievement striving, dutifulness, and deliberation all represent personality traits that are related to the higher order personality factor conscientiousness. Achievement striving measures how strongly the person wants to perform well and achieve important life goals (24 items; e.g., whether the person is prepared to make personal sacrifices to achieve success). Dutifulness measures how closely the person follows social norms and considers them to be important (18 items; e.g., whether the person would return money if given back too much change at a store). Deliberation measures how much the person prefers to think ahead and plan things before acting (26 items; e.g., whether the person prefers to spend money carefully). Masculinity measures the person’s occupational and recreational interests that are traditionally considered masculine (27 items; e.g., whether the person would like to work as a construction manager).

The FDF questionnaire also includes questions about mental health and questions assessing the validity of the answers. These include four mental health subscales from the Minnesota Multiphasic Personality Inventory (MMPI) but not other measures of normal personality. Of these variables we use only the lie score, which measures socially desirable responding, that is, attempts to give an overly favorable impression of one’s conduct. SI Appendix, Table S12 shows that trends in test scores cannot be attributed to changes in response validity as measured by the lie score.

Content of the Cognitive Ability Test. Cognitive ability is measured with subtests of verbal, arithmetic, and visuospatial reasoning. Each subtest is composed of 40 multiple-choice questions in order of increasing difficulty.
The test–retest reliabilities of the subtests vary between 0.76 and 0.88 (47). Verbal reasoning involves choosing synonyms or antonyms of a given word, selecting a word that belongs to the same category as a given word pair, choosing which word on a list does not belong in the group, and choosing similar relationships between two word pairs. Arithmetic reasoning involves completing a series of numbers that follow a certain pattern, solving short verbal problems, computing simple arithmetic operations, and choosing similar relationships between two pairs of numbers. The visuospatial reasoning task is a set of matrices containing a pattern problem with one removed part, and the participant needs to decide which of the given alternative figures completes the matrix; it is similar to Raven’s Progressive Matrices (48).

Register Data. We use register data on the Finnish population compiled by Statistics Finland to obtain adult outcomes and background variables. These data provide information on basic demographics, family situation, living conditions, educational attainment, labor market status, and earnings of all Finnish residents. This information was linked to test scores by Statistics Finland using personal identification numbers and deidentified before being made available to researchers. Income data are from the Finnish Tax Authority. We measure earnings as the average annual earnings during ages 30–34, where “earnings” is the sum of labor market income and entrepreneurial income; we do not drop zeros. We deflate all values to 2010 euros using the Statistics Finland CPI. In SI Appendix, we also use alternative income measures derived from the same data. Information about the identity of parents and brothers comes from the Finnish Population Register. Childhood municipality of residence comes from the Population Censuses of 1970, 1975, and 1980. We define childhood municipality as the municipality of residence in the first census after the year of birth.
We drop those who are not observed at that point as they are likely to be foreign-born. We use Statistics Finland’s Statistical Grouping of Municipalities to divide municipalities into urban, semiurban, and rural. We define sibship size as the number of children with the same biological mother. Data on educational attainment are from the Register of Completed Education and Degrees maintained by Statistics Finland. These data contain information on the highest educational qualification that the individual has obtained and the date at which the individual received the qualification. We use them to obtain parents’ level of education and the eventual level of education for the conscripts.

Permission to use the register data was granted by Statistics Finland (license TK-53-228-14) and by the FDF (AJ23378). Personal data were processed following the regulations in Personal Data Act 523/1999 and the guidelines of the Finnish Advisory Board on Research Integrity. The use of administrative data in scientific research does not require explicit consent from the subjects in Finland.
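The earnings measure described under Register Data (average annual earnings at ages 30–34, zeros retained, deflated to 2010 euros) can be sketched as follows. This is only an illustration under stated assumptions: age is taken as calendar year minus birth year, and the function name, the CPI values, and the earnings figures are hypothetical placeholders rather than Statistics Finland or Tax Authority data.

```python
from statistics import mean


def average_real_earnings(nominal_by_year, birth_year, cpi, base_year=2010):
    """Average annual earnings at ages 30-34, in base-year euros.

    nominal_by_year: {calendar year: labor income + entrepreneurial income}
    cpi: {calendar year: consumer price index}, all on one common base
    """
    real = []
    for age in range(30, 35):  # ages 30, 31, 32, 33, 34
        year = birth_year + age  # assumption: age = calendar year - birth year
        nominal = nominal_by_year[year]  # zeros are kept, not dropped
        real.append(nominal * cpi[base_year] / cpi[year])
    return mean(real)


# Placeholder index values and earnings (hypothetical, for illustration only)
toy_cpi = {2000: 80.0, 2001: 80.0, 2002: 100.0,
           2003: 100.0, 2004: 100.0, 2010: 100.0}
toy_earnings = {2000: 20000.0, 2001: 0.0, 2002: 30000.0,
                2003: 30000.0, 2004: 40000.0}

result = average_real_earnings(toy_earnings, birth_year=1970, cpi=toy_cpi)
print(result)
```

In this sketch a year with zero earnings lowers the five-year average instead of being excluded, which mirrors the statement that zeros are not dropped. How years with no register record are handled is not described in the text, so the sketch simply requires all five years to be present.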

Acknowledgments We thank Kai Nyman and Kari Laitinen at the FDF for help with access to data and with interpreting test scores and Annaliina Kotilainen for excellent research assistance. T.P., M.S., and R.U. were supported by Strategic Research Council at the Academy of Finland Grants 293445 and 303686. M.T. was supported by European Research Council Grant ERC-240970.

Footnotes Author contributions: M.J., T.P., M.S., M.T., and R.U. designed research, performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1609994114/-/DCSupplemental.