Study Oversight

This study was approved by the College of Medical, Veterinary, and Life Sciences Ethics Committee, University of Glasgow. The protocol, which has been previously published,21 and data-governance procedures were reviewed and approved by the National Health Service (NHS) Scotland Public Benefit and Privacy Panel for Health and Social Care. All data from health records were deidentified by an independent information analyst who worked in the electronic Data Research and Innovation Service of NHS Scotland; participant-level consent was not required. The results from this study were analyzed and reported in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines.22 The authors vouch for the accuracy and completeness of the data and for the fidelity of the study to the protocol.

Inclusion Criteria

We accessed electronic health records to obtain data on death certification and medications that are typically prescribed for the treatment of dementia in a cohort of former professional soccer players. Former professional soccer players were identified from databases of all Scottish professional soccer players23,24 that were compiled from the archives of the Scottish Soccer Museum and the individual league clubs. Available data included the full name and date of birth of the players and career information, including dates of first signing and retirement, number of match appearances, and player position. These databases were merged, and duplicate entries were deleted. Persons born before January 1, 1977, were eligible for inclusion in the study. All the former soccer players were male.

Probabilistic matching was applied to the full name and date of birth of former soccer players to link them to their unique Community Health Index numbers. Variables such as date of birth and full name were compared between former players and Community Health Index data sets, and a score was attached to each matched variable that reflected the level of agreement. The scores for each variable were summed to generate an overall score, which correlated with the likelihood of the two records belonging to the same person. An automated computer program was used to match former players with controls from the general population, in a 1:3 ratio, on the basis of sex, year of birth, and degree of social deprivation. To match the degree of social deprivation between cohorts, we used records from the NHS Information Services Division, which contain the last known postal code of residence for all persons. For each area of residence, data on social deprivation were derived with the use of postal code–level data on income, employment, health, education, housing, access to local amenities, and crime and were categorized into quintiles on the basis of the Scottish Index of Multiple Deprivation (SIMD), with the quintiles ranging from 1 (most deprived) to 5 (least deprived).25
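The two procedures described above, score-based record linkage and 1:3 matching of controls, can be illustrated with a minimal Python sketch. It is not the study's software: the weights, the linkage threshold, the string-similarity measure, the record fields (`name`, `dob`, `sex`, `birth_year`, `simd`, `id`), and the choice to sample controls without replacement are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical weights and threshold; the study does not report its scoring scheme.
NAME_WEIGHT = 10.0
DOB_WEIGHT = 8.0
LINK_THRESHOLD = 15.0


def name_agreement(a, b):
    """Similarity in [0, 1] between two full names, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def dob_agreement(a, b):
    """Fraction of the year, month, and day fields (ISO 'YYYY-MM-DD') that agree."""
    return sum(x == y for x, y in zip(a.split("-"), b.split("-"))) / 3


def linkage_score(player, record):
    """Sum the weighted per-variable agreement scores into one overall score."""
    return (NAME_WEIGHT * name_agreement(player["name"], record["name"])
            + DOB_WEIGHT * dob_agreement(player["dob"], record["dob"]))


def best_link(player, records, threshold=LINK_THRESHOLD):
    """Return the highest-scoring record, or None if no score reaches the threshold."""
    best = max(records, key=lambda r: linkage_score(player, r), default=None)
    if best is None or linkage_score(player, best) < threshold:
        return None
    return best


def match_controls(players, population, ratio=3, seed=0):
    """Draw `ratio` controls per player from people sharing sex, birth year, and SIMD quintile.

    Sampling is without replacement, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[(person["sex"], person["birth_year"], person["simd"])].append(person)
    for pool in strata.values():
        rng.shuffle(pool)
    matched = {}
    for player in players:
        pool = strata[(player["sex"], player["birth_year"], player["simd"])]
        if len(pool) < ratio:
            raise ValueError(f"too few eligible controls for player {player['id']}")
        matched[player["id"]] = [pool.pop() for _ in range(ratio)]
    return matched
```

In this sketch a higher overall score indicates a more likely link, and a player is left unlinked when no candidate reaches the threshold. Exact stratum matching keeps the control group identical to the players in sex, year of birth, and deprivation quintile.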

The Community Health Index number can be linked to death certificates and records in the national Prescribing Information System. Information recorded on death certificates includes the date of death and the primary and contributory causes of death, coded according to the International Classification of Diseases, 9th Revision and 10th Revision (ICD-9 and ICD-10). The Prescribing Information System documents every prescription dispensed in the community and has included complete Community Health Index data since 2009; medications are coded according to the British National Formulary.26 The ICD-9 and ICD-10 codes that were used to identify outcomes are provided in Table S1 in the Supplementary Appendix, available with the full text of this article at NEJM.org. These outcomes were death with neurodegenerative disease listed as the primary or a contributory cause (classified as all neurodegenerative diseases, dementia not otherwise specified, Alzheimer’s disease, non-Alzheimer’s dementias, motor neuron disease, and Parkinson’s disease) and death from the remaining most common causes in the Scottish adult male population: diseases of the circulatory system (classified as all diseases of the circulatory system, ischemic heart disease, and stroke or cerebrovascular disease), diseases of the respiratory system, and cancer (classified as any cancer or lung cancer). Because no ICD-9 or ICD-10 codes exist for CTE or its synonym, dementia pugilistica, these diagnoses were not identifiable from death certificate records. The medications included in the analysis of medications for dementia are listed in Section 4.11 of the British National Formulary and in Table S2. All analyses included data up to December 31, 2016, and the database interrogation was performed on December 10, 2018.
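The rule that a death counts toward an outcome when a qualifying code appears as either the primary or a contributory cause can be sketched as follows. The prefix table is a small illustrative subset of standard ICD-10 categories and is not the study's code list, which is given in Table S1; the function and variable names are hypothetical.

```python
# Illustrative ICD-10 prefixes only; the study's actual code list is in Table S1.
ILLUSTRATIVE_PREFIXES = {
    "G30": "Alzheimer's disease",
    "G20": "Parkinson's disease",
    "G122": "motor neuron disease",
    "F03": "dementia not otherwise specified",
}


def normalize(code):
    """Drop the decimal point and surrounding spaces so 'g30.9' becomes 'G309'."""
    return code.replace(".", "").strip().upper()


def neurodegenerative_categories(primary, contributory):
    """Return the categories found among the primary and contributory causes."""
    found = set()
    for code in [primary, *contributory]:
        normalized = normalize(code)
        for prefix, label in ILLUSTRATIVE_PREFIXES.items():
            if normalized.startswith(prefix):
                found.add(label)
    return found
```

Under these assumptions, a certificate with a cardiac primary cause and Alzheimer's disease as a contributory cause is still counted as a death with neurodegenerative disease, which mirrors the outcome definition in the text.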

Statistical Analysis

We used Cox proportional-hazards regression to model time to death and Schoenfeld residuals to test the assumption of proportional hazards.27 When the assumption of proportional hazards did not hold, a time-varying model was used to derive hazard ratios over different periods of follow-up.28 Age was used as the time covariate, with follow-up from age 40 years to the date of data censoring, which was either the date of death or the end of follow-up (December 31, 2016), whichever occurred first. In a sensitivity analysis, the outcome of death from neurodegenerative disease was subjected to a competing-risks regression analysis to ascertain whether the estimated hazard ratio was sensitive to the competing risks of death from ischemic heart disease and death from any cancer.29 We tested the assumption of proportional subhazards (hazard ratios adjusted for competing risks of death from ischemic heart disease and death from any cancer) by including an interaction term between time of analysis and a dummy variable for former soccer player status and assessing whether the interaction was significant (P<0.05). The mortality models were repeated in the analysis of the subgroups of former players defined according to player position (outfield player or goalkeeper). Because prescribing outcomes were available only from 2009 onward, we applied a nested case–control design in which conditional logistic regression was used to analyze the whole cohort (matched data)30 and standard logistic regression was used to analyze the subgroups of former players defined according to player position. All statistical analyses were performed with the use of Stata/MP software, version 14.1 (StataCorp). Two-sided P values of less than 0.05 were considered to indicate statistical significance.
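The follow-up window used in the mortality models (entry at age 40 years, exit at death or on December 31, 2016, whichever came first) can be made concrete with a short sketch. The function names, the 365.25-day year, and the decision to drop anyone who contributes no time at risk after age 40 are assumptions of this example, not details reported by the study, whose analyses were run in Stata.

```python
from datetime import date

END_OF_FOLLOW_UP = date(2016, 12, 31)
ENTRY_AGE_YEARS = 40


def age_in_years(birth, on):
    """Approximate age in years on a given date, using a 365.25-day year."""
    return (on - birth).days / 365.25


def follow_up(birth, death=None):
    """Return (entry_age, exit_age, died), or None if there is no time at risk.

    `died` is True only when the death falls on or before the end of follow-up;
    later deaths are treated as censored at the end of follow-up.
    """
    exit_date = min(death, END_OF_FOLLOW_UP) if death else END_OF_FOLLOW_UP
    exit_age = age_in_years(birth, exit_date)
    if exit_age <= ENTRY_AGE_YEARS:
        return None
    died = death is not None and death <= END_OF_FOLLOW_UP
    return (float(ENTRY_AGE_YEARS), exit_age, died)
```

Each returned triple corresponds to one row of delayed-entry survival data on the age scale, which is the form a Cox model with age as the time variable would consume.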