Trial Overview

The trial protocol, which was published previously11 and is available with the full text of this article at NEJM.org, was approved by the relevant ethics committees and regulatory authorities in all the countries involved in the trial. Participants provided written informed consent.

The trial was conducted in accordance with the principles of the Declaration of Helsinki12 and Good Clinical Practice guidelines.13 The Robertson Centre for Biostatistics at the University of Glasgow was the trial data and biostatistics center.

The European Union FP7 provided primary financial support for the conduct of the trial. Supplies of levothyroxine and matching placebo were provided free of charge by Merck (Darmstadt, Germany). The funder, the trial sponsors (NHS Greater Glasgow and Clyde Health Board and University of Glasgow, United Kingdom; University College Cork, Ireland; Leiden University Medical Center, the Netherlands; and University of Bern and Bern University Hospital, Switzerland), and Merck played no role in the design, analysis, or reporting of the trial. The main sponsor (NHS Greater Glasgow and Clyde Health Board) contributed to the writing of the protocol. None of the sponsors had any involvement in the analysis or the reporting of the results. The authors vouch for the accuracy and completeness of the data and analyses reported and for the fidelity of the trial to the protocol.

Participants

Participants were identified from clinical laboratory and general practice databases and records. The inclusion criteria were an age of 65 years or more and persistent subclinical hypothyroidism, defined as an elevated thyrotropin level (4.60 to 19.99 mIU per liter) that was measured on at least two occasions that were 3 months to 3 years apart, with the free thyroxine level within the reference range. The main exclusion criteria for the trial were a current prescription for levothyroxine, antithyroid drugs, amiodarone, or lithium; thyroid surgery or receipt of radioactive iodine within the previous 12 months; dementia; hospitalization for a major illness or an elective surgery within the previous 4 weeks; an acute coronary syndrome (including myocardial infarction or unstable angina) within the previous 4 weeks; and terminal illness.11
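The age and thyrotropin inclusion criteria above can be expressed as a simple screening check. The following is a minimal sketch, not the trial's screening software: the names `ThyroidTest` and `meets_inclusion` are hypothetical, "3 months" and "3 years" are approximated as 91 and 1095 days, and the free thyroxine requirement is assumed to apply to each qualifying measurement. Exclusion criteria are not modeled.

```python
from dataclasses import dataclass
from datetime import date

# Thyrotropin window from the inclusion criteria (mIU per liter).
TSH_MIN, TSH_MAX = 4.60, 19.99
# Assumptions: calendar intervals approximated in days.
MIN_GAP_DAYS = 91        # about 3 months
MAX_GAP_DAYS = 3 * 365   # about 3 years


@dataclass
class ThyroidTest:
    """One laboratory measurement (hypothetical record layout)."""
    taken_on: date
    tsh: float           # thyrotropin, mIU per liter
    ft4_in_range: bool   # free thyroxine within the reference range


def is_qualifying(test: ThyroidTest) -> bool:
    """Elevated thyrotropin within the window, with normal free thyroxine."""
    return TSH_MIN <= test.tsh <= TSH_MAX and test.ft4_in_range


def meets_inclusion(age_years: int, tests: list[ThyroidTest]) -> bool:
    """Age of 65 or more and two qualifying tests 3 months to 3 years apart."""
    if age_years < 65:
        return False
    qualifying = sorted(
        (t for t in tests if is_qualifying(t)), key=lambda t: t.taken_on
    )
    for i, first in enumerate(qualifying):
        for second in qualifying[i + 1:]:
            gap = (second.taken_on - first.taken_on).days
            if MIN_GAP_DAYS <= gap <= MAX_GAP_DAYS:
                return True
    return False
```

This sketch only looks for one pair of qualifying measurements; it does not check whether intervening measurements were normal.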

Trial Design and Regimen

We conducted a randomized, double-blind, parallel-group trial of levothyroxine versus placebo. Patients underwent randomization in a 1:1 ratio, with stratification according to country, sex, and starting dose, with the use of randomly permuted blocks.
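Stratified randomization with randomly permuted blocks can be illustrated as follows. This is a sketch under stated assumptions rather than the trial's allocation system: the block sizes (2 and 4) and the class name `StratifiedBlockRandomizer` are hypothetical, since the block sizes used in the trial are not given here.

```python
import random
from collections import defaultdict


class StratifiedBlockRandomizer:
    """1:1 allocation using permuted blocks within each stratum."""

    def __init__(self, block_sizes=(2, 4), seed=None):
        # Assumption: block sizes are illustrative, not taken from the trial.
        self._rng = random.Random(seed)
        self._block_sizes = block_sizes
        # One queue of not-yet-used assignments per stratum.
        self._pending = defaultdict(list)

    def assign(self, country: str, sex: str, starting_dose_ug: int) -> str:
        stratum = (country, sex, starting_dose_ug)
        queue = self._pending[stratum]
        if not queue:
            # Start a new block: equal numbers of each arm, shuffled.
            size = self._rng.choice(self._block_sizes)
            block = ["levothyroxine", "placebo"] * (size // 2)
            self._rng.shuffle(block)
            queue.extend(block)
        return queue.pop()


randomizer = StratifiedBlockRandomizer(seed=2017)
print(randomizer.assign("Netherlands", "F", 50))
```

Because every completed block contains equal numbers of each arm, the two groups stay balanced within each combination of country, sex, and starting dose.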

The active intervention started with levothyroxine at a dose of 50 μg daily (or 25 μg daily in patients with a body weight of <50 kg or with known coronary heart disease [previous myocardial infarction or symptoms of angina pectoris]) or matching placebo. The dose in the levothyroxine group was adjusted with the aim of achieving a thyrotropin level within the reference range (0.40 to 4.59 mIU per liter). Details regarding how the dose was adjusted and the mock adjustment in the placebo group are provided in the Supplementary Appendix, available at NEJM.org. All dose adjustments were generated and executed by computer, without the intervention of a physician. The participants, investigators, and treating physicians were unaware of the results of thyrotropin measurements throughout the course of the trial.
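The starting-dose rule described above reduces to a single condition. The sketch below covers only the starting dose; the function name `starting_dose_ug` is hypothetical, and the subsequent dose-adjustment algorithm (described in the Supplementary Appendix) is deliberately not reproduced.

```python
def starting_dose_ug(weight_kg: float, known_coronary_heart_disease: bool) -> int:
    """Return the starting daily dose in micrograms.

    25 ug for a body weight below 50 kg or known coronary heart disease
    (previous myocardial infarction or symptoms of angina pectoris),
    otherwise 50 ug.
    """
    if weight_kg < 50 or known_coronary_heart_disease:
        return 25
    return 50
```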

Procedures and Outcomes

The two primary outcomes for the trial were the change from baseline to 12 months in the Thyroid-Related Quality-of-Life Patient-Reported Outcome measure (ThyPRO) Hypothyroid Symptoms score (4 items) and Tiredness score (7 items); each scale ranges from 0 to 100, with higher scores indicating more symptoms and tiredness, respectively.14 A recent systematic review recommended ThyPRO as the preferred measurement tool for the assessment of health-related quality of life in patients with benign thyroid disease.15 The ThyPRO and other instruments were administered in English, French, German, or Dutch as appropriate. We had initially planned for cardiovascular events and thyroid-specific quality of life to be the two primary outcomes. However, this plan was modified during the trial to thyroid-specific quality-of-life scores as the two primary outcomes and cardiovascular events as a secondary outcome when it became apparent that the trial would be underpowered for cardiovascular events owing to delays and difficulties in recruitment.11

The secondary outcomes included changes from baseline in generic health-related quality of life (as assessed by the EuroQoL [EQ] Group 5-Dimension Self-Report Questionnaire [EQ-5D]; scores on the EQ-5D descriptive index range from −0.59 to 1.00, and scores on the EQ visual-analogue scale range from 0 to 100, with higher scores indicating better quality of life),16 comprehensive thyroid-related quality of life (as assessed by the ThyPRO-39 score, a shorter version of the ThyPRO measure,17 at final follow-up only), hand-grip strength (as assessed by means of the Jamar isometric dynamometer, with the recorded score as the best of three measures in the dominant hand),18 executive cognitive function (as assessed with the letter–digit coding test, which indicates the speed of processing according to the number of correct responses in matching nine letters with nine digits in 90 seconds; minimum score, 0, with higher scores indicating better executive cognitive function; there is no maximum score),19 blood pressure (systolic and diastolic), weight, body-mass index, waist circumference, activities of daily living (as assessed by the Barthel Index of functional levels in activities of daily living, on a scale ranging from 0 to 20, with higher scores indicating better performance),20 the Instrumental Activities of Daily Living score (on a scale from 0 to 14, with higher scores indicating better performance in activities of daily living),21 and fatal and nonfatal cardiovascular events. The minimum follow-up was 1 year, and the maximum follow-up was 3 years.

Safety and Recording of Adverse Events

Adverse events were assessed, managed, recorded, reported, and analyzed in accordance with the Medicines for Human Use (Clinical Trials) Regulations 2004 (as amended). Adverse events of special interest included new atrial fibrillation, heart failure, fracture, and new diagnosis of osteoporosis. The score on the ThyPRO Hyperthyroid Symptoms scale was recorded as a measure of possible adverse effects (on a scale from 0 to 100, with higher scores indicating more symptoms; minimum clinically important difference has been estimated as 9 points).14

Statistical Analysis

The Hypothyroid Symptoms and Tiredness scores from the ThyPRO14 were the two primary outcomes, with the required P value for statistical significance split equally between the two tests (0.05/2=0.025 for each test). We assumed standard deviations for data at 1 year of 13.3 and 18.3 on the 100-point scales, respectively, after adjustment for baseline values. Under these assumptions, the trial had 80% power to detect a between-group difference (levothyroxine vs. placebo) of 3.0 points on the Hypothyroid Symptoms score and 4.1 points on the Tiredness score with our revised maximum expected number of recruited participants of 750, and differences of 3.5 points and 4.9 points, respectively, with our minimum expected number of 540 participants. Justification for these power calculations is provided in the trial protocol.11
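The detectable differences quoted above are consistent with a standard normal-approximation formula for comparing two means. The sketch below assumes a two-sided test at the 0.025 level, equal allocation to the two groups, and no allowance for dropout; these are assumptions made for illustration, and the protocol's own calculation may differ in detail. The function name `detectable_difference` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(sd: float, total_n: int,
                          alpha: float = 0.025, power: float = 0.80) -> float:
    """Smallest between-group difference detectable with the given power.

    Assumes a two-sided test, 1:1 allocation, and a normal approximation:
    delta = (z_{1 - alpha/2} + z_{power}) * sd * sqrt(2 / n_per_group)
    """
    n_per_group = total_n / 2
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_group)


for total_n in (750, 540):
    symptoms = detectable_difference(13.3, total_n)   # Hypothyroid Symptoms
    tiredness = detectable_difference(18.3, total_n)  # Tiredness
    print(total_n, round(symptoms, 1), round(tiredness, 1))
# prints:
# 750 3.0 4.1
# 540 3.5 4.9
```

Under these assumptions the formula reproduces the four reported values to one decimal place.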

Continuous efficacy outcomes involving measurements at baseline and follow-up were compared between the two trial groups at each time point, with adjustment for stratification variables (country, sex, and starting dose of levothyroxine) and the baseline level of the same variable, with the use of multivariate linear regression (see the Supplementary Appendix). The efficacy and safety analyses were carried out in a modified intention-to-treat population, which included participants with data on the outcome of interest. Patients who discontinued the trial regimen continued to be followed for the modified intention-to-treat analysis. These analyses were supported by sensitivity analyses that used mixed-effects models and multiple imputation for missing data. The primary and secondary outcomes at 12 months were also analyzed in prespecified subgroups according to sex and baseline thyrotropin level.11 Analyses were repeated in the per-protocol population, which included participants who continued to take the trial regimen per the trial protocol.