Trial Oversight

The NLST, a randomized trial of screening with the use of low-dose CT as compared with screening with the use of chest radiography, was a collaborative effort of the Lung Screening Study (LSS), administered by the NCI Division of Cancer Prevention, and the American College of Radiology Imaging Network (ACRIN), sponsored by the NCI Division of Cancer Treatment and Diagnosis, Cancer Imaging Program. Chest radiography was chosen as the screening method for the control group because radiographic screening was being compared with community care (care that a participant usually receives) in the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial (ClinicalTrials.gov number, NCT00002540) [11]. The NLST was approved by the institutional review board at each of the 33 participating medical institutions. The study was conducted in accordance with the protocol; both the protocol and the statistical analysis plan are available with the full text of this article at NEJM.org.

Participants

We enrolled participants from August 2002 through April 2004; screening took place from August 2002 through September 2007. Participants were followed for events that occurred through December 31, 2009 (Fig. 1 in the Supplementary Appendix, available at NEJM.org).

Eligible participants were between 55 and 74 years of age at the time of randomization, had a history of cigarette smoking of at least 30 pack-years, and, if former smokers, had quit within the previous 15 years. Persons who had previously received a diagnosis of lung cancer, had undergone chest CT within 18 months before enrollment, had hemoptysis, or had an unexplained weight loss of more than 6.8 kg (15 lb) in the preceding year were excluded. A total of 53,454 persons were enrolled; 26,722 were randomly assigned to screening with low-dose CT and 26,732 to screening with chest radiography. Previously published articles describing the NLST [10,12] reported an enrollment of 53,456 participants (26,723 in the low-dose CT group and 26,733 in the radiography group). The number of enrolled persons is now reduced by 2 owing to the discovery of the duplicate randomization of 2 participants.
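
The eligibility and exclusion criteria above can be read as a simple screening rule. The following Python sketch is illustrative only and is not taken from the trial protocol: the `Candidate` record, its field names, and the treatment of boundary values (for example, counting exactly 15 years since quitting as eligible) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical record; field names are assumptions, not trial variables."""
    age_years: int
    pack_years: float
    years_since_quit: Optional[float]  # None for current smokers
    prior_lung_cancer: bool = False
    chest_ct_within_18_months: bool = False
    hemoptysis: bool = False
    unexplained_weight_loss_kg: float = 0.0  # loss in the preceding year


def is_eligible(c: Candidate) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    if not 55 <= c.age_years <= 74:
        return False
    if c.pack_years < 30:
        return False
    # Former smokers must have quit within the previous 15 years
    # (inclusive boundary is an assumption).
    if c.years_since_quit is not None and c.years_since_quit > 15:
        return False
    if c.prior_lung_cancer or c.chest_ct_within_18_months or c.hemoptysis:
        return False
    # Excluded if unexplained weight loss exceeded 6.8 kg (15 lb).
    if c.unexplained_weight_loss_kg > 6.8:
        return False
    return True


# Enrollment arithmetic reported in the text.
assert 26_722 + 26_732 == 53_454
assert 53_456 - 2 == 53_454
```

For instance, `is_eligible(Candidate(age_years=60, pack_years=40, years_since_quit=None))` evaluates to `True`, whereas the same candidate at 54 years of age would be rejected.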

Participants were enrolled at 1 of the 10 LSS or 23 ACRIN centers. Before randomization, each participant provided written informed consent. After the participants underwent randomization, they completed a questionnaire that covered many topics, including demographic characteristics and smoking behavior. The ACRIN centers collected additional data for planned analyses of cost-effectiveness, quality of life, and smoking cessation. Participants at 15 ACRIN centers were also asked to provide serial blood, sputum, and urine specimens. Lung-cancer and other tissue specimens were obtained at both the ACRIN and LSS centers and were used to construct tissue microarrays. All biospecimens are available to researchers through a peer-review process.

Screening

Participants were invited to undergo three screenings (T0, T1, and T2) at 1-year intervals, with the first screening (T0) performed soon after the time of randomization. Participants in whom lung cancer was diagnosed were not offered subsequent screening tests. The number of lung-cancer screening tests that were performed outside the NLST was estimated through self-administered questionnaires that were mailed to a random subgroup of approximately 500 participants from LSS centers annually. Sample sizes were selected to yield a standard error of 0.025 for the estimate of the proportion of participants undergoing lung-cancer screening tests outside the NLST in each group. For participants from ACRIN centers, information on CT examinations or chest radiography performed outside the trial was obtained, but no data were gathered on whether the examinations were performed as screening tests.
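
As a rough illustration of how a target standard error relates to the number of completed questionnaires, the Python sketch below uses the usual binomial formula for the standard error of a proportion. The worst-case proportion of 0.5 is an assumption; the text does not state the proportion or the response rate that the investigators assumed.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


def responses_needed(target_se: float, p: float = 0.5) -> int:
    """Smallest n with standard error <= target_se.

    p = 0.5 is the worst case (largest variance) and is an assumption here.
    """
    n = p * (1 - p) / target_se ** 2
    # Small tolerance so floating-point noise does not push n up by one.
    return math.ceil(n - 1e-9)


n_required = responses_needed(0.025)            # completed responses per group
se_at_500 = proportion_standard_error(0.5, 500)
print(n_required, round(se_at_500, 4))
```

Under this worst-case assumption, 400 completed responses per group give a standard error of 0.025, so mailing to approximately 500 participants would leave some margin for nonresponse. That reading is an inference from the formula and is not stated in the source.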

All screening examinations were performed in accordance with a standard protocol, developed by medical physicists associated with the trial, that specified acceptable characteristics of the machine and acquisition variables [10,13,14]. All low-dose CT scans were acquired with the use of multidetector scanners with a minimum of four channels. The acquisition variables were chosen to reduce exposure to an average effective dose of 1.5 mSv. The average effective dose with diagnostic chest CT varies widely but is approximately 8 mSv [10,13,14]. Chest radiographs were obtained with the use of either screen-film radiography or digital equipment. All the machines used for screening met the technical standards of the American College of Radiology [10]. The use of new equipment was allowed after certification by medical physicists.

NLST radiologists and radiologic technologists were certified by appropriate agencies or boards and completed training in image acquisition; radiologists also completed training in image quality and standardized image interpretation. Images were interpreted first in isolation and then in comparison with available historical images and images from prior NLST screening examinations. The comparative interpretations were used to determine the outcome of the examination. Low-dose CT scans that revealed any noncalcified nodule measuring at least 4 mm in any diameter and radiographic images that revealed any noncalcified nodule or mass were classified as positive, “suspicious for” lung cancer. Other abnormalities such as adenopathy or effusion could be classified as a positive result as well. Abnormalities suggesting clinically significant conditions other than lung cancer also were noted, as were minor abnormalities. At the third round of screening (T2), abnormalities suspicious for lung cancer that were stable across the three rounds could, according to the protocol, be classified as minor abnormalities rather than positive results.
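
The positivity criteria described above can be summarized in a short Python sketch. It is a simplification for illustration: the function and parameter names are hypothetical, and discretionary judgments (other abnormalities that "could" be classified as positive, and the T2 stability rule) are reduced to boolean flags supplied by the reader of the image.

```python
from typing import Sequence


def ct_screen_is_positive(
    noncalcified_nodule_diameters_mm: Sequence[float],
    other_abnormality_judged_suspicious: bool = False,
    stable_across_three_rounds_at_t2: bool = False,
) -> bool:
    """Low-dose CT: positive if any noncalcified nodule is >= 4 mm in any diameter."""
    has_qualifying_nodule = any(d >= 4.0 for d in noncalcified_nodule_diameters_mm)
    suspicious = has_qualifying_nodule or other_abnormality_judged_suspicious
    # At T2, findings stable across all three rounds could be reclassified
    # as minor abnormalities; here the flag simply applies that option.
    if suspicious and stable_across_three_rounds_at_t2:
        return False
    return suspicious


def radiograph_is_positive(
    has_noncalcified_nodule_or_mass: bool,
    other_abnormality_judged_suspicious: bool = False,
) -> bool:
    """Chest radiography: positive if any noncalcified nodule or mass is seen."""
    return has_noncalcified_nodule_or_mass or other_abnormality_judged_suspicious
```

For example, `ct_screen_is_positive([3.0, 4.5])` is `True` because one nodule reaches the 4-mm threshold, whereas `ct_screen_is_positive([3.9])` is `False`.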

Results and recommendations from the interpreting radiologist were reported in writing to the participant and his or her health care provider within 4 weeks after the examination. Since there was no standardized, scientifically validated approach to the evaluation of nodules, trial radiologists developed guidelines for diagnostic follow-up, but no specific evaluation approach was mandated.

Medical-Record Abstraction

Medical records documenting diagnostic evaluation procedures and any associated complications were obtained for participants who had positive screening tests and for participants in whom lung cancer was diagnosed. Pathology and tumor-staging reports and records of operative procedures and initial treatment were also obtained for participants with lung cancer. Pathology reports were obtained for other reported cancers to exclude the possibility that such tumors represented lung metastases. Histologic features of the lung cancer were coded according to the International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3) [15], and the disease stage was determined according to the sixth edition of the Cancer Staging Manual of the American Joint Committee on Cancer [16]. At ACRIN sites, additional medical records were also obtained for a number of substudies, including studies of health care utilization and cost-effectiveness [10].

Vital Status

Participants completed a questionnaire regarding vital status either annually (LSS participants) or semiannually (ACRIN participants). The names and Social Security numbers of participants who were lost to follow-up were submitted to the National Death Index to ascertain probable vital status. Death certificates were obtained for participants who were known to have died. An end-point verification team determined whether the cause of death was lung cancer. Although a distinction was made between a death caused by lung cancer and a death that resulted from the diagnostic evaluation for or treatment of lung cancer, the deaths from the latter causes were counted as lung-cancer deaths in the primary end-point analysis. The members of the team were not aware of the group assignments (see Section 2 in the Supplementary Appendix).

Statistical Analysis

The primary analysis was a comparison of lung-cancer mortality between the two screening groups, according to the intention-to-screen principle. We estimated that the study would have 90% power to detect a 21% decrease in mortality from lung cancer in the low-dose CT group, as compared with the radiography group. Secondary analyses compared the rate of death from any cause and the incidence of lung cancer in the two groups.
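
To give a sense of scale for a design with 90% power to detect a 21% decrease in mortality, the Python sketch below applies a generic normal approximation for comparing two Poisson event counts with equal person-years of follow-up. It is not the trial's actual sample-size calculation: the two-sided significance level of 0.05 is an assumption not stated in the text, and adjustments that a screening trial would typically need (for example, noncompliance, contamination, and interim monitoring) are ignored.

```python
import math
from statistics import NormalDist


def required_control_events(
    rate_ratio: float,
    power: float = 0.90,
    alpha_two_sided: float = 0.05,  # assumed; not given in the text
) -> float:
    """Approximate control-group events needed to detect a given rate ratio.

    Uses Var(log RR) ~ 1/d_intervention + 1/d_control with equal person-years,
    so that d_intervention = rate_ratio * d_control.
    """
    if rate_ratio <= 0 or rate_ratio == 1:
        raise ValueError("rate_ratio must be positive and different from 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha_two_sided / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_factor = 1 + 1 / rate_ratio
    return (z_alpha + z_beta) ** 2 * variance_factor / math.log(rate_ratio) ** 2


control_events = required_control_events(rate_ratio=1 - 0.21)
intervention_events = 0.79 * control_events
print(round(control_events), round(intervention_events))
```

Under these simplified assumptions the calculation calls for several hundred lung-cancer deaths in each group, which helps explain why tens of thousands of participants and several years of follow-up were required.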

Event rates were defined as the ratio of the number of events to the person-years at risk for the event. For the incidence of lung cancer, person-years were measured from the time of randomization to the date of diagnosis of lung cancer, death, or censoring of data (whichever came first); for the rates of death, person-years were measured from the time of randomization to the date of death or censoring of data (whichever came first). The latest date for the censoring of data on incidence of lung cancer and on death from any cause was December 31, 2009; the latest date for the censoring of data on death from lung cancer for the purpose of the primary end-point analysis was January 15, 2009. The earlier censoring date for death from lung cancer was established to allow adequate time for the review process for deaths to be performed to the same, thorough extent in each group. We calculated the confidence intervals for incidence ratios assuming a Poisson distribution for the number of events and a normal distribution of the logarithm of the ratio, using asymptotic methods. We calculated the confidence intervals for mortality ratios with the weighted method that was used to monitor the primary end point of the trial [17], which allows for a varying rate ratio and is adjusted for the design. The number needed to screen to prevent one death from lung cancer was estimated as the reciprocal of the reduction in the absolute risk of death from lung cancer in one group as compared with the other, among participants who had at least one screening test. The analyses were performed with the use of SAS/STAT [18] and R [19] statistical packages.
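
The definitions in this paragraph translate into a few lines of arithmetic. The Python sketch below shows an event rate, an asymptotic confidence interval for a rate ratio (Poisson counts, normal approximation on the log scale), and the number needed to screen. It is a minimal illustration with hypothetical input values, not the trial's SAS or R code, and it does not reproduce the weighted method used for the mortality ratios.

```python
import math
from statistics import NormalDist


def event_rate(events: int, person_years: float) -> float:
    """Events per person-year at risk."""
    return events / person_years


def rate_ratio_ci(events_a, py_a, events_b, py_b, level=0.95):
    """Rate ratio (group A vs. group B) with a Wald interval on the log scale.

    Assumes Poisson-distributed counts, so Var(log ratio) ~ 1/events_a + 1/events_b.
    """
    ratio = event_rate(events_a, py_a) / event_rate(events_b, py_b)
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return ratio, ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)


def number_needed_to_screen(deaths_a, n_a, deaths_b, n_b):
    """Reciprocal of the absolute risk reduction of group A relative to group B."""
    absolute_risk_reduction = deaths_b / n_b - deaths_a / n_a
    if absolute_risk_reduction <= 0:
        raise ValueError("group A shows no reduction in risk")
    return 1 / absolute_risk_reduction


# Hypothetical counts chosen only to exercise the functions.
ratio, lower, upper = rate_ratio_ci(100, 1000.0, 200, 1000.0)
nns = number_needed_to_screen(10, 1000, 20, 1000)
print(f"ratio={ratio:.2f} CI=({lower:.2f}, {upper:.2f}) NNS={nns:.0f}")
```

The interval is symmetric on the log scale rather than on the ratio scale, which is why the upper limit lies farther from the point estimate than the lower limit.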

Interim analyses were performed to monitor the primary end point for efficacy and futility. The analyses involved the use of a weighted log-rank statistic, with weights increasing linearly from no weight at randomization to full weight at 4 years and thereafter. Efficacy and futility boundaries were built on the Lan–DeMets approach with an O'Brien–Fleming spending function [20]. Interim analyses were performed annually from 2006 through 2009 and semiannually in 2010.
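
Two ingredients of this monitoring plan are easy to sketch in Python: the linear ramp of log-rank weights and an O'Brien–Fleming-type alpha-spending function in the Lan–DeMets form. The sketch is illustrative only. The overall significance level of 0.05 is an assumption not given in the text, the function names are hypothetical, and the code does not compute the trial's actual boundaries or the weighted log-rank statistic itself.

```python
import math
from statistics import NormalDist


def linear_ramp_weight(years_since_randomization: float,
                       full_weight_at_years: float = 4.0) -> float:
    """Weight rising linearly from 0 at randomization to 1 at 4 years and after."""
    if years_since_randomization <= 0:
        return 0.0
    return min(years_since_randomization / full_weight_at_years, 1.0)


def obrien_fleming_alpha_spent(information_fraction: float,
                               alpha: float = 0.05) -> float:
    """Cumulative type I error spent by a given information fraction.

    Lan-DeMets O'Brien-Fleming-type spending function:
    alpha(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t))), for 0 < t <= 1.
    The default alpha is an assumption made for this example.
    """
    if not 0 < information_fraction <= 1:
        raise ValueError("information_fraction must be in (0, 1]")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / math.sqrt(information_fraction)))


for t in (0.25, 0.5, 0.75, 1.0):
    print(t, f"{obrien_fleming_alpha_spent(t):.5f}")
```

This spending function allots very little of the type I error to early looks and reserves most of it for the final analysis, so early stopping for efficacy requires strong evidence.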

An independent data and safety monitoring board met every 6 months and reviewed the accumulating data. On October 20, 2010, the board determined that a definitive result had been reached for the primary end point of the trial and recommended that the results be reported [21]. The board's decision took into consideration that the efficacy boundary for the primary end point had been crossed and that there was no evidence of unforeseen screening effects that warranted acting contrary to the trial's prespecified monitoring plan. The NCI director accepted the recommendation of the data and safety monitoring board, and the trial results were announced on November 4, 2010.