Many undergraduate medical education programs are redesigning their curricula and assessment methods to meet a changing practice landscape.1 Educators have largely focused on the first 3 years of the traditional 4-year undergraduate curriculum to address concerns about the cost of medical education and declining interest in primary care.2,3 Some medical schools have eliminated the fourth year entirely, while others believe it is critical to professional development and, in support, cite declining board certification performance.4–6 Residency program directors also raise concerns that interns lack self-reflective skills, leading to underdeveloped professionalism, weak medical knowledge, and lack of preparedness to manage medical emergencies.7,8 To address these issues, some advocate for a more rigorous undergraduate experience.9

At the same time, little attention has been paid to the composition and quality of experiences during the fourth year of medical school, which represents the last opportunity to expand clinical skills and knowledge before learners become residents.10,11 The subinternship experience, considered the cornerstone of the fourth year, lacks national standards for content and assessment.2,11 For many medical schools, the medical subinternship is the only requisite course in the fourth-year curriculum. At a majority of schools, the remainder of the year is largely unstructured, with students choosing from a variety of clinical and nonclinical electives.12

We sought to assess the impact of the fourth year on clinical performance during internal medicine internship. We examined 2 consecutive classes of interns to determine how their fourth-year experiences, including the number and intensity of courses, related to multisource assessments of their performance based on the Accreditation Council for Graduate Medical Education (ACGME) educational milestones.13

The program receives final medical school transcripts for most interns. One author (N.D.) deidentified the transcripts and assigned identification numbers to protect anonymity prior to coding. The transcripts were coded and entered into a REDCap (Research Electronic Data Capture, Vanderbilt University, Nashville, TN) database. The authors a priori defined intensive clinical courses as experiences with a higher order of clinical responsibility and knowledge than the average course; these included subinternships of any variety, intensive care, and surgical and emergency medicine rotations. Research, less relevant patient care specialties (pathology and radiology), didactic courses, and language courses were defined as nonintensive. The authors reviewed all categorizations; the 3 instances of difficulty interpreting transcript information were resolved by consensus.

The IMRP maintains assessments of all residents using New Innovations, a confidential online assessment tool. Evaluations are based on the ACGME's 6 competencies and milestones.13 After most clinical rotations, attendings, residents, fellows, medical students, and nurses evaluate interns using questionnaires specific to the rotation and evaluator.

Evaluations use a 5- or 9-point scale, depending on the type and specific question (see online supplemental material for sample evaluations). Because the overall distribution of scores showed a strong ceiling effect, with clustering at maximal values, we defined an “excellent score” as an 8 or 9 on the 9-point scale or a 5 on the 5-point scale. For robustness and ease of interpretation, we also defined a “poor score” on any individual item as 6 or less on the 9-point scale or 3 or less on the 5-point scale.

To account for the multiple questionnaires within intern and within rater, we performed all analyses using generalized estimating equations, with the individual item as the unit of analysis. We estimated relative risks (RRs) for the likelihood of the primary outcome, excellent scores, using binomial error structures, a log link, and an exchangeable correlation matrix, with hierarchical clustering by both intern and rater. For robustness, we similarly estimated odds ratios (ORs) for the likelihood of poor scores, using a logit rather than a log link. In all cases, we constructed both models that included only the number of intensive and nonintensive courses and models that further adjusted for the covariates outlined above.

We categorized the intensity of course loads in multiple complementary ways, examining the proportion of time spent in intensive or nonintensive activities as well as the number of courses of each type taken, with intensive and nonintensive course counts adjusted for each other. We examined individual course types as described above and treated the proportion and number of intensive courses as linear variables (tests of curvature using quadratic terms were not significant). We also present deciles of intensive course work for illustrative purposes.

Of 115 interns eligible for participation in the study, we obtained 83 medical school transcripts, of which 5 were not interpretable and were excluded. Demographics are summarized in table 1; 3 interns held additional degrees (PhD/MS). A summary of total completed fourth-year courses and a breakdown by course type are found in table 2. A total of 69 641 individual assessment points from 2350 completed evaluations were available, with a median of 30 evaluations per intern (range 19–56). Of these, 42 203 (61%) assessments met the criteria for excellent and 5724 (8%) for poor.

When examined continuously, the relative risk (RR) of an excellent score per intensive course was 1.05 (95% CI 1.03–1.07, P < .001), while the corresponding RR per nonintensive course was 0.99 (95% CI 0.98–1.00, P = .03); these RRs differed significantly (P < .001). When adjusted for demographics, the RR of an excellent score was 1.05 (95% CI 1.03–1.08, P < .001) per intensive course and 1.00 (95% CI 0.98–1.01, P = .40) per nonintensive course; these again differed significantly from each other (P < .001).

A second analysis, accounting for variable course lengths (median 4 weeks), assessed the relationship between the percentage of time spent in intensive courses and evaluations. The upper panel of the figure depicts the adjusted RR of obtaining an excellent score by decile of intensive course work, with the lowest decile (decile 1) as the referent. To determine whether the positive association of intensive course work with performance was driven by any individual component, we determined the adjusted RR of excellent evaluations by individual course type. No single type of intensive course work accounted for our findings (table 3).

The positive influence of intensive course work was seen in all competencies except professionalism, as well as in our global assessment measure, which is independent of the ACGME milestones (table 4).

To determine the robustness of these associations, we performed a sensitivity analysis using poor scores. The unadjusted OR of a poor score per intensive course was 0.92 (95% CI 0.84–1.01, P = .07), whereas the OR per nonintensive course was 1.04 (95% CI 1.00–1.08, P = .04). These 2 ORs differed significantly from each other (P = .02); the differences were similar but no longer statistically significant after adjustment for demographics (P = .12). Similarly, there was a persistent decrease in the OR of a poor evaluation with increasing time spent in intensive course work (P < .001; figure).

In an additional sensitivity analysis of the 2350 evaluations, 532 (23%) were uniformly excellent. The RR of such an evaluation per additional intensive course was 1.13 (95% CI 1.05–1.21, P = .001); the corresponding RR per nonintensive course was 0.97 (95% CI 0.94–1.01, P = .01). These 2 estimates differed significantly from each other (P < .001).
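For a rough sense of the cumulative effect implied by the per-course RRs reported above, note that under a multiplicative risk model the per-course RRs compound across courses. This is a back-of-the-envelope illustration with a hypothetical course count, not an analysis reported in the study:

```python
# Hypothetical worked example using the adjusted RR reported above.
rr_per_intensive_course = 1.05   # adjusted RR of an excellent score per intensive course
n_courses = 5                    # hypothetical number of additional intensive courses

# Under a multiplicative risk model, per-course RRs compound:
cumulative_rr = rr_per_intensive_course ** n_courses
print(round(cumulative_rr, 3))   # 1.276, i.e., ~28% higher likelihood of an excellent score
```

This compounding assumption is consistent with treating the number of intensive courses as a linear term on the log-risk scale, as in the models above.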

Discussion

In this study of medical interns in 1 large residency program, the quantity and intensity of medical school courses taken in the fourth year had a small but significant, dose-dependent association with clinical performance during internship. This effect was seen across all ACGME competencies except professionalism and persisted after correction for potential confounders. The association of intensive courses with better performance differed significantly from the corresponding association with nonintensive courses, which strengthens the plausibility that intensive course work has a measurable impact on intern performance.

The observed effect of intensive courses was robust, seen across all types of evaluations, most ACGME clinical competencies, and global performance, where the relationship was strongest. The 1 exception was professionalism. Other studies suggest a fundamental difference between professionalism and the other competencies, theorizing that it is more difficult to teach, correct, and change over time.14–17

Our program uses robust assessment tools based on the ACGME core competencies that incorporate input from evaluators ranging from medical students to attending physicians. We analyzed nearly 70 000 points of assessment, which afforded the power to detect subtle differences in the performance of high-performing medical interns. Our data support not only the argument that the fourth year should be maintained, but also that its clinical intensity should be strengthened to produce “clinically ready” graduates.

We do not believe that nonintensive course work, as we have defined it here, is without value; there is certainly benefit to research and nonclinical specialty exposure. In this regard, our performance measures may not capture the value of such courses, as we focused on intern clinical performance, which has the most direct relationship to course work, rather than on long-term success or satisfaction. Nonetheless, medical students could reasonably be advised to take intensive courses during their fourth year to improve their clinical performance during internship.

Our study has limitations. We studied a single academic residency program, and results may not generalize to other programs; however, the interns in this study came from 46 different medical schools, which enhances generalizability. Poor scores were rarely given for intern performance. We also observed a low median number of fourth-year courses taken by our intern classes; without comparable information from other programs, we cannot necessarily extrapolate our results to the incremental value of intensive courses when students take a larger number of courses.

Interns at BIDMC are academically talented, with high levels of AOA membership and above-average USMLE Step 1 scores.18 This had the expected consequence of a ceiling effect in evaluations, minimizing the variability of performance within the cohort. This tends to reduce our ability to detect differences among interns and may lead to an underestimate of the benefit.

Another limitation of observational studies like ours is the difficulty of inferring causality in the presence of confounding. Students with strong clinical backgrounds may disproportionately select demanding fourth-year programs. Although we controlled for several potential markers of performance (AOA membership, USMLE Step 1 score, and reputation of medical school), none of these factors, combined or individually, materially confounded our primary estimates of association. We were limited to these proxy markers of achievement: we did not have access to USMLE Step 2 scores, and the heterogeneity of medical school grading precluded the use of honors. More subjective constructs, such as medical student motivation, are not readily measured by any routinely used instrument. Ultimately, the only way to fully control for all forms of identified and unidentified bias would be to perform a randomized trial, which, in this setting, is unlikely to occur. Of note, our results do identify variables of potential importance for clinical training that would be helpful to program directors.