Study Setting and Participants

A multiple-choice survey was administered immediately after the January 2018 American Board of Surgery In-Training Examination (ABSITE), an annual computer-based examination taken by all residents training in general surgery programs accredited by the Accreditation Council for Graduate Medical Education (ACGME).17,18 The survey is provided in the Supplementary Appendix, available with the full text of this article at NEJM.org. The survey was preceded by a statement explaining that its purpose was research, that data would be deidentified before analysis, and that program directors and chairs would not have access to the responses. There were no incentives or disincentives to participate.18-20

Survey responses were collected by the American Board of Surgery and were deidentified before being transferred to Northwestern University for analysis.18,20 Excluded from all analyses were 837 residents who were clinically inactive (i.e., were taking dedicated time off for conducting research), 2 residents who were training in one program that averaged fewer than 1 resident per postgraduate year, and 4 residents whose surveys were missing responses to the burnout questions. Two programs that had no female residents were excluded from program-level analyses. The Northwestern University institutional review board office reviewed this study, including the survey and instructions to residents, and determined that it did not meet the federal definition of human-subjects research and therefore did not require full review and approval by the institutional review board.

Survey Development

The 2018 survey items were adapted from previously published and validated instruments.2,18,20-22 Pretest cognitive interviews were conducted with general surgery residents from multiple institutions to assess the overall coherence, balance, and clarity of the survey. The survey was then iteratively revised and retested in a larger sample of general surgery residents from multiple institutions.18,20

Mistreatment Exposures

Respondents were asked to report how frequently (never, a few times a year, a few times a month, a few times a week, or daily), since the beginning of their residencies, they had been subjected to each of the following: discrimination based on their self-identified gender; racial discrimination; discrimination based on past, present, or expected pregnancy, childcare needs, or both; sexual harassment; physical abuse; and verbal or emotional abuse. No definitions of these exposures were provided. Residents who reported any exposure (i.e., any response other than never) were then asked to identify the primary source of the mistreatment: patients or patients’ families, attending surgeons, other residents or fellows, administrators, or nurses or support staff.

Mistreatment was categorized in several ways. Because perceived abuse, discrimination, and harassment were highly correlated with one another, we constructed a single composite indicator for primary comparisons. The composite represents the maximum reported frequency of any of the mistreatment exposures (discrimination on the basis of gender, race, or pregnancy or childcare; physical or verbal abuse; and sexual harassment). Residents were then categorized by frequency of exposure to mistreatment: no exposure, exposures a few times per year, or exposures a few times or more per month. Each type of exposure was also dichotomized (never vs. any) and modeled individually.
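The composite indicator described above can be illustrated with a minimal sketch. The numeric codes, dictionary keys, and function names below are assumptions introduced for illustration; they are not taken from the survey instrument or from the analysis code.

```python
# Ordinal codes for the five response options (the numeric coding is an assumption).
FREQ = {
    "never": 0,
    "a few times a year": 1,
    "a few times a month": 2,
    "a few times a week": 3,
    "daily": 4,
}


def composite_mistreatment(responses):
    """Map a dict of exposure -> response onto the three-level composite category.

    The composite is the maximum reported frequency across all exposures.
    """
    worst = max(FREQ[answer] for answer in responses.values())
    if worst == 0:
        return "no exposure"
    if worst == 1:
        return "a few times per year"
    return "a few times or more per month"


def any_exposure(response):
    """Dichotomize a single exposure as never (False) vs. any (True)."""
    return FREQ[response] > 0


# Hypothetical resident: the labels are illustrative, not real survey data.
resident = {
    "gender discrimination": "a few times a year",
    "racial discrimination": "never",
    "pregnancy or childcare discrimination": "never",
    "sexual harassment": "never",
    "physical abuse": "never",
    "verbal or emotional abuse": "a few times a month",
}
print(composite_mistreatment(resident))
```

Taking the maximum means that a resident is classified by the most frequent exposure reported, regardless of how many different types of mistreatment were reported.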

Main Outcome Measures

Symptoms of burnout were assessed with the use of the modified, abbreviated Maslach Burnout Inventory–Human Services Survey for Medical Personnel (aMBI), which examines emotional exhaustion and depersonalization with three questions each.23,24 To facilitate interpretation and presentation of the data, residents were divided into those who reported at least weekly occurrence of any of the six items in the aMBI and those who reported that symptoms occurred less than once a week.5 Sensitivity analyses were performed with other burnout definitions.25

Suicidal thoughts were assessed with the question, “During the past 12 months, have you had thoughts of taking your own life?”2,26,27 Residents who responded in the affirmative during the online survey were immediately provided with information on the screen urging them to reach out to their program directors, make use of online resources, or contact the National Suicide Prevention Lifeline. No active outreach was possible because all data were deidentified and confidentiality had been assured as a precondition of survey completion.

Resident and Program Characteristics

We obtained information on the following characteristics of the residents: gender, clinical postgraduate year (PGY, categorized as 1, 2–3, or 4–5), and relationship status (married or in a relationship, not in a relationship, or divorced or widowed). Program characteristics for which we obtained information included size (total number of surgical residents, divided into quartiles: <26, 26 to 37, 38 to 51, or >51 residents per program), type (academic, community, or military), and geographic location (Northeast, Southeast, Midwest, Southwest, or West). Residents were also asked to report the number of months during which they had violated the 80-hours-per-week (averaged over a month) duty-hour requirement in the previous 6 months (0, 1 or 2, or ≥3).
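The category boundaries for program size and duty-hour violations can be written out explicitly. The following is a minimal sketch; the function names and returned labels are assumptions for illustration, and only the cut points come from the text above.

```python
def size_quartile(n_residents):
    """Assign a program to a size quartile using the reported cut points."""
    if n_residents < 26:
        return "<26"
    if n_residents <= 37:
        return "26 to 37"
    if n_residents <= 51:
        return "38 to 51"
    return ">51"


def duty_hour_category(months_violated):
    """Categorize the number of months (of the previous 6) with a duty-hour violation."""
    if not 0 <= months_violated <= 6:
        raise ValueError("months_violated must be between 0 and 6")
    if months_violated == 0:
        return "0"
    if months_violated <= 2:
        return "1 or 2"
    return "≥3"


print(size_quartile(40), duty_hour_category(1))
```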

Statistical Analysis

Multivariable logistic-regression models were used to examine the association of all available resident demographics (e.g., gender and marital status) and program characteristics (e.g., geographic location) with burnout and suicidal thoughts, both excluding and including mistreatment exposures (i.e., discrimination, abuse, and sexual harassment). The primary models examined the association of the composite mistreatment variable with burnout and with suicidal thoughts. Each mistreatment exposure variable was also modeled individually to examine associations with burnout and suicidal thoughts. All models were estimated with robust standard errors that accounted for clustering of residents within programs. Missing data were rare (<1%), and observations with missing data were excluded from the analyses. Effect modification of the mistreatment associations by gender was explored by serial addition of multiplicative interaction terms. Several sensitivity analyses were performed to assess the robustness of the results, including those that used different thresholds for mistreatment exposures and those that used different definitions of burnout (e.g., continuous and different dichotomizations).
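The structure of the primary model can be written out as a sketch. All symbols below are notation introduced here for illustration; the choice of "no exposure" as the reference category is an assumption; and the variance expression is the standard cluster-robust (sandwich) form, shown without the finite-sample correction that statistical software may apply.

```latex
% Resident i in program j; Y_{ij} = 1 if the outcome (burnout or suicidal thoughts) is reported.
% M^{(1)}_{ij}, M^{(2)}_{ij}: indicators for composite mistreatment "a few times per year" and
% "a few times or more per month" (assumed reference category: no exposure).
% G_{ij}: gender indicator; x_{ij}: remaining resident and program covariates.
\begin{align}
\operatorname{logit}\Pr(Y_{ij}=1)
  &= \beta_0 + \beta_1 M^{(1)}_{ij} + \beta_2 M^{(2)}_{ij}
     + \alpha\, G_{ij} + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij} \\
% Effect modification by gender: add multiplicative interaction terms.
\operatorname{logit}\Pr(Y_{ij}=1)
  &= \beta_0 + \beta_1 M^{(1)}_{ij} + \beta_2 M^{(2)}_{ij}
     + \alpha\, G_{ij} + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij}
     + \delta_1 M^{(1)}_{ij} G_{ij} + \delta_2 M^{(2)}_{ij} G_{ij} \\
% Cluster-robust variance, with z_{ij} the full design vector, \hat{p}_{ij} the fitted
% probability, and programs j as clusters.
\widehat{\mathbf{V}}
  &= \mathbf{A}^{-1}\Bigl(\sum_{j}\mathbf{s}_j\mathbf{s}_j^{\top}\Bigr)\mathbf{A}^{-1},
\quad
\mathbf{A} = \sum_{j}\sum_{i}\hat{p}_{ij}\,(1-\hat{p}_{ij})\,\mathbf{z}_{ij}\mathbf{z}_{ij}^{\top},
\quad
\mathbf{s}_j = \sum_{i}\,(Y_{ij}-\hat{p}_{ij})\,\mathbf{z}_{ij}
\end{align}
```

In this notation, exp(β1) and exp(β2) are the adjusted odds ratios for the two mistreatment levels relative to no exposure, and nonzero δ1 or δ2 would indicate that the association differs by gender.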

Program-level values were calculated as the percentage of residents in each program who reported gender discrimination, racial discrimination, verbal or emotional abuse, physical abuse, sexual harassment, and duty-hour violations. The extent to which different mistreatment exposures occurred concurrently at the program level (e.g., whether programs with high rates of gender discrimination also had high rates of sexual harassment) was examined with weighted kappa statistics.
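These two steps can be sketched as follows. The text does not specify how program-level percentages were grouped into ordinal categories before the kappa statistics were computed, so the sketch takes already-categorized values as input; the linear weighting scheme, the function names, and the example data are likewise assumptions for illustration.

```python
from collections import defaultdict


def program_rates(records):
    """Compute the percentage of residents per program who reported an exposure.

    records: iterable of (program_id, reported) pairs, one per resident.
    """
    counts = defaultdict(lambda: [0, 0])  # program -> [n_reported, n_total]
    for program, reported in records:
        counts[program][0] += int(bool(reported))
        counts[program][1] += 1
    return {p: 100.0 * yes / total for p, (yes, total) in counts.items()}


def weighted_kappa(a, b, k):
    """Linearly weighted Cohen's kappa for two ordinal codings with categories 0..k-1.

    a and b hold one category per program (e.g., the rate category for two exposures).
    """
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("a and b must be non-empty and of equal length")
    if k < 2:
        raise ValueError("at least two categories are required")
    row = [0.0] * k  # marginal proportions for coding a
    col = [0.0] * k  # marginal proportions for coding b
    observed = 0.0   # observed weighted disagreement
    for x, y in zip(a, b):
        row[x] += 1.0 / n
        col[y] += 1.0 / n
        observed += abs(x - y) / (k - 1) / n
    # Weighted disagreement expected by chance, from the marginal proportions.
    expected = sum(
        row[i] * col[j] * abs(i - j) / (k - 1) for i in range(k) for j in range(k)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when all values fall in one category")
    return 1.0 - observed / expected


# Hypothetical data: three residents in two programs.
print(program_rates([("A", True), ("A", False), ("B", True)]))
```

A kappa near 1 would indicate that programs ranked high on one exposure tend to rank high on the other, and a value near 0 would indicate agreement no better than chance.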

Point estimates are reported with confidence intervals, which have not been adjusted for multiple comparisons. All statistical analyses were performed with Stata software, version 14.1 (StataCorp). There was no prespecified statistical analysis plan, but an a priori hypothesis was specified at the time of survey development.