Key Points

Question What is the effect of heightened vigilance during unannounced hospital accreditation surveys on the quality and safety of inpatient care?

Findings In an observational analysis of 1984 unannounced hospital surveys by The Joint Commission, patients admitted during the week of a survey had significantly lower 30-day mortality than did patients admitted in the 3 weeks before or after the survey. This change was particularly pronounced among major teaching hospitals; no change in secondary safety outcomes was observed.

Meaning Changes in practice occurring during periods of surveyor observation may meaningfully improve quality of care.

Abstract

Importance In the United States, hospitals receive accreditation through unannounced on-site inspections (ie, surveys) by The Joint Commission (TJC), which are high-pressure periods to demonstrate compliance with best practices. No research has addressed whether the potential changes in behavior and heightened vigilance during a TJC survey are associated with changes in patient outcomes.

Objective To assess whether heightened vigilance during survey weeks is associated with improved patient outcomes compared with nonsurvey weeks, particularly in major teaching hospitals.

Design, Setting, and Participants Quasi-randomized analysis of Medicare admissions at 1984 surveyed hospitals from calendar year 2008 through 2012 in the period from 3 weeks before to 3 weeks after surveys. Outcomes between surveys and surrounding weeks were compared, adjusting for beneficiaries’ sociodemographic and clinical characteristics, with subanalyses for major teaching hospitals. Data analysis was conducted from January 1 to September 1, 2016.

Exposures Hospitalization during a TJC survey week vs nonsurvey weeks.

Main Outcomes and Measures The primary outcome was 30-day mortality. Secondary outcomes were rates of Clostridium difficile infections, in-hospital cardiac arrest mortality, and Patient Safety Indicators (PSI) 90 and PSI 4 measure events.

Results The study sample included 244 787 admissions during survey weeks and 1 462 339 admissions during nonsurvey weeks, with similar patient characteristics, reasons for admission, and in-hospital procedures across both groups. There were 811 598 (55.5%) women in the nonsurvey weeks (mean [SD] age, 72.84 [14.5] years) and 135 857 (55.5%) in the survey weeks (mean [SD] age, 72.76 [14.5] years). Overall, there was a significant reversible decrease in 30-day mortality for admissions during survey (7.03%) vs nonsurvey weeks (7.21%) (adjusted difference, −0.12%; 95% CI, −0.22% to −0.01%). This observed decrease was larger than 99.5% of mortality changes among 1000 random permutations of hospital survey date combinations, suggesting that observed mortality changes were not attributable to chance alone. Observed mortality reductions were largest in major teaching hospitals, where mortality fell from 6.41% to 5.93% during survey weeks (adjusted difference, −0.38%; 95% CI, −0.74% to −0.03%), a 5.9% relative decrease. We observed no significant differences in admission volume, length of stay, or secondary outcomes.

Conclusions and Relevance Patients admitted to hospitals during TJC survey weeks have significantly lower mortality than during nonsurvey weeks, particularly in major teaching hospitals. These results suggest that changes in practice occurring during periods of surveyor observation may meaningfully affect patient mortality.

Introduction

Medical error is a significant cause of preventable mortality in US hospitals.1 To ensure compliance with high standards for patient safety, The Joint Commission (TJC) performs unannounced on-site inspections (ie, surveys) at US hospitals every 18 to 36 months as an integral part of their accreditation process.2 During these week-long inspections, TJC surveyors closely observe a broad range of hospital operations, focusing on high-priority patient safety areas, such as environment of care, documentation, infection control, and medication management.3 The stakes for performance during a TJC survey are high: loss of accreditation or a citation in the review process can adversely affect a hospital’s reputation and presage public censure or closure.4-6 This possibility can be especially important for large academic medical centers, whose reputation provides significant financial leverage in their local market.7 Hospital staff are keenly aware of their behavior being observed and reflecting on their institution as a whole during TJC surveys.8 This pressure has created an entire category of staff training in many hospitals around “survey readiness.”9,10

The visible nature of these surveys puts hospitals into a state of high vigilance and activates survey readiness training, called a “code J” by one observer.8 This phenomenon closely resembles the Hawthorne effect, an observation in the social sciences that research participants change their behavior because of the awareness of being monitored.11 The Hawthorne effect has been well described in various health care settings, including antibiotic prescribing,12,13 hand hygiene,14 and outpatient process quality in low-resource settings,15 and has been cited in a TJC report as a significant barrier to accurate observation of hand hygiene practices.14 In addition to the Hawthorne effect, there is a robust literature in economics describing how audits or monitoring of employees can lead to improved performance.16,17 There is little doubt that, for many hospitals, the monitoring of staff by TJC surveyors motivates changes in staff behavior to reflect expectations of what hospitals want surveyors to see.8-10,18

To our knowledge, no research has addressed whether potential changes in behavior and heightened vigilance during a TJC survey are associated with changes in patient outcomes. Several mechanisms could plausibly create such an effect, including increased attention to infection control leading to reduced hospital-associated infections and heightened awareness of medication management leading to reduced adverse medication events. If the presence of TJC surveyors affected patient outcomes, it would imply that the survey-week scramble to improve staff compliance with surveyor expectations has a significant safety impact worth further exploration.

We examined the association of TJC survey visits with patient safety outcomes via the quasi-random assignment of admissions between unannounced survey weeks and nonsurvey weeks within hospitals. We focused on inpatient safety–related outcomes that could be plausibly affected by attention to survey-relevant aspects of inpatient care and assessed potential mechanisms for changes in these outcomes, such as increased physician staffing or changes in the composition of admissions or inpatient procedures performed. We hypothesized that the presence of TJC surveyors would temporarily improve safety outcomes relative to surrounding nonsurvey weeks, with a larger effect in major teaching hospitals, which may have greater resources and incentives to ensure high compliance with surveys.

Methods

Study Sample and Data Sources

To identify hospital admissions and measure outcomes, we used the 2008-2012 Medicare Provider Analysis and Review (MedPAR) files, which contain records for 100% of Medicare beneficiaries using hospital inpatient services, supplemented by annual Beneficiary Summary Files, which include demographics and chronic illness diagnoses.19 Information on hospital teaching status and geography was obtained from the 2011 American Hospital Association Annual Survey.20 We identified all admissions occurring during the business week of a TJC survey as well as all admissions occurring 3 weeks before and after each survey week, a time interval used in prior similar analyses.21 We included all hospitals surveyed by TJC with available historical survey dates, representing the majority of admissions from 2008-2012 (described below).22 We excluded admissions occurring on weekend days because TJC surveyors are not on site on these days, and we also excluded admissions from hospitals not surveyed by TJC. This study was approved by the institutional review board at Harvard Medical School.

Defining TJC Survey Weeks

We identified survey dates using publicly available data on the TJC Quality Check website, which lists survey dates going back to 2007.23 We extracted survey dates for 1984 general medical-surgical hospitals in the 2008-2012 MedPAR database, corresponding to 3417 TJC survey visits. These hospitals represent the location for 68% of all Medicare admissions in the 2008-2012 MedPAR cohort. To merge survey visit data with the MedPAR database, we used a crosswalk of TJC and Medicare hospital identifiers obtained from another publicly available TJC hospital quality database.24 We defined our primary exposure variable to be the 5 weekdays during the week of a full accreditation TJC survey date (survey weeks), as reported on the Quality Check website. We did not include TJC follow-up on-site surveys, which are smaller, focused assessments that typically occur from 30 to 180 days after the full survey to assess hospital responses to issues marked as problem areas by TJC surveyors.25

Study Outcomes and Covariates

Our primary outcome was 30-day mortality, defined as death within 30 days of admission. We chose this outcome because it integrates many aspects of care that might change during TJC observation, such as discharge planning and documentation, which are not relevant for outcomes such as in-hospital mortality. Moreover, this mortality time frame is used by Medicare in its reporting of hospital quality.

We also measured 4 additional secondary outcomes to capture possible effects of unannounced surveys on patient safety. The first 2 were composite measures of several patient safety indicators (PSIs), which are validated patient safety metrics developed by the Agency for Healthcare Research and Quality and used in public reporting.26 First, we used the PSI 90 composite measure utilized by Medicare in the Hospital Acquired Condition Reduction program, which measures 11 different individual patient safety events, including pressure ulcers and central line–associated bloodstream infection. Second, we examined the PSI 4 composite measure for failure to rescue from serious postsurgical complications. This measure assesses safety on the premise that a patient with a complication can be rescued if the complication is recognized early and treated effectively. The PSI 4 measure combines mortality rates for 5 potentially treatable complications in surgical inpatients, including sepsis and pulmonary embolism. The last 2 secondary outcomes were Clostridium difficile infection rates and in-hospital cardiac arrest mortality.27 We hypothesized that TJC surveyor presence could reduce the rates of the 4 abovementioned outcomes through more fastidious attention to infection control and adherence to protocols and best practices for inpatient care.

We ascertained incident cases of C difficile infection by identifying any facility admissions with International Classification of Diseases, Ninth Revision (ICD-9) code 008.45 in any field upon discharge. The sensitivity and specificity of using ICD-9 codes to identify C difficile infections have been reported elsewhere to be adequate for identifying overall C difficile burden for epidemiologic purposes.28,29 Finally, we noted in-hospital cardiac arrest mortality by identifying all admissions with a discharge destination of “expired” with any ICD-9 diagnosis code of 427.5.
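As a concrete illustration, these claims-based case definitions amount to simple filters over discharge records. The sketch below uses hypothetical field names (`dx_codes`, `discharge_status`), not actual MedPAR column names:

```python
# Illustrative claims-based outcome definitions; record layout and
# field names are hypothetical, not actual MedPAR fields.

def has_cdiff(admission):
    """Incident C difficile infection: ICD-9 code 008.45 in any
    diagnosis field upon discharge."""
    return "008.45" in admission["dx_codes"]

def cardiac_arrest_death(admission):
    """In-hospital cardiac arrest mortality: discharge destination
    'expired' plus any ICD-9 diagnosis code of 427.5."""
    return (admission["discharge_status"] == "expired"
            and "427.5" in admission["dx_codes"])

# Toy records for illustration only.
admissions = [
    {"dx_codes": ["428.0", "008.45"], "discharge_status": "home"},
    {"dx_codes": ["427.5", "410.71"], "discharge_status": "expired"},
    {"dx_codes": ["486"], "discharge_status": "home"},
]

cdiff_cases = [a for a in admissions if has_cdiff(a)]
arrest_deaths = [a for a in admissions if cardiac_arrest_death(a)]
```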

Patient covariates included age, sex, race/ethnicity, the Elixhauser comorbidity score for each admission (calculated as the number of the 29 Elixhauser comorbidities present during a hospital admission),30 and indicators for each of 11 chronic conditions obtained from the Chronic Conditions Data Warehouse research database (Table 1).19 At the admission level, we used the reported diagnosis-related group to categorize each admission into 1 of 25 mutually exclusive major diagnostic categories.31

Statistical Analysis

The identification strategy relied on the assumption that, because TJC visits are unannounced, patients are quasi-randomized within a hospital to admission during a TJC survey week vs the surrounding weeks. We assessed the validity of this assumption in several ways. First, we compared counts of hospitalizations between survey and nonsurvey weeks (3 weeks before and after surveys). Second, we compared unadjusted characteristics of admissions occurring during both periods. Third, to assess balance in admission diagnoses and in-hospital procedural mix between survey and nonsurvey weeks, we plotted the cumulative distribution of all admissions in each group by diagnosis-related group and by primary ICD-9 procedure code. Finally, we plotted the distribution of survey weeks across the calendar year for 2008 to 2012.

We next arranged the data in event time and plotted mean unadjusted mortality rates from 3 weeks before through 3 weeks after the survey week. We calculated unadjusted differences in mortality rates between survey and nonsurvey weeks and then estimated a multivariable logistic model to assess the association of survey weeks with mortality and secondary patient safety outcomes. For each outcome, we fitted the following model:

Logit(E[Y_{i,j,t,k}]) = β_0 + β_1 Survey_Week_t + β_2 Covariates_{i,j,t,k} + β_3 MDC_i,

where E denotes the expected value; Y_{i,j,t,k} is the outcome of admission i for patient j on weekday t in hospital k; Survey_Week_t is a binary indicator for an admission happening during a survey week vs 3 weeks before or after the survey; Covariates_{i,j,t,k} denotes age, sex, race/ethnicity, Elixhauser index score,30 and the presence of 11 different chronic conditions; and MDC_i denotes indicators for admission major diagnostic category. Our key parameter of interest is the estimate of β_1, which represents the mean adjusted change in each outcome attributable to the presence of TJC surveyors, compared with combined mean rates 3 weeks before and after the survey week. To present results from this regression, we simulated the absolute change in each outcome attributable to surveyor presence (ie, β_1). In all analyses, we used robust variance estimators to account for clustering of admissions within hospitals.32
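To make β_1 concrete: when the covariates are stripped away, the model reduces to a logistic regression on the survey-week indicator alone, whose maximum-likelihood estimates have a closed form. The sketch below (in Python, rather than the R/Stata used for the actual analysis) applies that closed form to the unadjusted mortality rates reported in the Results; it is illustrative only and omits all covariate adjustment and clustering:

```python
import math

def logit(p):
    """Log-odds transform."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of the log-odds transform."""
    return 1.0 / (1.0 + math.exp(-x))

# Unadjusted 30-day mortality rates reported in the Results.
p_survey, p_nonsurvey = 0.0703, 0.0721

# With only an intercept and the binary survey-week indicator, the
# maximum-likelihood estimates are available in closed form:
b0 = logit(p_nonsurvey)      # log-odds of death in nonsurvey weeks
b1 = logit(p_survey) - b0    # survey-week shift in log-odds (beta_1)

# Absolute risk difference implied by this unadjusted model; the
# paper's covariate-adjusted estimate was -0.12%.
abs_diff = inv_logit(b0 + b1) - inv_logit(b0)  # about -0.0018, ie, -0.18%
```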

We conducted prespecified subgroup analyses to explore whether any observed effect was associated with certain types of hospitals or patients. First, we examined major teaching hospitals (defined as ≥0.6 resident-to-bed ratio) vs all other hospitals.21,33,34 We also divided hospitals into the top or bottom half of overall publicly reported quality using the Centers for Medicare & Medicaid Services Total Performance Score.35 Second, we examined whether there were differential effects among patients in the top and bottom 50th percentiles of predicted mortality, hypothesizing that any potential effect of surveyor presence might be magnified in the higher mortality group (eMethods in the Supplement).

Additional Analyses

First, to assess whether the findings could be attributable to chance, we performed a random permutation test. We randomly assigned hospitals to different survey weeks according to the empirical distribution of TJC survey dates in Figure 1 (without replacement) and calculated the unadjusted mortality difference between survey and nonsurvey weeks in 1000 replications. We plotted the distribution of all permutation effect sizes and calculated the percentile distribution and P value for the empirically observed mortality effect. This permutation test adds value to the main analysis as a robust, nonparametric approach to examining how the observed effect size fits into the distribution of all possible effect sizes given the distribution of TJC survey dates and hospitals in our sample.
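A minimal version of such a permutation test can be sketched as follows. All numbers below are invented for illustration (200 synthetic hospitals with 7 weekly mortality rates each and a small built-in survey-week effect); the actual test permuted real survey dates across 1984 hospitals:

```python
import random

random.seed(1)

# Synthetic hospitals: 7 weekly mortality rates each; week index 3 is
# the "true" survey week and carries a small built-in reduction.
hospitals = []
for _ in range(200):
    base = random.uniform(0.05, 0.09)
    weeks = [base + random.gauss(0, 0.004) for _ in range(7)]
    weeks[3] -= 0.004  # invented survey-week effect
    hospitals.append(weeks)

def mortality_diff(survey_idx):
    """Mean survey-week rate minus mean rate in the other 6 weeks."""
    n = len(hospitals)
    survey_mean = sum(w[i] for w, i in zip(hospitals, survey_idx)) / n
    other_mean = sum(
        sum(r for j, r in enumerate(w) if j != i) / 6
        for w, i in zip(hospitals, survey_idx)
    ) / n
    return survey_mean - other_mean

# Observed difference with the true survey-week labels.
observed = mortality_diff([3] * len(hospitals))

# Null distribution: relabel a random week as the survey week in each
# hospital and recompute the difference, 1000 times.
null_diffs = [
    mortality_diff([random.randrange(7) for _ in hospitals])
    for _ in range(1000)
]

# One-sided permutation P value: share of null draws at least as
# extreme (as low) as the observed difference.
p_value = sum(d <= observed for d in null_diffs) / len(null_diffs)
```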

Second, because TJC visits are less common during major holidays (Figure 1), our analysis may be confounded if mortality differs during holidays. We therefore excluded all admissions occurring on Christmas Day, New Year’s Day, Thanksgiving Day, and the Fourth of July. Third, because it is possible that mortality differences may occur if the distribution of elective hospitalizations or medical vs surgical admissions differs between survey and nonsurvey weeks, we repeated the main analysis restricted to emergency hospitalizations or stratified by medical vs surgical admissions. Fourth, a related concern may be that the arrival of TJC surveyors leads hospitals to avoid admitting sicker patients later in the week in an effort to free staff resources. We therefore conducted a subgroup analysis among patients admitted during Wednesday to Friday in survey vs nonsurvey weeks. Finally, we examined whether hospitals might increase staffing by counting all unique provider identifiers billing in each hospital by week.

Analyses were performed in R, version 3.1.2 (R Foundation), and Stata, version 14 (StataCorp). The 95% CIs around reported estimates reflect 0.025 in each tail, corresponding to 2-sided P ≤ .05. P values were estimated using 2-sample t tests or z tests for proportions. Data analysis was conducted from January 1 to September 1, 2016.

Results

Patient and Admission Characteristics

Our sample contained 244 787 admissions during 3417 survey weeks and 1 462 339 admissions in the 3 weeks before and after these survey weeks. The average number of weekly admissions was nearly identical between survey and nonsurvey weeks (Table 1). Patient characteristics were similar between survey and nonsurvey weeks for the entire sample and in major teaching hospitals (Table 1; eTable 1 in the Supplement). In the full cohort, the few characteristics that were statistically significantly different between survey and nonsurvey weeks (eg, age and prior chronic obstructive pulmonary disease or congestive heart failure) were clinically trivial. The cumulative distributions of diagnosis related group categories and ICD-9 procedures were also nearly identical between survey and nonsurvey weeks (eFigure 1 and eFigure 2 in the Supplement). We observed no significant difference in the number of unique providers billing for hospital admissions per week in survey vs nonsurvey weeks (eTable 2 in the Supplement). Last, the distribution of survey weeks across the calendar year demonstrated no evidence for seasonal bias in survey dates beyond expected deviations during holidays (Figure 1).

Mortality During Survey Weeks

In unadjusted analysis across all hospitals, there was a significant reversible decrease in 30-day mortality for admissions occurring during a survey week vs the surrounding 3 weeks (Figure 2A). Overall, unadjusted 30-day mortality was 7.03% during survey weeks vs 7.21% during nonsurvey weeks (absolute difference, −0.18%). Mortality rates in the other nonsurvey weeks were otherwise stable (P > .13 for all pairwise comparisons vs the third week before the survey visit). After adjustment, there was a statistically significant absolute decrease in mortality of 0.12% (95% CI, −0.22% to −0.01%; P = .03) (Table 2). This finding corresponds to an overall relative decrease of 1.5% in the 30-day mortality rate potentially attributable to TJC surveys.

In subgroup analyses, major teaching hospitals showed the largest mortality change associated with survey weeks (Figure 2B and Table 2). In these hospitals, unadjusted 30-day mortality fell from a mean of 6.41% during nonsurvey weeks to 5.93% during survey weeks (unadjusted absolute decrease, 0.49%) (Figure 2B), corresponding to an adjusted decrease of 0.38% (95% CI, −0.74% to −0.03%; P = .04) (Table 2). With adjustment, this translates to a 5.9% adjusted relative decrease in 30-day mortality attributable to survey weeks in major teaching hospitals.

We did not observe significant mortality associations between survey and nonsurvey weeks in hospitals according to whether they were in the top or bottom half of total performance scores (Table 2). However, we noted a significant decrease in mortality during survey weeks among patients in the top half of expected mortality, from 13.37% in nonsurvey weeks to 13.08% in survey weeks. After adjustment, this change corresponded to an absolute decrease of 0.19% (95% CI, −0.35% to −0.03%; P = .02).

Among other patient safety outcomes, we did not find significant differences in outcome rates between survey and nonsurvey weeks, either overall or across the prespecified subgroup analyses (Table 3 and eTable 3 in the Supplement). The only effect that was potentially consistent with decreased mortality was in the PSI 4 measure, although this finding did not reach statistical significance (adjusted difference, −0.85%; 95% CI, −1.9% to 0.20%; P = .13) (Table 3).

Additional Analyses

In the overall analysis, the observed mortality decrease was larger than 99.5% of effect sizes noted in a permutation test of 1000 replications of randomly assigned hospital survey date pairs, corresponding to a 2-sided P value of .01 (eFigure 3 in the Supplement). The findings were also unaffected by exclusion of major holidays or by restricting the analysis to emergency hospitalizations or to hospitalizations occurring Wednesday through Friday in survey and nonsurvey weeks (eTable 4 in the Supplement), thereby suggesting that the abovementioned findings are not driven by distribution of TJC survey dates or by a shift toward lower-risk or elective hospitalizations during TJC survey weeks or at the end of a survey week. The mortality effect that we observed was also not dominated by medical vs surgical admissions (eTable 4 in the Supplement).

Discussion

Hospital admissions during TJC survey weeks had significantly lower 30-day mortality than during nonsurvey weeks. The mean weekly number of admissions, patient characteristics, reasons for admission, and in-hospital procedures performed were nearly identical between survey and nonsurvey weeks, consistent with the unannounced nature of these visits and the plausible quasi-randomization of patients between survey and nonsurvey weeks. In subgroup analyses, the decrease in mortality during survey weeks was largely driven by mortality reductions in major teaching hospitals. These mortality changes were reversible since mortality in all hospitals was otherwise stable in the 3 weeks before and after TJC surveys. We did not observe differences between survey and nonsurvey weeks in a broad range of patient safety measures.

These changes suggest that some aspect of predictable behavior change associated with TJC surveys might improve the quality of inpatient care. The effects that we observed were modest in size, ranging from a relative decrease of 1.5% in hospitals overall to 5.9% in major teaching hospitals. However, even changes of this magnitude throughout the year could theoretically have a significant public health impact. At major teaching hospitals, which had the largest relative mortality reduction and which averaged more than 900 000 Medicare admissions annually from 2008 to 2012 (as defined in this study), an absolute reduction of 0.39% in 30-day mortality (Table 2) would translate to more than 3600 fewer deaths among Medicare patients annually. We do not propose that the high stress and scrutiny of TJC surveys be replicated across the entire year. Instead, we view TJC surveys as a window into quality improvement that is likely driven by a small number of key changes during surveys that require further research.

The study results are unlikely to be driven by chance or selection bias alone for several reasons. First, patient demographics, chronic illnesses, diagnosis-related groups, and in-hospital procedures were similar between survey and nonsurvey weeks. Moreover, if selection were an important issue, one would expect differences in the number of hospitalizations between survey and nonsurvey weeks; however, the numbers were nearly identical during both periods. Finally, we demonstrated the robustness of the findings by excluding major holidays (during which TJC visits are less likely to occur), focusing on emergency hospitalizations (to ensure greater homogeneity in the comparison of hospitalizations across survey and nonsurvey periods), and focusing on hospitalizations occurring from Wednesday to Friday (to address the possibility that hospitals may respond to TJC visits by altering the composition of patients hospitalized later in the week).

Several possible reasons may explain why mortality declined during TJC survey weeks. The most plausible mechanism is that heightened scrutiny during visits raises staff awareness of possible operational deficiencies, prompting temporary corrections that improve patient safety. For example, increased attention to methods of paper documentation could lead to more carefully documented encounters and better communication during survey weeks than other weeks. Surveyor presence on hospital floors and in operating rooms could improve compliance with hand hygiene and infection control protocols, reducing hospital-acquired infection rates. Surveyor presence may also reduce the time spent by hospital staff on other activities that distract focus from patient care. To the extent that we were able to capture these mechanisms through secondary outcomes, we did not observe any effects associated with survey weeks that can be closely tied to changes in specific patient safety practices. The decrease observed in the PSI 4 failure-to-rescue mortality measure is suggestive of an effect of increased vigilance on the response to patient complications, but this effect did not reach statistical significance. Because the secondary outcomes we can measure in claims data are relatively uncommon, the analysis could be underpowered to detect a small change of the magnitude that was observed for 30-day mortality.

The prominent reduction in mortality in major teaching hospitals was a strong driver of the overall mortality change and a notable finding. This finding was consistent with our a priori hypothesis that the largest teaching hospitals may have a greater mortality change because they are better able to mobilize staff resources due to their size and more motivated to do so because they have a reputation at stake. One strategy for health systems to consider would be to observe which aspects of normal day-to-day operations change most dramatically in their institution to meet survey readiness standards (eg, clean environment and proper documentation). Those changes may be the best opportunities to identify whether more continual attention could improve patient safety.

Limitations

Our study had several limitations. First, the observational nature precludes interpreting the findings as a causal link between TJC surveyor presence and reduced mortality. However, the unannounced nature of surveys supports a strong quasi-randomized study design, corroborated by the similar admission characteristics between survey and nonsurvey weeks. Second, the modest effect size limits statistical power to assess similar effect sizes across the secondary outcomes to find suggestive mechanisms explaining the primary results. Given the baseline rates that we observed for these outcomes, we cannot rule out an effect on PSI 90 safety measures, infection control as measured by C difficile rates, or hospital operational readiness as measured by in-hospital cardiac arrest mortality. More generally, we were unable to identify a specific mechanism by which mortality is reduced during TJC surveys. Future work could consider additional potential explanations, including hospital safety culture or other hospital characteristics.36 Third, the exposure could be measured with error if not all TJC surveys required 5 weekdays during the survey week to complete. However, this potential issue would bias the findings toward the null, most likely leading to underestimation of the true effect. Finally, our analysis was limited to the Medicare population and may not generalize to commercially insured populations hospitalized during TJC survey weeks.

Conclusions

We observed lower 30-day mortality for admissions occurring during TJC survey weeks compared with nonsurvey weeks, particularly among major teaching hospitals. This observation could be explained by heightened attention by hospital staff to multiple aspects of patient care during intense surveyor observation and suggests that differential behavior during survey weeks may have meaningful effects on patient mortality.

Article Information

Accepted for Publication: December 25, 2016.

Corresponding Author: Anupam B. Jena, MD, PhD, Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave, Boston, MA 02115 (jena@hcp.med.harvard.edu).

Published Online: March 20, 2017. doi:10.1001/jamainternmed.2016.9685

Author Contributions: Drs Barnett and Jena had full access to all data in the study and take full responsibility for the integrity of the data and accuracy of the data analysis.

Study concept and design: Barnett, Jena.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Barnett, Jena.

Critical revision of the manuscript for important intellectual content: All authors.

Statistical analysis: All authors.

Obtained funding: Jena.

Administrative, technical, or material support: Barnett, Jena.

Supervision: Barnett, Jena.

Conflict of Interest Disclosures: Dr Jena received consulting fees unrelated to this work from Pfizer, Inc, Hill Rom Services, Inc, Bristol-Myers Squibb, Novartis Pharmaceuticals, Vertex Pharmaceuticals, and Precision Health Economics, a company providing consulting services to the life sciences industry. No other disclosures were reported.

Funding/Support: The study was supported by grant 1DP5OD017897-01 from the Office of the Director, National Institutes of Health (NIH) (Dr Jena, NIH Early Independence Award) and grant T32-HP10251 from the Health Resources and Services Administration (HRSA) (Dr Barnett).

Role of the Funder/Sponsor: The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Disclaimer: The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the HRSA.