How This Study Was Conducted

This edition of Mirror, Mirror reflects refinements to methods used in past reports. No report can claim to capture every aspect of the performance of health care systems. Health care systems are complex. Even if a report included thousands of measures, nuances would remain. In that spirit, the report underwent a thorough review by an advisory panel of international, independent performance measurement experts. The framework for Mirror, Mirror 2017 was developed in consultation with the advisory panel from January through December 2016.

Using data available from Commonwealth Fund international surveys of the public and physicians and other sources of standardized data on quality and health care outcomes, we identified 72 measures relevant to health care system performance, organizing them into five performance domains: Care Process, Access, Administrative Efficiency, Equity, and Health Care Outcomes. The criteria for selecting measures and grouping within domains included: that the measure be important, that the data to support the measure be standardized across the countries, and that the results be salient to policymakers and relevant to performance improvement efforts. Most of the measures are based on surveys designed to elicit the public’s experience of its health care system.

The indicators were carefully selected from among the best-available measures with comparable data across the included countries. The selected measures cover a wide range of performance domains. Mirror, Mirror is unique in its use of survey measures designed to gather the perspectives of patients and professionals—the people who experience health care directly in each country every day.

The data for this report were derived from several sources. Survey data are drawn from the 2014, 2015, and 2016 Commonwealth Fund International Health Policy Surveys. Since 1998, in collaboration with international partners, the Commonwealth Fund has supported these surveys of the public’s and primary care physicians’ experiences of their health care systems. Each year, in collaboration with researchers in the 11 countries, a common questionnaire is developed, translated, adapted, and pretested. The 2016 survey covered the general adult population; the 2014 survey covered adults age 65 and older. Both the 2014 and 2016 surveys examined patients’ views of the health care system, quality of care, care coordination, medical errors, patient–physician communication, waiting times, and access problems. The 2015 survey was administered to primary care physicians and examined their experiences providing care to patients, the use of information technology, and the use of teams to provide care.

The Commonwealth Fund International Health Policy Surveys (2014, 2015, 2016) are based on nationally representative samples drawn at random from the populations surveyed. The sampling frames for the 2014 and 2016 surveys were generated using probability-based overlapping landline and mobile phone sampling designs and, in some countries, federal registries; the 2015 sample was drawn from government or private company lists of practicing primary care doctors in each country. Appendix 7 presents the number of respondents and response rates for each survey, and further details of the survey methods are described elsewhere.

In addition to the surveys, other standardized comparative data were drawn from the most recent reports of the Organization for Economic Cooperation and Development (OECD), the European Observatory on Health Systems and Policies, and the World Health Organization (WHO). Our study included data from the OECD on screening, immunization, preventable hospital admissions, population health, and disease-specific outcomes. The WHO and European Observatory data were used to measure population health.

We used the following approach to calculate performance scores and rankings for comparison:

Measure performance scores: For each measure, we converted each country’s result (e.g., the percentage of survey respondents giving a certain response or a mortality rate) to a measure-specific performance score. This score was calculated as the difference between the country result and the 11-country mean, measured in standard deviations. Normalizing the results based on the standard deviation accounts for differences between measures in the range of variation among country-specific results. A positive performance score indicates the country performs above the 11-country average; a negative score indicates the country performs below the 11-country average.
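As an illustration only, the standardization described above might be sketched in Python as follows. The function name, the `higher_is_better` flag, and the example figures are hypothetical. The use of the population standard deviation (rather than the sample standard deviation) and the sign flip for measures where a lower result is better (such as a mortality rate) are assumptions that the text does not specify.

```python
from statistics import mean, pstdev


def performance_scores(results, higher_is_better=True):
    """Convert raw country results on one measure into performance scores.

    Each score is the distance of the country result from the mean of all
    countries, expressed in standard deviations.
    """
    values = list(results.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD, not sample SD
    if sigma == 0:
        # All countries have the same result, so none is above or below average.
        return {country: 0.0 for country in results}
    # Assumption: flip the sign when a lower raw result means better performance,
    # so that a positive score always indicates above-average performance.
    sign = 1.0 if higher_is_better else -1.0
    return {country: sign * (value - mu) / sigma
            for country, value in results.items()}


# Hypothetical results for three countries (not data from the report).
example = {"A": 60.0, "B": 70.0, "C": 80.0}
scores = performance_scores(example)
```

In this sketch, country B sits exactly at the mean and receives a score of zero, while A and C receive scores of equal size and opposite sign.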

The 11 measures in the equity domain were derived from the 2016 population survey and calculated by stratifying the population samples based on reported income (above-average vs. below-average relative to the country’s median income). Performance scores were based on the difference between the two groups, with a wider difference interpreted as a measure of lower equity between the two income strata in each country.
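A minimal sketch of how an equity measure could be scored is shown below. It assumes that each measure is a rate of reported problems, that the gap is the lower-income rate minus the higher-income rate in percentage points, and that the gaps are then standardized across countries with the sign reversed so that a wider gap yields a lower score. The function name and the example figures are hypothetical, and the exact gap definition used in the report may differ.

```python
from statistics import mean, pstdev


def equity_scores(below_median, above_median):
    """Score one equity measure from income-stratified survey results.

    below_median / above_median: {country: percent reporting a problem}
    for respondents in the lower and higher income groups.
    """
    # Assumption: gap = lower-income rate minus higher-income rate.
    gaps = {c: below_median[c] - above_median[c] for c in below_median}
    values = list(gaps.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {c: 0.0 for c in gaps}
    # A wider gap is read as lower equity, so the standardized gap is negated.
    return {c: -(gap - mu) / sigma for c, gap in gaps.items()}


# Hypothetical rates for three countries (not data from the report).
below = {"A": 30.0, "B": 20.0, "C": 10.0}
above = {"A": 10.0, "B": 10.0, "C": 8.0}
equity = equity_scores(below, above)
```

Here country A has the widest income gap (20 points) and therefore the lowest equity score, while country C has the narrowest gap (2 points) and the highest.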

Domain performance scores and ranking: For each country, we calculated the mean of the measure performance scores in that domain. Then we ranked each country from 1 to 11 based on the mean domain performance score, with 1 representing the highest performance score and 11 representing the lowest performance score.

Overall performance scores and ranking: For each country, we calculated the mean of the five domain-specific performance scores. Then, we ranked each country from 1 to 11 based on this summary mean score, again with 1 representing the highest overall performance score and rank 11 representing the lowest overall performance score.
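The domain and overall steps can be sketched as two rounds of averaging followed by ranking. The nested-dictionary layout, the function names, and the example scores below are hypothetical. The text does not say how ties are handled; this sketch simply keeps the input order for tied countries.

```python
from statistics import mean


def rank_countries(scores):
    """Rank countries so that rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {country: rank for rank, country in enumerate(ordered, start=1)}


def domain_scores(measure_scores):
    """Average measure scores within each domain.

    measure_scores: {domain: {measure: {country: performance score}}}
    """
    result = {}
    for domain, measures in measure_scores.items():
        countries = list(next(iter(measures.values())))
        result[domain] = {c: mean(m[c] for m in measures.values())
                          for c in countries}
    return result


def overall_scores(by_domain):
    """Average the domain scores, giving each domain equal weight."""
    countries = list(next(iter(by_domain.values())))
    return {c: mean(d[c] for d in by_domain.values()) for c in countries}


# Hypothetical measure scores for two domains and three countries.
measure_scores = {
    "Access": {
        "m1": {"A": 1.0, "B": 0.0, "C": -1.0},
        "m2": {"A": 0.5, "B": -0.5, "C": 0.0},
    },
    "Equity": {
        "m3": {"A": -1.0, "B": 1.0, "C": 0.0},
    },
}
by_domain = domain_scores(measure_scores)
overall = overall_scores(by_domain)
overall_rank = rank_countries(overall)  # {'B': 1, 'A': 2, 'C': 3}
```

Because the overall score is the mean of domain means rather than the mean of all measures, a domain with few measures carries the same weight as a domain with many. In the example, country A leads on Access but drops to second overall because of its Equity score.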

We tested the stability of this ranking method by running two tests based on Monte Carlo simulation to observe how changes in the measure set or changes in the results on some measures would affect the overall rankings. For the first test, we removed three measure results from the analysis at random, and then calculated the overall rankings on the remaining 69 measure results, repeating this procedure for 1,000 combinations selected at random. For the second test, we reassigned at random the survey measure results derived from the Commonwealth Fund international surveys across a range of plus or minus 3 percentage points (approximately the 95 percent confidence interval for most measures), recalculating the overall rankings based on the adjusted data, and repeating this procedure 1,000 times.
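The two stability tests could be approximated along the following lines. This is a sketch under stated assumptions, not the procedure actually used: it assumes all results are already oriented so that higher is better, draws the perturbation for each country and measure independently from a uniform distribution of plus or minus 3 points, and uses the population standard deviation. All function names, parameters, and example data are hypothetical.

```python
import random
from statistics import mean, pstdev


def zscores(results):
    values = list(results.values())
    mu, sigma = mean(values), pstdev(values)
    return {c: ((v - mu) / sigma if sigma else 0.0) for c, v in results.items()}


def overall_ranks(data):
    """data: {domain: {measure: {country: result}}}; higher is assumed better."""
    domain_means = []
    for measures in data.values():
        scored = [zscores(r) for r in measures.values()]
        if scored:  # a domain is skipped if all of its measures were dropped
            domain_means.append({c: mean(s[c] for s in scored) for c in scored[0]})
    overall = {c: mean(d[c] for d in domain_means) for c in domain_means[0]}
    ordered = sorted(overall, key=overall.get, reverse=True)
    return {c: rank for rank, c in enumerate(ordered, start=1)}


def drop_measures_test(data, n_drop=3, n_runs=1000, seed=0):
    """Test 1: remove n_drop measures at random and re-rank, n_runs times."""
    rng = random.Random(seed)
    keys = [(d, m) for d, measures in data.items() for m in measures]
    runs = []
    for _ in range(n_runs):
        dropped = set(rng.sample(keys, n_drop))
        reduced = {d: {m: r for m, r in measures.items() if (d, m) not in dropped}
                   for d, measures in data.items()}
        runs.append(overall_ranks(reduced))
    return runs


def perturb_test(data, survey_measures, width=3.0, n_runs=1000, seed=0):
    """Test 2: shift survey results by up to +/- width points and re-rank."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        jittered = {}
        for d, measures in data.items():
            jittered[d] = {}
            for m, results in measures.items():
                if m in survey_measures:
                    # Assumption: independent uniform noise per country result.
                    results = {c: v + rng.uniform(-width, width)
                               for c, v in results.items()}
                jittered[d][m] = results
        runs.append(overall_ranks(jittered))
    return runs


def rank_range(runs):
    """Best and worst rank each country received across all simulation runs."""
    return {c: (min(r[c] for r in runs), max(r[c] for r in runs)) for c in runs[0]}


# Hypothetical data: two domains, three measures each, three countries.
demo = {d: {f"{d}_m{i}": {"A": 90.0, "B": 50.0, "C": 10.0} for i in range(3)}
        for d in ("Access", "Equity")}
drop_runs = drop_measures_test(demo, n_drop=3, n_runs=200)
jitter_runs = perturb_test(demo, survey_measures={"Access_m0", "Equity_m1"}, n_runs=200)
```

Summarizing the runs with `rank_range` shows how far each country's rank moves across simulations, which is the kind of evidence behind the rank clusters reported in the next paragraph.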

The sensitivity tests showed that the overall performance scores for each country varied, but that the ranks clustered within three groups (Exhibit 3). Among the simulations, the U.K., Australia, and the Netherlands were nearly always ranked among the top three countries; the U.S., France, and Canada were nearly always ranked among the bottom three countries. The other five countries varied in order between the 4th and 8th ranks.

This report has several limitations. Some are related to the particulars of our analysis and some inherent in any effort to assess overall health system performance.

First, as described above, our sensitivity analyses suggest that the overall country rankings are somewhat sensitive to small changes in the data or indicators included in the analysis.

Second, despite improvements in recent years, the availability of cross-national data on health system performance remains highly variable. The Commonwealth Fund surveys offer unique and detailed data on the experiences of patients and primary care physicians. However, they do not capture important dimensions that might be obtained from medical records or administrative data. Furthermore, patients’ and physicians’ assessments might be affected by their expectations, which could differ by country and culture. In this report, we augment our survey data with other international sources, and include several important indicators of population health and disease-specific outcomes. However, in general, the report relies predominantly on patient experience measures. Moreover, few cross-national data are available on mental health services or long-term care services.

Third, we base our assessment of overall health system performance on five domains (Care Process, Access, Administrative Efficiency, Equity, and Health Care Outcomes), which we weight equally in calculating each country’s overall performance score. In the past, some have argued that other important elements of system performance, such as innovativeness or value, should be considered as well. After consideration, and based on discussions with our advisory panel, we decided not to add new domains to the report. We believe our current five domains capture a sufficiently broad and comprehensive view of health system performance. In addition, there was a lack of meaningful data to assess these new domains.