The central element of the papers by ADR and DLR is their use of specifications that identify effects only from states in the same Census division (in ADR) or from pairs of contiguous counties straddling state borders (in DLR). The implicit assumption in these specifications is that geographically proximate areas provide better controls. However, as discussed in NSW, one can actually test this assumption using tools borrowed from the synthetic control approach to estimating treatment effects (Abadie et al., 2010). In the context of the ADR and DLR studies, we showed that the weight put on nearby states or counties as potential controls (or “donors”, in the language of the synthetic control literature) for states or counties in which minimum wages increased was generally no higher than the weight put on states or counties farther away, and indeed that these nearby states or counties tended to get no more weight than a randomly-selected state or county.

In their response, ADRZ claim that we have glossed over an important conceptual issue – namely, that “examining weights within Census divisions and comparing these to weights outside divisions does not tell us whether comparing local areas is better than using state panel regressions with two-way fixed effects” (p. 63). However, we did not present the results of our synthetic control analysis as an explicit validation of the standard two-way fixed effects estimator, and by setting up this straw man, ADRZ distract attention from our main point: The synthetic control analysis is informative about whether to focus on local controls or not because it tells us whether it makes sense to put all the weight on the within-region variation (Census divisions in ADR and cross-border county pairs in DLR) or instead to put weight on variation from outside the region as well (however that weight might be distributed). In particular, for the analysis of state-level data, we reported that the average weight per same-division donor state is higher than 1/(number of potential donors) in only 18 of 50 cases; that is, in more than 60 percent of cases, the average weight on same-division donor states based on the synthetic control matching is less than the weight we would get if all potential donors were weighted equally.

Weights on same-division states vs. other states

ADRZ also dispute our calculation of the weights, arguing (focusing on the CPS analysis) that the evidence actually shows that, “A donor state within the same division receives weights that are 2.8 to 4.1 times as large as weights for donors outside of the division” (p. 66). For example, for our matching on regression residuals in Table 3 of NSW, they calculate that the average weight per donor state in the same Census division is 0.098, versus 0.035 for other states, for a ratio of 2.806 (the source for their first number cited above; see their Table B1). Thus, ADRZ conclude, “a straightforward interpretation of NSW’s own evidence indicates that neighboring areas are more alike than are places farther away – contradicting their central thesis” (p. 66).

This is a direct contradiction of the results we report – i.e., that the weight on same-division states is generally no higher than the weight on states in other divisions. But ADRZ’s conclusion is based on a flawed calculation that weights states in a manner that mechanically tends to produce a high ratio of the weight they compute on same-division versus non-same-division states.3

To see this, let $p_{ij}^S$ denote the weight put on state $i$ in treatment $j$ for the same-division states, let $N_j^S$ denote the number of same-division states in treatment $j$, and let $T$ denote the number of treatments.4 ADRZ’s calculation for same-division states is

$$\frac{\sum_j \sum_i p_{ij}^S / T}{\sum_j N_j^S / T} = \frac{\sum_j \sum_i p_{ij}^S}{\sum_j N_j^S} \quad (1)$$

That is, they add up all the weight on same-division states across all the treatments and divide by the number of treatments; they then divide this by the total number of same-division states across all the treatments, which is also divided by the number of treatments (so the two divisions by T cancel). They then do the same calculation for other-division states and compute the ratio of the two.

This calculation puts very high weight on the treatments with a large number of donors. In the data, the number of donors varies widely across treatments, and the number of other-division donors can be very large. The number of same-division donors ranges from 1 to 8, with a standard deviation of 2, while the number of non-same-division donors ranges from 1 to 45, with a standard deviation of 18. As a result, ADRZ’s calculation is particularly sensitive to the observations on other-division donors from treatments with large numbers of such donors. Because the number of other-division donors can be so much higher, the ratio of the expression in equation (1) for same-division relative to other-division states tends to be inflated by this feature of ADRZ’s calculation.

The top panel of Table 1 provides an illustrative example. The table shows the number of donors in the same and the other divisions in each of five hypothetical treatments and the weights each state gets in the hypothetical synthetic control analysis. In this example, there are four treatments with the same number of same-division and other-division donors (two of each). In these four treatments, the weight on each same-division state (0.24) is slightly less than the weight on each other-division state (0.26). In the fifth treatment, there are again two same-division donors, each with a weight of 0.02, and a large number of other-division donor states (48) – mimicking what actually happens in the data – each with the same weight as the same-division donors (0.02).

Table 1 Examples of weights on same-division states and other-division states

If these weights resulted from a synthetic control analysis like the one we proposed, what would we conclude? In four of the five treatments, the average weight on other-division states is higher (0.26 vs. 0.24), while in the fifth the weight on same-division states is the same (0.02). We would argue that the interpretation of this kind of evidence from a synthetic control analysis would be similar to the interpretation in NSW: There is no strong evidence that more weight – let alone all the weight – should go on same-division states.

But what does ADRZ’s calculation suggest? Using equation (1) and its equivalent version for other-division states, the resulting value is 3.61 (reported in the table), within the range they use to conclude that same-division states are much more alike – i.e., better controls – than other-division states. Yet looking at the weights for example 1 in Table 1, this does not seem a supportable conclusion.

One could instead use the unweighted average of the ratio of the weights on same-division to other-division states, which equals 0.938; this indicates slightly lower weight on same-division states, which seems like the right answer. Alternatively, if one wanted a calculation more comparable to the one ADRZ claim to present – “average per-donor weight of same-division donors, relative to the per-donor weight of the other-division donors” (p. 65) – one would want to use the following equation for same-division controls:

$$\frac{1}{T} \sum_j \frac{\sum_i p_{ij}^S}{N_j^S} \quad (2)$$

and the corresponding equation for other-division controls. In that case, example 1 yields the number 0.925 (reported in the table). Thus, the number resulting from ADRZ’s calculation seems much too high.5

Example 2 in Table 1 makes it even clearer that ADRZ’s calculation is flawed. In this example, we simply modify treatment 5 so that there are twice as many same-division donors and twice as many other-division donors – and correspondingly we cut the weight on each state in half. It seems obvious to us that the conclusion one draws about appropriate donors from example 2 should be the same as example 1: There are still four of five treatments for which the weight on other-division states is higher. Yet because the ADRZ calculation upweights the large number of donors, the resulting number increases by more than 50 percent, from 3.61 to 5.59. In contrast, the calculation using equation (2) scarcely changes. Finally, both of the numbers resulting from ADRZ’s calculation far exceed one, even though the weight on same-division states is equal to or less than the weight on other-division states for every treatment.
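The arithmetic in these two examples is easy to verify. The sketch below encodes the hypothetical treatments from Table 1 (our reading of the example, not ADRZ’s code) and computes ADRZ’s pooled ratio (equation (1)), the per-treatment alternative (equation (2)), and the unweighted average of per-treatment ratios:

```python
# Hypothetical treatments from Table 1, each encoded as
# (n_same, weight per same-division donor, n_other, weight per other-division donor).
example1 = [(2, 0.24, 2, 0.26)] * 4 + [(2, 0.02, 48, 0.02)]
example2 = [(2, 0.24, 2, 0.26)] * 4 + [(4, 0.01, 96, 0.01)]

def adrz_ratio(treatments):
    # Equation (1): pool weights and donor counts across all treatments, then
    # take the ratio of the pooled per-donor weights. Treatments with many
    # donors dominate the pooled sums.
    same_w  = sum(n_s * w_s for n_s, w_s, n_o, w_o in treatments)
    same_n  = sum(n_s       for n_s, w_s, n_o, w_o in treatments)
    other_w = sum(n_o * w_o for n_s, w_s, n_o, w_o in treatments)
    other_n = sum(n_o       for n_s, w_s, n_o, w_o in treatments)
    return (same_w / same_n) / (other_w / other_n)

def per_treatment_ratio(treatments):
    # Equation (2): average the per-donor weight within each treatment first,
    # then across treatments, so every treatment counts equally. Because all
    # donors of a type share one weight here, the within-treatment per-donor
    # weight is just w_s (or w_o).
    T = len(treatments)
    same  = sum(w_s for n_s, w_s, n_o, w_o in treatments) / T
    other = sum(w_o for n_s, w_s, n_o, w_o in treatments) / T
    return same / other

def mean_of_ratios(treatments):
    # Unweighted average of the per-treatment same/other weight ratios.
    return sum(w_s / w_o for n_s, w_s, n_o, w_o in treatments) / len(treatments)

print(round(adrz_ratio(example1), 2))           # 3.61
print(round(adrz_ratio(example2), 2))           # 5.59
print(round(per_treatment_ratio(example1), 3))  # 0.925
print(round(per_treatment_ratio(example2), 3))  # 0.924
print(round(mean_of_ratios(example1), 3))       # 0.938
```

Doubling the donor counts in treatment 5 moves the pooled ratio from 3.61 to 5.59 while the per-treatment measures barely move, which is exactly the sensitivity described in the text.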

Thus, our examples and this discussion highlight two conclusions: (1) that the number resulting from ADRZ’s calculation does not have a sensible interpretation with regard to which states are the appropriate controls; and (2) that it is highly sensitive to differences that have no bearing on the evaluation of same-division and other-division states as controls. We therefore stand by the conclusion from our synthetic control analysis that there is little or no evidence to indicate that same-division states are better controls than other-division states, and certainly no evidence that the latter should be excluded as controls.

ADRZ also present a synthetic control analysis that does not use minimum wage increases to identify treatment observations, but randomly assigns a placebo minimum wage law to an individual state in a time period and then calculates the synthetic control donor weights for all remaining states. They suggest that this approach is informative because it “dispenses with the shortcomings” (p. 34) of the kind of analysis we did, by which they mean that we could use only a subset of minimum wage increases as treatments for the synthetic control analysis. Their Figure 8 shows that their computed weights decline monotonically with distance from the treated state to the donor state, up to about 1,000 miles (and then are flat). This evidence, they argue, “unambiguously demonstrates that the synthetic control algorithm assigns much greater weight to nearby states when constructing the counterfactual teen employment” (p. 34).

However, this approach strikes us as uninformative about the question at hand – whether a particular subset of states provides a more valid set of controls for states where the minimum wage actually does increase. Since the whole point of the approach taken in ADR and DLR is their presumption that actual minimum wage increases are associated with the residuals of the estimated employment regressions (either because of policy endogeneity or coincidence), we want to know precisely whether the nearby states provide better controls for these treatment observations. As we have already discussed and in contrast to ADRZ’s claims, the data indicate that for the minimum wage increases observed in the data, the same-division states do not provide better controls. We present additional evidence on this below.

Treatments used in the synthetic control analysis

We also present some analyses that attempt to use the synthetic control estimator to identify control observations and then estimate the effects of minimum wages on employment based on those controls. ADRZ criticize our matching estimator because the subset of “clean” minimum wage increases – treatments with donors that have no minimum wage increases in the previous four quarters and the following three – does not produce a negative and significant minimum wage effect (the estimated elasticity is only around −0.06 in the state-level data). In particular, since we suggested in our paper that the fact that we do not replicate the standard panel data estimator using this subset of minimum wage increases made this subset of minimum wage increases “unusual,” they question whether it is valid to use this subset of increases to assess the plausibility of restricting attention to neighboring states as controls, as we do in our synthetic control analysis. We think it is useful to assess whether ADR and DLR throw out states or counties that are valid controls for the subset of minimum wage increases for which the analysis is cleanest. However, the answer does not depend on restricting attention to these minimum wage increases.

In Tables 2 and 3, we show estimates from the synthetic control exercise used in Tables 3 and 5 of NSW, with the matching now done on all observations where there was a minimum wage change. Clearly this is more problematic because the frequency of minimum wage changes implies that “donors” can be contaminated in either the pre- or post-treatment period relative to any treatment state. For that reason, we think the most informative matching is on residuals from the standard employment equation, to account for this minimum wage variation – although this raises the issue of what estimated effect of the minimum wage to use in computing these residuals. We therefore follow what we did in NSW and report these estimates matching on residuals based on the standard panel data estimates, as well as estimates in which we zero out the effect of the minimum wage. These two alternatives can be interpreted as covering the range of most estimates in the debate to date.6
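The two residual definitions just described can be written compactly. The sketch below is our schematic reading of the procedure, using made-up variable names and toy numbers; it is not the authors’ actual estimation code.

```python
# Schematic: residuals from a standard panel employment equation, with the
# minimum wage elasticity either kept at its panel estimate or zeroed out.
# All names and numbers are hypothetical placeholders.
def residuals(log_emp, log_mw, fitted_other, b_mw, zero_out_mw=False):
    # fitted_other holds the fitted values from everything except the minimum
    # wage term (fixed effects, controls, etc.). Zeroing out b_mw leaves the
    # minimum wage variation in the residuals, so matching then uses it too.
    b = 0.0 if zero_out_mw else b_mw
    return [y - f - b * m for y, m, f in zip(log_emp, log_mw, fitted_other)]

# Toy data for three periods.
log_emp = [1.00, 0.98, 1.02]
log_mw = [1.70, 1.75, 1.70]
fitted_other = [0.99, 0.99, 1.00]
r_panel = residuals(log_emp, log_mw, fitted_other, b_mw=-0.06)
r_zeroed = residuals(log_emp, log_mw, fitted_other, b_mw=-0.06, zero_out_mw=True)
```

The two residual series bracket the effect sizes at stake in the debate: one removes the panel-estimated minimum wage effect, the other removes nothing.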

Table 2 Mean synthetic control weights per state in same division and other divisions, CPS data at state by quarter level, 1990–2011:Q2

Table 3 Mean synthetic control weights per county for contiguous and non-contiguous counties, county-level QCEW data, 1990–2006:Q2

Table 2 reports the results for teen employment using the CPS data. Comparisons of columns (1)-(5) with columns (6)-(10) show that almost without exception there is very little basis for restricting attention to the same-division states as controls. The per-state synthetic control weights – computed appropriately, as discussed earlier – are generally quite similar for the same-division and other-division states.7 Table 3 reports the county-level analysis. Here, there is even more compelling evidence that contiguous counties are not better controls, as the per-county synthetic control weight is generally larger for non-contiguous counties.

Figures 1 and 2 report additional information on how the weight assigned to “control” states (or counties) from this analysis varies with distance from the minimum wage increase in question – similar to what ADRZ did in their paper, but now focusing on actual minimum wage variation rather than a randomly-assigned placebo.8 For the analysis that matches on residuals from the specification that restricts the minimum wage effect to be zero, Figure 1 shows that there is a modest increase in the synthetic control weight as one gets closer to a treated state – although a much smaller increase than what ADRZ suggested.9 And Figure 2 reports the similar graph for counties, where the weight is actually lower for the closer counties. (ADRZ did not report any results along these lines for counties).
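For readers unfamiliar with the smoothers named in the figure notes, the running-mean variant (the “m” extension) can be sketched in a few lines. This is a generic illustration with made-up data and a hypothetical bandwidth, not the routine used to produce the figures; lowess replaces the local mean with a local weighted least-squares line (“running-line least squares”).

```python
def running_mean_smooth(x, y, frac=0.4):
    # For each x-value, average y over the frac*n nearest points in x.
    # frac plays the role of the bandwidth in the figure notes.
    pts = sorted(zip(x, y))
    n = len(pts)
    k = max(1, int(frac * n))
    out = []
    for xi, _ in pts:
        nearest = sorted(pts, key=lambda p: abs(p[0] - xi))[:k]
        out.append((xi, sum(p[1] for p in nearest) / k))
    return out

# Made-up weight-vs-distance data: weights declining with distance (miles).
distances = [100, 200, 400, 800, 1200, 1600, 2000, 2400]
weights = [0.30, 0.24, 0.18, 0.10, 0.06, 0.05, 0.04, 0.03]
smoothed = running_mean_smooth(distances, weights, frac=0.4)
```

A smaller `frac` (e.g., 0.4 versus 0.8) tracks local variation more closely at the cost of a noisier curve, which is the trade-off between the two bandwidths shown in the figures.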

Figure 1 Synthetic control weight vs. distance to treatment states (based on Table 2, columns (2) and (7)). Notes: “lowess8” uses running-line least squares and a bandwidth of 0.8. The “4” ending indicates a bandwidth of 0.4, and the “m” extension implies that running means are used instead of running-line least squares.