Partisan polarization is perhaps the most broadly consequential development of the last half-century of American politics. The parties have steadily diverged since the late 1960s, and by some measures are farther apart than at any point in American history. The divide is now so great that it is no longer rare for close observers to wonder if the political system can survive without a larger ballast of representatives in the political center. With political elites divided into zero-sum teams, many of the norms of the US democratic system have begun to fray.

A series of reforms have been proposed to reintroduce this center to American politics. The reforms generally share the assumption that the polarization of American elites is not what the mass public wants—that there are institutions standing between the voters and their desired government that steer outcomes toward the poles. One of the most interesting and closely-watched of these reforms has been the so-called Top Two primary system (hereafter, “Top Two”), a radically open and unconstrained version of the uniquely American institution of popular primaries. A traditional closed primary election only allows voters registered with a party to participate in its nomination contests; an open primary relaxes this constraint to allow participation by at least a subset of non-party members. The Top Two takes this logic near its furthest extreme, offering the same ballot to all voters regardless of party and letting them choose any candidate they like for each office. The two candidates receiving the most votes—also regardless of party—then advance to the general election, raising the novel prospect of intra-party contests in the fall. It turns the primary from its current form—an opportunity for parties to choose their standard-bearers—into a first-stage general election.

In recent years, California and Washington have adopted the Top Two system, raising the reform’s profile and encouraging other states to consider similar changes to their systems. These are forceful experiments testing the role of electoral institutions in modern partisan representation. Moreover, California has also recently tested another radical reform—independent redistricting—with a similar intended effect. If institutional reform is a potential lever in the American democratic system, these reforms amount to grabbing the lever and pulling as hard as possible.

We use the policy changes in these two states for analytical leverage to explore the effect of political institutions on legislator behavior. Does the Top Two elect more moderate legislators to public office? Might it be a useful tool for counteracting the trend toward greater partisan polarization? Footnote 1 To what extent can we even say that partisan polarization is the product of institutions?

Our examination suggests that the Top Two has had a modest and somewhat inconsistent effect on representation since it was adopted in these two states. The evidence of post-reform moderation is stronger in California than in Washington (and even then only for Democrats), but this moderation is partly due to a contemporaneous policy change (radically new district lines drawn by an independent redistricting commission) rather than to the Top Two itself. There are also some signs that a change in term limits might have played a role. However, the Top Two might have helped to arrest growing liberalism among California Democrats, even as many other states have elected Democrats who are increasingly liberal. While it is still early in the policy experiment, at this point the Top Two appears to be of mixed success as a tool for mitigating polarization. At the same time, there might be a somewhat stronger case for redistricting reform than previously understood.

Background

The political elite of the United States has been divided into parties since the earliest days of the republic, but by some measures the divide has grown to historically unprecedented levels. Some scholars have suggested the American system is ill-suited to manage such extreme conflict, and risks more lasting damage and dysfunction. Footnote 2 The last decade or so, culminating in the remarkable 2016 election, has only seemed to confirm the worst fears. Conflict has grown even more intense, and norms that have long tempered this conflict have been jettisoned for the sake of immediate partisan advantage.

Moreover, this breakdown of civic discourse has apparently occurred without a foundation in mass preferences. While voters are more polarized on certain issues, a preponderance of evidence suggests elites polarized before voters did and to a much greater extent, and that most voters still reside roughly in the center of the political spectrum. Footnote 3

There are many possible explanations for this disconnect, but among the most popular is that one or more American institutions are either causes or significant enabling factors. Of the possible institutions, perhaps none has been blamed more often than America’s unusual system of popular primaries. Footnote 4 The United States is virtually alone in the world in leaving most decisions about party nominees to rank-and-file party members. Proponents of this explanation point to the dismal turnout rates in primary elections and emphasize that primary voters are far more partisan and ideologically extreme than the ones who vote in general elections. When these voters are favored by primary election rules, they end up choosing like-minded candidates to represent the parties. This leaves more moderate general election voters with a suboptimal choice between two extreme partisans, when they might have preferred a choice between more centrist candidates.

If primaries are an important cause of polarization, the most commonly proposed reform has been to open primaries to participation by voters outside the party faithful. With open primaries, the median of the primary electorate moves closer to the median of the general electorate, making it less likely that the preferences of each party’s base voters will determine the final outcome. But open primaries can come in many types, and it is not clear that we should expect all types to be equally effective at promoting the goal of greater moderation in public office. In fact, most open primaries either place limits on which voters can cross party lines in the primary, force voters to choose one party’s primary and vote only for candidates of that party for every office, or both. It is easy to imagine that these designs would significantly discourage crossover voting, and so mitigate any moderating effect. The Top Two primary does not suffer from these limitations. Voters can choose among all candidates for each office, just as they would in a general election, and the top two vote-getting candidates advance to the fall. This makes crossover voting no more difficult than identifying one’s favorite candidate for each office and voting accordingly. Moreover, the Top Two goes farther than other such “nonpartisan” primary reforms by advancing even two candidates of the same party if they have received the most votes. When these candidates represent different factions of the party, it offers a choice to voters of the minority party in a heavily partisan district who might otherwise balk at crossing party lines. Indeed several candidates have explicitly appealed to minority party voters in same-party contests along these lines. This makes the Top Two a much more aggressive effort at promoting moderation, and one that is perhaps more likely to be successful. The 2012 race for California’s 10th Assembly District offers a classic example of a potential Top Two effect. 
The 10th is heavily Democratic, and in 2012 the leading candidate was Michael Allen, a sitting Democratic member of the legislature from the more liberal union wing of the party. In the primary he faced four more Democrats, a Republican, and an independent. Allen finished first among all Democrats with 31% of the total vote; under a traditional primary he would have faced the Republican (21%) and the independent (4%) in the fall. But one of the Democrats, Marc Levine, just eked out second place with 24%, and so earned a place against Allen in the fall. Levine was generally considered more moderate than Allen, and had in fact been discouraged by the party leadership from running. Footnote 5 In the fall, Levine turned the tables, narrowly beating Allen in a two-way contest by 51 to 49%. Thus Levine claimed the seat in a contest that, but for the Top Two, would not have happened. California also offers more ambiguous examples of Top Two effects. In the same election year as the Allen-Levine race, Richard Roth, a moderate Democrat, ran for and won California Senate District 31. In the primary he defeated a fellow Democrat who was endorsed by the party, and then went on to defeat a Republican in the fall. It is possible that this is an example of a Top Two effect—the outsider moderate defeats the party’s favored candidate in the primary. It might be that independents and Republicans voted for him and pushed him over the top. But moderates can beat the party’s choice in a more traditional system, as well. Thus, while some outcomes enabled by the Top Two are clear, others are harder to identify. Yet both, where they occur, might advance more moderate candidates than prior to the reform. Until recently, the limited use of the Top Two has made it difficult to evaluate as a potential reform. Prior to 2008, only Louisiana and Nebraska used a version of the system for legislative or congressional elections. 
Both states’ systems predate the availability of broad, national measures of state legislator ideology that would provide the number of cases and temporal variation necessary for more robust causal estimates. Furthermore, neither system is precisely what reformers have in mind when they discuss the Top Two primary today. Louisiana does not hold a follow-up election if one candidate receives more than 50 percent of the vote, which means for legal reasons the state must hold its primary election on the same day that all other states are holding their November general election. Nebraska, for its part, does not include party labels on the legislative ballot, thus removing a key partisan signal and putting the Nebraska system even farther into the nonpartisan category. Neither is an approach that most other states would likely be willing to take. But in recent years both California and Washington have adopted the Top Two as envisioned by reformers—California in 2012 and Washington in 2008—giving social scientists two and four elections, respectively, to observe outcomes under the new system. Each version of the Top Two always holds a runoff election in the fall and includes party labels on the ballot. Footnote 6 For the sake of understanding their possible effect on ideology, these reforms could not have come at a better time. Researchers have made tremendous advances in the measurement of ideology and the broad availability of such measures in recent years. Combined with the policy change in both states, it offers the promise of identifying the effects of the Top Two more robustly than would have been possible before. Both states pose difficulties and opportunities for testing the Top Two’s effects. Washington used a very similar system, the blanket primary, for almost 70 years before it was struck down by the US Supreme Court. 
After this change, the state used a relatively open system that allowed all voters to choose a party primary each election, albeit with the constraint that they then vote only for candidates of that party. The state used this system for just two election years (2004 and 2006) before adopting the Top Two. In addition, the Washington legislature has no term limits of any kind, which means turnover each election year is low. If we imagine that incumbents would generally find it easier to resist the influence of nomination system change, then the absence of term limits ought to dampen the magnitude of any possible effects. Footnote 7

The same could not be said for California. The state abandoned its blanket primary after 2000, and five election cycles intervened before the state switched to the Top Two. In the interim, it used a relatively restrictive form of Washington’s system, one that only allowed independents to choose a party primary. And with some of the tightest term limits in the nation, California has had ample turnover during this period. Indeed, over two-thirds of the state’s legislature has been newly elected since the Top Two went into effect. The limitations of Washington as a case study therefore do not apply to California. Footnote 8

California does have analytical limitations of its own, however. The state has been aggressively experimenting with a range of reforms to its existing system, many coming into use at the same time or within a few years of each other. Coincident with the Top Two primary in 2012, the state also began using new congressional and state legislative districts that were drawn by an independent redistricting commission instead of the legislature. The legislators elected that year and every year after have also enjoyed longer term limits from a separate initiative. These new limits arguably bolster incumbency and help each member develop an independent support coalition.
And just prior to the Top Two in 2011, the threshold for passing a state budget dropped from two-thirds to a simple majority. In the midst of all these changes, it is important to think carefully about identifying the specific effect of the Top Two, and to distinguish it from the other causes that might reasonably receive credit for any changes. Despite these caveats, these cases are as clear a test of primary effects as one is likely to find. There is temporal variation for stronger causal leverage, the policy change takes a particularly aggressive form, and high-quality data are available both before and after the change. It is an excellent opportunity to test one of the most compelling institutional claims about one of the most important political developments of recent decades. If institutions matter to polarization, they are likely to matter here.

Existing Research

Despite the compelling logic of a link between open primaries and moderation, the findings from the research literature have been mixed. There is some formal modeling that suggests the distinct politics of primary electorates is likely to have some effect on representation. Footnote 9 But more complex models that allow for races with more than two candidates (a type common in primary elections) produce inconsistent expectations of the winner’s ideology and are sensitive to the number and characteristics of the candidates who decide to run. Footnote 10 There is also some doubt about how biased and influential the primary electorate is when compared with the general electorate. Crossover voting in open primaries may be fairly limited, and the conditions necessary for it to be decisive (sufficient crossover voting in a race that is otherwise close) may be rare. Footnote 11 It is not even clear that primary electorates are able to discriminate between extremists and moderates in the primary stage, Footnote 12 though signs of such discrimination have been found in general elections. Footnote 13

Ultimately, however, the prospect that an open primary system will produce more moderate elected officials on average does not logically depend on the prevalence of crossover voting, the ability of voters to discern moderates from extremists, or even a general voter preference for moderate candidates. It depends only on the willingness of moderate candidates to run for office and their ability to win votes once they do so. If moderate candidates perceive greater opportunities under an open primary system and are more likely to launch a candidacy in any given contest, then at least some of them are likely to be successful, and hence we should observe more moderation under this system. Moreover, candidates will exploit a system to its fullest in pursuit of public office.
Some experimental research has measured this strategic behavior, showing that candidates in Top Two systems reach out more aggressively to potential crossover support. Footnote 14 In the same vein, even if the electorate is entirely innocent of ideological distinctions between the candidates, moderate candidates with enough money and organizational support might draw votes through positive visibility alone, absent any ideological cues. Perhaps it is not surprising, then, that the evidence for an effect of primaries on moderation in office is more mixed than the research focused on the electorate would suggest. The broadest studies to date of the effect of primaries on representation have been consistent with the null effects from election research: neither the competitiveness of the primary election, Footnote 15 the extremeness of the primary electorate, Footnote 16 nor most critically, the type of primary system, Footnote 17 seems to have much effect on the ideology of those who are ultimately elected. Yet more localized effects have been found in certain cases, especially California. Virtually every study that has looked at the effect of the state’s “blanket” primary in 1998 and 2000 has found a small but notable increase in moderation during that time. Footnote 18 There is also some evidence that the state’s “crossfiling” system from the first half of the twentieth century had a similar moderating effect. Footnote 19 This raises the prospect that there is something unusual about California that makes it especially fertile ground for primary system reforms to produce the desired changes. Footnote 20 Consistent with this idea, evidence from California’s most recent experiment with the Top Two primary has suggested some moderating effect, especially among Democrats in the legislature. Footnote 21 Anecdotally, there is a widespread sense that the legislature’s Democratic caucus is now more business-friendly and that the Top Two primary is part of the explanation. 
Footnote 22 Combined with the logic for a stronger Top Two effect mentioned above, it offers reason to believe that the reform has had the desired effect in this particular case. At the same time, the matter is far from settled, even for California’s most recent experiment. At least one study finds no consistent effect of the Top Two on the moderation of California’s congressional candidates’ platforms relative to the districts they represent. Footnote 23 In sum, the research findings are somewhat in conflict. Given the intense interest in the more recent applications of the Top Two, especially in California, it is important to resolve this discrepancy if possible. In what follows, we take a more careful look at the effect of the Top Two in both California and Washington. Has the reform had the desired effect of increasing moderation, independent of other changes over time, in either of these cases?

Data

The goal of our analysis is to leverage the policy change in Washington and California, as well as the experience of other states not subject to the same policy intervention, to help isolate the causal effect of the Top Two reform. To ensure that we do not confuse the effect of the Top Two with the effect of the similar blanket primary, we begin our study period in 2004, the first year in which no state used the blanket primary for its nominations.

To compare ideology across states and time while also accounting for other reforms, we rely on two different measures of ideology. Our primary measure is the ideal points developed by Boris Shor and Nolan McCarty. Footnote 24 Their method first derives legislature-specific ideal points from roll calls cast by all incumbents in each legislature. It then derives common space ideal points using the National Political Awareness Test (NPAT), a common issue-preference survey sent to all candidates for state or national office across the country. Using politicians who both served in office and responded to the survey, the method then projects the roll-call-based ideal points of all legislators into a common space created by responses to the NPAT. This provides a single measure of ideology that is comparable over the entire period we study here, as well as across all the institutions included in the analysis.

As noted in previous applications of this metric, Shor/McCarty ideal points are not dynamic: they provide a single ideal point for each politician for the entire study period. Thus, we cannot explore conversion or adaptation effects, where a policy change alters the behavior of sitting elected officials. Instead, we must focus on selection effects, where the new primary system encourages a different sort of candidate to run and helps these candidates win office. While this constrains the analysis to a certain extent, the impact is likely to be minimal.
First, incumbent officeholders rarely change their voting behavior, Footnote 25 so using only newly-elected officeholders will, if anything, bias our results toward finding a significant effect for the Top Two. Second, one might be concerned that a lack of turnover will limit our analytical power. But the Washington state legislature has been operating under the Top Two for four election cycles, offering enough cases for the analysis. California, for its part, has seen extraordinary levels of turnover in the two election cycles since the reform was implemented, with over two-thirds newly elected under the system. We also use adjusted Chamber of Commerce scores to overcome some of these limitations. Footnote 26 These dynamic scores provide the temporal comparability within each state of the Shor/McCarty ideal points, and give us leverage on the question of term limits effects by allowing us to say something about continuing legislators. They also offer a specific policy domain to test for effects, instead of the broad range of issues in the Shor/McCarty ideal points. These advantages come at a significant cost in causal leverage, because we cannot compare states to each other on a single ideological dimension. There are also well known problems with using interest group scoring as a measure of ideology. Footnote 27 Nonetheless, used as a complement to the common space NPAT ideal points, the adjusted Chamber of Commerce scores will help us piece together an account of these reforms through a constellation of evidence. Footnote 28 Because we use ideal points derived from broad questions about multiple policy areas and large numbers of roll call votes, our goal is to identify changes in general ideological dispositions. These broad tendencies do not necessarily predict how legislators will vote on specific bills. Even the Chamber of Commerce scores, though more narrow in focus, concern broad dispositions and not concrete decisions about specific policy issues or bills.
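
The projection of roll-call-based ideal points into a common space, as described above, can be illustrated with a deliberately simplified sketch. It assumes a one-dimensional linear map fit by ordinary least squares on "bridge" legislators (those with both a roll-call score and an NPAT-based score). This is an assumption made for illustration, not a reproduction of the Shor/McCarty estimator, and every number and function name below is hypothetical.

```python
from statistics import mean


def fit_bridge(rollcall, npat):
    """Fit npat ~ a + b * rollcall by least squares on bridge legislators.

    Illustrative assumption: a single linear map per legislature.
    """
    mx, my = mean(rollcall), mean(npat)
    sxx = sum((x - mx) ** 2 for x in rollcall)
    sxy = sum((x - mx) * (y - my) for x, y in zip(rollcall, npat))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def project(scores, a, b):
    """Map legislature-specific scores into the common space."""
    return [a + b * s for s in scores]


# Hypothetical bridge legislators: scored on both scales.
bridge_rollcall = [-1.0, 0.0, 1.0]
bridge_npat = [-0.5, 0.5, 1.5]

a, b = fit_bridge(bridge_rollcall, bridge_npat)

# Hypothetical legislators with roll-call scores only.
common_space = project([-2.0, 0.5, 2.0], a, b)
print(a, b, common_space)
```

Because each legislator receives one projected score for the whole period, a sketch like this also makes the static nature of the measure concrete: nothing in the map varies by year.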

California and Washington in Isolation

Figure 1 plots the average Shor/McCarty ideal point of the newly-elected legislators in each state over time. California and Washington, the two states that adopted the Top Two during this study period, are highlighted, and the last election each conducted before using the Top Two is identified. By convention, the ideal points are coded so that more positive values are more conservative. Thus, if the Top Two produces more moderate representation, the average Republican ideal point should be lower (more liberal) after the change, while the average Democratic ideal point should be higher (more conservative). There are some signs of these effects in California, where Republicans reached their peak conservatism just prior to the implementation of the Top Two, and where the first Democrats elected under the Top Two were noticeably more conservative. At the same time, there is little sign of any change in Washington: the average ideal point after the implementation of the Top Two is about the same for Republicans, and actually trends in a more liberal direction for Democrats.

These time trends are only the start of the analysis for each state. We must take seriously the possibility that some other change coincident with the Top Two either hampers or accounts for the Top Two’s effect. We should pay special attention to two alternative explanations in California: a new independent redistricting commission whose radically redrawn congressional and legislative lines were first used in 2012; and longer term limits for state legislators that applied only to those who were newly elected in 2012 or later.

Among these reforms, redistricting operates through a different mechanism than the others. Both the Top Two and the extension of term limits purport to alter the relationship between a district’s partisanship and the people elected to represent it.
For instance, the Top Two aims to shift the median voter in the primary election and raises the prospect of intra-party competition in the general election, potentially creating different incentives for districts of all partisan complexions. Likewise, longer term limits might make all new legislators more moderate by giving them the time horizon necessary to build a supporting coalition that is independent from the party. Redistricting reform, by contrast, seeks to change the distribution of the voters across districts, making for a larger number of competitive districts in the process. To produce more moderation, candidates elected under the new districts do not need to behave differently from those elected to represent districts of similar partisanship in the past. This offers some analytical leverage for identifying the independent effect of redistricting. Conditional on district partisanship, a pure redistricting effect produces no ideological change: it is felt only through changes in the distribution of district partisanship. Thus, to the extent that a moderating effect remains independent of changes in district composition, we can be more confident that it is a Top Two effect (though it might also be a term limits effect, a point we return to later). Of the two states to have adopted the Top Two, redistricting produced a much larger compositional change in California than in Washington. The California districts that elected new representatives under the Top Two were notably more competitive than the ones before, especially for Democrats and especially in 2012. Before the Top Two, California’s Democrats were elected from districts that were on average 9.1% more Democratic than the statewide Democratic presidential vote share. Footnote 29 Under the Top Two, they have been elected from less Democratic districts that average 7.6% above the statewide mean. 
Effectively all of this change came in 2012, when the average new Democrat was elected from a district that was only 6.4% more Democratic than the rest of the state; by contrast, the 2014 class of new Democrats averaged 9.5% above the rest of the state, very much in line with the pre-reform status quo. Furthermore, there are no signs of any similar change for California Republicans or new legislators from either party in Washington that could explain a moderating effect. The districts electing Republicans in California averaged 14.4% below the statewide Democratic presidential vote before the Top Two, and 15.3% below after. Likewise, in Washington, Democrats were elected from districts 7.0% above the statewide average before reform and 8.5% after, while Republicans were elected from districts 9.4% below the statewide average before and 9.2% after.

Figure 2 graphs legislator ideology against the district Democratic presidential vote, measured as the deviation from the statewide average. Two conclusions are visible from these charts. First, there is considerable overlap between the pre- and post-reform ideology in both states. The patterns are similar enough that it is not immediately clear that there has been any change at all. Second, the relationship between representative and constituency is notably non-linear, especially for Democrats, and the precise shape of the relationship differs somewhat by state. This suggests we should be careful to employ a flexible modeling strategy. Toward that end, we employ matching to identify increased moderation while avoiding any particular functional form. Matching identifies similar cases on a set of covariates and calculates the average difference between these cases across the entire matched set. Footnote 30 For this analysis, we match each post-reform legislator to the pre-reform legislator with the most similar district partisanship using the Matching package for R.
Footnote 31 The smaller the matched difference compared to the simple pre-matching difference of means, the more we can say that the redistricting accounts for any observed effect. Table 1 presents the results for each party in each of the two states. Prior to matching, there is no statistically certain difference between those elected after reform and those elected before in three of the four cases. In fact, the only clear difference is for Washington Democrats, who appear notably more liberal after the reform. Matching on presidential vote eliminates a substantial share of this difference for Washington Democrats, and also weakens the difference (such as it is) for California Democrats. Footnote 32 It is possible that the results in table 1 are a function of our outcome measure. Because the Shor/McCarty ideal points are not dynamic, they are fundamentally dependent on comparing new legislators in one year to new legislators in previous years. The effects of the reform might be more strongly felt among continuing legislators than among newly-elected ones, if perhaps longer-serving incumbents are cleverer about adapting campaign platforms to institutional incentives. Moreover, the Shor/McCarty data represent a broad cross-section of roll call votes that is projected into the ideal point space as defined by the National Political Awareness Test. It may be that the reforms have had a much more important effect on some subset of the issue space, rather than on the broad range of issues captured by the Shor/McCarty measure. To address both concerns simultaneously, we repeated the above analysis using adjusted Chamber of Commerce scores. The Chamber of Commerce (called the Association of Washington Business in Washington) is an interest group that lobbies to support a low-tax, business-friendly regulatory environment, and the Chamber scores all legislators each legislative session, offering the prospect of dynamic scores on a focused (and very important) policy area. 
Footnote 33 Because the policy agenda can change over time, and because the Chamber of Commerce itself can alter the list of bills it chooses to score for strategic reasons, we adjust the scores according to the method described by Groseclose et al. Footnote 34 These adjusted Chamber of Commerce scores, presented in table 2, do suggest more change in California. California Democrats now show a notable change in the direction that would be expected, registering as 5.5 points more conservative. However, California Republicans are 1.8 points more conservative, an effect contrary to the one that would be expected. Washington Republicans are 1.4 points more liberal, and Washington Democrats are 0.8 points more liberal. As before, controlling for district partisanship through matching weakens these effects in all cases, and actually flips the sign for Washington Democrats. It also leaves a robust difference of 4.0 points for California Democrats, suggesting we can more confidently speak of greater moderation in that caucus by this measure. Because the Chamber of Commerce scores include both newly-elected and continuing legislators, they allow us to explore the effect of term limits reform on the changes in California. Continuing legislators are still covered under the old limits, while newly-elected members have run under the new, more relaxed limits. Table 3 separates the numbers from table 2 into these two groups. In this context, “newly-elected” means completely new to the legislature. Thus, anyone with a history in the legislature who was elected to a new district (for instance, making the transition from the Assembly to the Senate) would be included in the analysis of table 1 as a newly-elected member, but would be considered “continuing” here. There is little sign of any change for Republicans in either group, whether controlling for redistricting or not, while Democrats in both groups are more moderate. 
Some of this change is due to redistricting, especially for newly-elected Democrats. This group also exhibits the largest change in moderation: in fact, most of the increase in support for the Chamber among Democrats appears to fall in this group alone, which is 7.3 points more supportive even after adjusting for district composition.

These results complicate the story somewhat. There is clearly some increase in moderation among Democrats that stands independent of both redistricting and term limits reform. It is not difficult to imagine that this residual is a conversion effect caused by the Top Two. But the much larger change for newly-elected legislators means we cannot dismiss a substantial term limits effect. Prior research that found a moderating effect in California under the blanket primary (at a time when newly-elected members would be under more rather than less restrictive term limits) suggests that at least some of the effect is still attributable to the primary regime. In fact, it is certainly possible that the larger difference for newly-elected members in table 3 simply captures the selection effect of the Top Two, in contrast to the conversion effect visible with continuing members. More analysis will be required to disentangle these effects.

Members of the US House can serve as a final way to test a potential term limits effect. Congressional candidates are subject to the Top Two but not term limits. Any change in ideology among members of the House delegation, conditional on district partisanship, would be a clearer sign of a Top Two and not a term limits effect. The challenge with the House delegations is that they contain too few new members to support an analysis based on Shor/McCarty scores. We therefore turn to DW-NOMINATE scores, which are dynamic and so allow us to use all members both pre and post. Footnote 35 Table 4 contains these results. They suggest little in the way of moderation either with or without matching.
California Democrats show the biggest change, but it is still too small to merit substantive concern. This further points toward term limits as the cause of the differences in table 3.

In sum, a more careful examination of the individual cases of California and Washington finds inconsistent evidence of an effect for the Top Two primary. The main exception is California Democrats: there is some evidence that they moderated, but redistricting explains part of this change and new term limits may explain still more. Since our estimate of a Top Two effect amounts to a residual difference after other explanations are controlled, some other factor that we have not measured directly may well be at work. Even so, the Top Two remains a plausible explanation for some of the residual differences we have found.
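The logic of comparing a simple difference of means with a matched difference can be sketched in a few lines. The example below is only an illustration, not the procedure used for tables 1 through 4: it assumes one-to-one nearest-neighbor matching with replacement on district presidential vote, a hypothetical caliper of 0.05, and invented data. The function names and all values are assumptions introduced for the example.

```python
from statistics import mean

def simple_difference(pre, post):
    """Unmatched difference of means: post-reform minus pre-reform scores."""
    return mean(score for _, score in post) - mean(score for _, score in pre)

def matched_difference(pre, post, caliper=0.05):
    """Pair each post-reform legislator with the pre-reform legislator whose
    district presidential vote is closest (matching with replacement), then
    average the within-pair score differences. Pairs whose presidential vote
    differs by more than `caliper` are dropped. The caliper is an assumption."""
    diffs = []
    for vote_post, score_post in post:
        vote_pre, score_pre = min(pre, key=lambda p: abs(p[0] - vote_post))
        if abs(vote_pre - vote_post) <= caliper:
            diffs.append(score_post - score_pre)
    if not diffs:
        raise ValueError("no matches within caliper")
    return mean(diffs)

# Hypothetical legislators: (Democratic presidential vote share, ideal point).
# Higher ideal points are more conservative.
pre = [(0.55, -1.0), (0.65, -1.4), (0.75, -1.8)]
post = [(0.54, -0.9), (0.56, -1.0), (0.66, -1.4)]

raw = simple_difference(pre, post)       # mixes district change and behavior
matched = matched_difference(pre, post)  # holds district partisanship fixed

print(f"simple difference:  {raw:.3f}")
print(f"matched difference: {matched:.3f}")
```

In this invented case the post-reform group comes from more competitive districts, so most of the simple difference disappears once legislators are compared within similar districts. That is the pattern the text reads as evidence that redistricting, rather than a change in behavior, accounts for the observed shift.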

Difference-in-Differences Design

The evidence to this point suggests that the effect of the Top Two primary on moderation is inconsistent, but that there may be some effect in California. But are these observed changes unique to California, or are they common to other states that have not adopted the same political reforms? In other words, have Democrats in other states also moderated? To address this issue, we place the policy change in California and Washington in context with a classic difference-in-differences (DID) design. Footnote 36

A DID design identifies a policy effect by comparing the post-treatment change in the state of interest to similar changes in states that did not adopt the treatment. In moving to the DID design, we do not want to abandon the flexibility of the matching approach. Instead, we combine the two methods by first matching pre- and post-treatment legislators separately for each state. The treatment period is always defined as the period during which the treatment state used the Top Two: 2008 and later for Washington, and 2012 and later for California. Footnote 37 We then calculate the difference between the difference in the treated state and the average difference for all other states. Footnote 38 For this exercise we use the Shor/McCarty ideal points, since the adjusted Chamber of Commerce scores do not offer a common space for cross-state comparisons.

When placed in the context of other states, California Democrats suddenly stand out on this measure in a way they did not even for the Chamber of Commerce scores (table 5). Prior to matching, the DID estimate is much larger than the simple difference of means from table 1 (0.22 versus 0.09). Moreover, matching shrinks the DID estimate only to 0.14, still far larger than the comparable estimate of 0.03 from table 1. In short, while Democrats in California have grown slightly more conservative, Democrats in other states have grown even more liberal in this time period.
This relative effect makes the Democratic moderation in California more notable. However, there is still no clear sign of a moderating effect for the other groups. The matched DID effect for Washington Republicans is of modest size and in the expected direction, but far too noisy to inspire much confidence. The other effects are small and statistically insignificant.

Even the results for the Democrats are not quite as secure as they might at first appear. Because the analyses in table 5 present multiple opportunities for success, there is a greater chance of finding a statistically significant result by chance alone. We tested this with a multiple comparison correction to the p-values. Footnote 39 This adjustment placed the adjusted p-value for California Democrats marginally over the 0.05 threshold, at 0.076.

We can again use the US House delegations as a reference point to gain leverage on the contribution of term limits to these results. We ran the same difference-in-differences matching design with DW-NOMINATE scores, and the results were far weaker (refer to table A2 in the appendix). Democrats became slightly more moderate before matching, but that difference largely disappeared after matching, and no other group demonstrated similar effects. This again points toward term limits as a possible explanation for the effect in table 5, since the Top Two change applies to the members of Congress but the term limits change does not.
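The two calculations in this section, the DID estimate and the multiple comparison correction, can be illustrated with a short sketch. It rests on stated assumptions: the text does not name the correction that was applied, so the Holm step-down procedure is used here purely as one common choice, and every number below is hypothetical rather than taken from table 5.

```python
from statistics import mean

def did_estimate(treated_diff, control_diffs):
    """Pre/post difference in the treated state minus the average pre/post
    difference across the comparison states."""
    return treated_diff - mean(control_diffs)

def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order.
    Assumption: the source does not say which correction it used."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Adjusted values may not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical matched pre/post differences in ideal points.
treated = 0.05                    # treated state: slightly more conservative
controls = [-0.10, -0.05, -0.15]  # comparison states: drifting liberal
did = did_estimate(treated, controls)

# Hypothetical raw p-values for four party-by-state estimates.
raw_p = [0.02, 0.40, 0.30, 0.75]
adj_p = holm_adjust(raw_p)

print(f"DID estimate: {did:.2f}")
print("adjusted p-values:", [round(p, 3) for p in adj_p])
```

With these invented inputs, a small positive change in the treated state becomes a larger relative effect because the comparison states moved the other way, and a raw p-value below 0.05 rises above that threshold once four comparisons are taken into account. Both patterns mirror the reasoning in the text, though not its actual figures.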

Discussion

The growing ideological gap between the two parties has transformed American politics over the last several decades and at times has prompted concerns for the basic viability of the political system. Those who seek institutional levers to bring the parties back together have promoted primary reform perhaps more than any other change. In this context, the recent experiments with primary reform in Washington and especially California have attracted international attention. Yet there has been relatively little quantitative evidence of the effects. This analysis has been the first attempt to provide such evidence. We examined each of these states both in isolation and in broader national context. The results of these analyses suggest virtually no effect of the Top Two in Washington or for Republicans in California. The same analyses do, however, suggest some effect among Democrats in California, though a portion of this effect appears to stem from the redistricting that occurred coincident with the Top Two. Our analysis also considers possible effects from other sources. Relaxed term limits went into effect at the same time as both the Top Two and the redistricting, but this change does not appear to account for all of the change in Democrats. That said, the residual pre/post change after accounting for the other potential causes leaves only a small shift to explain. Any effects we do find are limited to Democrats in California alone.

It is worth noting the limits of our analysis. We feel relatively more confident about the role of redistricting, since we have measured the source of those effects more directly. We can also be reasonably confident about the role of term limits, since we have comparison groups for whom the term limits change did not apply: continuing legislators and members of Congress. These groups show far smaller pre/post effects, suggesting that term limits may explain still more of the difference.
By contrast, at this point our results do not conclusively demonstrate that the Top Two primary is the cause of the residual effect we have found. It certainly gives pause that there has been no similar effect among either California Republicans or Washington legislators of either party, and that some measures show larger effects than others even for California Democrats. Nonetheless, the evidence here is consistent with a small effect from the Top Two for California Democrats. And, as mentioned at the outset, there are reasons to think that the Top Two’s effect in Washington would be limited, since the state’s experience with a more partisan system was transitory and turnover in its legislature is relatively low. There are also reasons to add a note of caution about the California results, since the state has only experienced three election cycles under the Top Two thus far. Different patterns of behavior might emerge as candidates and voters come to learn the system better over time.

It is important to reiterate that our goal has been to identify broad ideological tendencies by observing behavior across a wide range of bills and issue areas. We do not identify whether the probability of passing any given bill has changed, because it is very rare for exactly the same bill to come to consideration across multiple election cycles. To the extent that some bills are more important than others, there might still be more moderation in a way that is difficult to conclusively detect. Legislators might be more inclined to take moderate positions on significant bills and vote with the extremes on minor bills as a way of satisfying those interests. Of course, the opposite could also be true: legislators might seek to appear moderate by voting across party lines on minor bills but then standing with their own party on the most politically important issues. This is a more difficult issue to resolve and one we do not address here.
On the other side of the ledger, it is also reasonable to express caution about the long-term impact of the redistricting effects we have uncovered in California. The redistricting effect we have identified is largest for newly-elected politicians. The districts they were elected from were not necessarily representative of the broader universe of districts. In fact, the legislators elected in 2012 came from a set of districts that was unusually competitive relative even to the more competitive set of districts in the new plan. Those elected in 2014, by contrast, came from a much more typical set of districts. The redistricting effect, such as it is, might gradually settle into a new equilibrium that is slightly, but not dramatically, more moderate than the old. Unlike with the unusual Top Two primary system, there is no learning required to represent competitive districts: politicians have plenty of experience with the practice.

The evidence for a redistricting effect but an ambiguous Top Two effect fits well within some strands of the existing research but not others. Much of the extant research supports the idea of limited primary effects. Explanations for this limited effect are speculative at this point, but may include anything from fundamental voter loyalty to parties, to the gatekeeping powers of party activists and donors, to the surprisingly contingent logic of open primaries (dependent as it is on candidate emergence decisions). The evidence presented here does not allow us to favor one of these explanations over another, but it does help us confirm the contingent nature of a primaries effect. Even a radically open system like the Top Two appears to have made at best a modest difference in the behavior of representatives, at least at this point in the policy experiment.
In contrast to these more congruent findings, the evidence for a modest redistricting effect in California might seem inconsistent with some research showing null effects of redistricting, at least on congressional representation. Footnote 40 But this research has never claimed that the correlation between district partisanship and representation is zero, only that the changes in district composition attributable to redistricting have not been significant enough to account for much of the observed growth in polarization over time. The California legislature remains highly polarized even under the reforms, and is still a long way from the weak partisanship it exhibited in the post-World War II period. Footnote 41 Moreover, the unusually competitive set of districts that elected new representatives and legislators in California in 2012 might have helped produce more notable effects in this case.

On balance, then, the findings presented here offer something for both supporters and opponents of political reform. For supporters, we have found evidence, however limited and preliminary, that redistricting reform can have the moderating effects that might be hoped for it. Given growing interest in this style of independent redistricting commission and the explicit sanction the US Supreme Court has recently given to such an approach, some might take this as a green light to explore the possibility elsewhere. We have also found some signs of a Top Two primary effect, though more research is necessary to confirm it when contrasted with alternative explanations such as the change in term limits. For opponents of reform, on the other hand, the Top Two effect is limited to one party in one state, and it is strongest only when considered among newly-elected legislators and in the context of a Democratic party that is moving leftward everywhere else.
These results are preliminary, and do not necessarily speak to the merits of these reforms more generally, since moderation was not the only benefit supporters claimed for them. But the effects are conditional enough to broaden the conversation to these other benefits as possible reasons for supporting reform as well.

At any rate, the Top Two is an especially strong example of the sort of institutional reforms that might draw American parties back toward the center of the ideological spectrum. The evidence for some effect bolsters the idea that institutions are at least partly to blame, but given the magnitude of the policy change the effect is fairly weak. Thus, while institutions and primaries in particular may be part of the story, the lion’s share of polarization likely comes from some other source.