This week in the Republican nomination race, Ted Cruz’s win in Wisconsin triggered buzz about how front-runner Donald Trump might be in trouble. Doubtless today’s win in Colorado will intensify the chatter, and will involve words like “momentum.” It is best to ignore all of that coverage – at least until some national polling data shows a sustained change. Why? Because states differ from one another, mostly in demographics but also in rules and various local factors. It is almost impossible to learn something new from a single race. To know where the race stands as a whole, it is necessary to consider all states at once.

In several ways, Wisconsin was typical. With a pre-election poll median of 36.0 ± 1.5% (median ± estimated SEM), Trump’s vote share of 35% was on the mark, continuing his close match between polls and outcomes. Cruz’s finish was also typical, but for a different reason: he was, and is, outperforming his polls. Cruz’s pre-election polls were 39.0 ± 1.2%, and he ended up with 48% of the vote. In previous states, Cruz has overperformed by a median factor of 1.2. Either Cruz’s supporters are exceptionally committed, or he is the beneficiary of anti-Trump votes liberated from their previous first choices, or undecided voters break hard for him, or some combination of the three. In Wisconsin he may also have benefited from the fact that trailing candidates like Kasich often underperform their polls when it is time to vote.

Where is the national race now? The current 6-national-poll median (March 29-April 6) is Trump 39.5 ± 1.2%, Cruz 31.0 ± 2.1%, Kasich 19.0 ± 1.1%. If we were to apply a 1.2-fold bonus to Cruz’s numbers to allow for his overperformance, the corrected numbers are Trump 39.5%, Cruz 37.2% – extremely close. Either way, Cruz has risen quite a bit in the last month, and national opinion is now closely divided.
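The arithmetic behind that correction is short enough to spell out. The following is a minimal Python sketch, not the scripts used for the snapshot; the function name `apply_bonus` is made up for illustration, and the only inputs are the poll medians and the 1.2-fold factor quoted above.

```python
def apply_bonus(trump, cruz, factor=1.2):
    """Scale Cruz's polled share by an overperformance factor.

    Trump's share is left unchanged, as in the text. Returns Cruz's
    adjusted share and Trump's remaining lead, in percentage points.
    """
    cruz_adj = cruz * factor
    return cruz_adj, trump - cruz_adj

# Wisconsin: 48% actual vs. a 39.0% poll median, close to the 1.2 factor.
wisconsin_ratio = 48.0 / 39.0

# National medians from the paragraph above.
cruz_adj, lead = apply_bonus(39.5, 31.0)
bonus_points = cruz_adj - 31.0  # how many points the bonus adds to Cruz

print(round(wisconsin_ratio, 2), round(cruz_adj, 1), round(lead, 1), round(bonus_points, 1))
```

With these inputs the bonus is worth roughly 6 points to Cruz nationally, which leaves Trump a lead of a little over 2 points.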

I have updated the polls-only snapshot of the remaining Republican primaries through June 7th, when voting ends. As I pointed out months ago in The New Republic and The American Prospect, Republican rules are complex and tilt the playing field toward the front-runner, even if he/she doesn’t get a majority of the popular vote. Therefore it is essential to emulate the state-by-state delegate rules with close attention to quantitative accuracy.

Even after getting the rules right, this is a challenging calculation for three reasons: (a) many states lack polls; (b) Cruz overperforms his polls; and (c) delegates may not follow the rules. Today I describe one way of dealing with all of these issues.

For those who just want the bottom line: Since my last update, the poll-based snapshot has moved – in Trump’s favor. If current polls accurately measure voter behavior, then Donald Trump would get a median of 1,356 delegates – almost 120 more than the 1,237 he needs for a first-ballot victory at the national convention in Cleveland – and his probability of reaching a majority would be 92%. For this probability to drop to 50%, his national lead would have to drop by 8.0 percentage points – this is Trump’s Meta-Margin, a measure I have previously developed for general-election Presidential races. However, if Cruz’s overperformance continues, Trump’s lead would narrow considerably, to a median of 1,280 delegates and a Meta-Margin of 2.0%. After allowing for Cruz’s potential overperformance, the probability of a Trump majority is 70% – probable but uncertain. Under such closely divided conditions, the outcome won’t be known until the last primaries, on June 7th.

And now I will explain at length.

Here is a snapshot of current polls, with no correction for Cruz. It gives a median of 1,356 delegates for Trump, 119 more than the 1,237 necessary to get a majority on the first ballot.

It was done under the following assumptions:

1) State polls. Between now and June 7th, the 16 remaining states have 769 delegates, 31% out of a total of 2,474. Only four of these states – New York, Maryland, Pennsylvania, and California – have polls conducted in the last two weeks. A central problem is therefore how to construct the natural variation in state-to-state support. This can be done using national surveys, based on the fact that the national average contains respondents drawn from across the U.S.

To estimate support across the 12 states that lack polls, I used the fact that Trump’s vote share will fluctuate around his national support by some standard deviation (SD). From 2000 to 2012, the Republican front-runner’s SD has been 10-12%. This year, Trump’s SD has been 10%. I assumed SD=12%.

This SD can also be used to estimate how much Trump’s average vote share in these 12 states will deviate from the national average: the standard error (SE) of Trump’s 12-state average is 12%/√12 ≈ 3.5%, the smallest uncertainty that can be attached to the gap between Trump’s average and the national numbers. Combining this SE with this year’s polling inaccuracy (about 4%), I estimate that Trump’s average is uncertain by about +/-5%. Therefore I varied Trump’s 12-state average by +/-5% around national polls, and assumed that individual states varied around this average by +/-12% (all values are one-sigma).
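As a rough check on these numbers, here is a minimal Python sketch. It rests on two assumptions that the text does not state: that the two error sources are independent and combine in quadrature, and that both levels of variation are Gaussian. The helper `draw_state_shares` is hypothetical and is not taken from the actual scripts.

```python
import math
import random

SD_STATE = 12.0    # assumed state-to-state SD, in percentage points
N_STATES = 12      # remaining states without recent polls
POLL_ERROR = 4.0   # this year's polling inaccuracy, per the text

# Standard error of the 12-state average: SD / sqrt(N), about 3.5.
se_average = SD_STATE / math.sqrt(N_STATES)

# Assumption: independent errors add in quadrature (about 5.3, rounded to 5).
total_uncertainty = math.hypot(se_average, POLL_ERROR)

def draw_state_shares(national, rng, n_states=N_STATES, avg_sd=5.0, state_sd=SD_STATE):
    """One simulated draw of Trump's vote share in each unpolled state.

    First shift the 12-state average around the national number, then
    scatter individual states around that shifted average. Shares are
    clipped to the 0-100 range.
    """
    avg = rng.gauss(national, avg_sd)
    return [min(100.0, max(0.0, rng.gauss(avg, state_sd))) for _ in range(n_states)]

rng = random.Random(0)
shares = draw_state_shares(39.5, rng)
print(round(se_average, 2), round(total_uncertainty, 2), len(shares))
```

Repeating the draw many times and feeding each set of shares through the delegate rules would give a distribution of delegate totals.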

2) Statewide delegate rules. These rules, as well as district-level rules, were implemented using the detailed descriptions available at The Green Papers. Colorado has no election and was omitted from the calculation.

Ten remaining elections assign at least some statewide delegates on a winner-take-all basis. The exact probability distribution of all possible outcomes is easy to simulate using the same method I have used for the Electoral College. NY and CT statewide delegates are winner-take-all if the top finisher gets above 50%, proportional otherwise. The remaining four states (RI, OR, WA, NM) are proportional.
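For readers unfamiliar with that kind of calculation, the sketch below shows one standard way to get an exact distribution over winner-take-all outcomes: multiply out the win/lose possibilities state by state, which amounts to a convolution. It assumes the states are independent. The function name and the three example states are invented for illustration and are not real delegate counts or probabilities; whether the actual scripts work exactly this way is an assumption.

```python
def delegate_distribution(states):
    """Exact distribution of total winner-take-all delegates.

    states: list of (delegates, win_probability) pairs (hypothetical inputs).
    Returns a list p where p[k] is the probability of winning exactly
    k delegates, assuming independent states.
    """
    dist = [1.0]  # before any state is counted: 0 delegates with certainty
    for delegates, p_win in states:
        new = [0.0] * (len(dist) + delegates)
        for k, prob in enumerate(dist):
            new[k] += prob * (1.0 - p_win)      # candidate loses this state
            new[k + delegates] += prob * p_win  # candidate wins this state
        dist = new
    return dist

# Made-up example: three states with 10, 20, and 5 delegates.
example = [(10, 0.9), (20, 0.5), (5, 0.2)]
dist = delegate_distribution(example)

# Probability of winning at least 30 of the 35 delegates in this toy case.
p_at_least_30 = sum(dist[30:])
print(len(dist), round(sum(dist), 6), round(p_at_least_30, 3))
```

Because every combination of wins and losses is enumerated implicitly, no random sampling is needed for this part of the calculation.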

3) Congressional district-level rules. In nine states, 3 delegates per district are assigned locally. In the past, a candidate’s district vote share typically has varied around the statewide average with an SD of 3-5%. I simulated this with a t-distribution to allow for outliers. Under winner-take-all rules, which apply in most states, the share of district delegates won as a function of statewide support is well approximated by an S-shaped curve. The curve is very steep – think of South Carolina, where Donald Trump won all seven Congressional districts. In the code, I also dealt with additional subtleties in the rules for New York, Rhode Island, and Washington that go beyond winner-take-all.
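To see where the S-shape comes from, here is a minimal Python sketch under simplified assumptions. District-to-district variation is applied to the candidate's lead over the nearest rival, with a scale of 4 points and 3 degrees of freedom; both values are illustrative choices, not the parameters in the actual scripts. The t-distributed draw is built from standard-library pieces.

```python
import random

def t_variate(rng, df):
    """Student-t draw: a standard normal divided by sqrt(chi-square / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / (chi2 / df) ** 0.5

def district_win_fraction(state_margin, rng, scale=4.0, df=3, n_draws=20000):
    """Estimated fraction of winner-take-all districts carried.

    state_margin: statewide lead over the nearest rival, in points.
    Each simulated district's lead is the statewide lead plus
    heavy-tailed noise; the district is won if that lead is positive.
    """
    wins = sum(1 for _ in range(n_draws)
               if state_margin + scale * t_variate(rng, df) > 0)
    return wins / n_draws

rng = random.Random(1)
curve = {m: district_win_fraction(m, rng) for m in (-10, -5, 0, 5, 10)}
for margin, frac in curve.items():
    print(margin, round(frac, 2))
```

In this toy version, a statewide lead of 10 points is already enough to carry the large majority of districts, which is why a modest statewide edge can translate into a near sweep of district delegates.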

(For an extended discussion of how 2) and 3) above play out in assigning one state’s delegates, see my discussion of California here.)

The biggest uncertainty in the calculation comes in Pennsylvania, where many delegates declare a preference, but strictly speaking are unbound. I assumed that Pennsylvania delegates will vote according to their district’s voters. This gives Trump 54 delegates on average. In real life, the true allegiance of these delegates is somewhat uncertain.

>>>

The calculation above has two important features. First, in the histogram above, 92% of the probability is at 1,237 delegates or greater. Second, the probability is reduced to 50% if all margins are reduced by 8.0% across the board. This is very similar to the Meta-Margin that I have defined for general election races. For example, if the second-place finisher (Cruz) is underestimated by 8%, that would even up the race. Alternatively, if 4% of GOP voters switch from Trump to Cruz, that would reduce margins by 8%. Either way, a Meta-Margin of 8.0% means that effectively, Trump is 8% ahead in polls.
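The Meta-Margin can be found numerically by searching for the uniform shift at which the majority probability crosses 50%. The sketch below uses bisection. The function `toy_probability` is a purely hypothetical stand-in for the full delegate simulation, with an invented lead and spread, so the number it produces has nothing to do with the 8.0% above.

```python
import math

def meta_margin(prob_majority, lo=0.0, hi=30.0, tol=1e-6):
    """Find the uniform margin shift (in points) where prob_majority(shift) = 0.5.

    prob_majority must decrease as the shift grows, with
    prob_majority(lo) > 0.5 > prob_majority(hi).
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_majority(mid) > 0.5:
            lo = mid  # still favored: a larger shift is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

def toy_probability(shift, lead=3.0, sigma=2.0):
    """Hypothetical stand-in for the simulation: a normal CDF in (lead - shift)."""
    return 0.5 * (1.0 + math.erf((lead - shift) / (sigma * math.sqrt(2.0))))

mm = meta_margin(toy_probability)
print(round(mm, 3))
```

In practice `prob_majority` would rerun the whole delegate calculation with every margin reduced by `shift`. The shift is measured in margin, so moving 4% of voters from the leader to the runner-up counts as an 8-point shift.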

I do not think Trump’s probability of getting a majority is actually 92%. The biggest reason is that Cruz overperforms his polls. If we reduce Trump’s margins by 6 percentage points based on Cruz’s national numbers, the median outcome is 1,280 Trump delegates – only 43 delegates to spare. Under this assumption, Trump’s probability of a delegate majority is 70%, about 2-1 in Trump’s favor.

Here is what the histogram looks like with a correction for Ted Cruz’s overperformance:

Finally, a note on non-pledged delegates. I have left out the approximately 120 delegates who are either uncommitted or not bindable (see cells B11 and B13 of Taniel’s spreadsheet), and therefore potentially recruitable for the first ballot. With these, Trump’s possible median could be anywhere between 1,226 (no Pennsylvania district-level delegates, no non-pledged delegates) and 1,400 delegates (all of both groups of delegates).

Finally…the scripts, somewhat ugly for now, can be found here, here, and here. I’ll document them better in a little bit. Guardedly, I welcome corrections and comments.

Correction: For NY and CT, the script calculates whether Trump gets over 50%. The Cruz bonus was erroneously applied, but is now removed. This adds 8 delegates to Trump’s total.