In oral arguments before the Supreme Court last week, Chief Justice John G. Roberts Jr. introduced a statistical claim that he took to imply that an important provision of the Voting Rights Act has become outmoded.

Section 5 of the Voting Rights Act, which is being challenged by Shelby County, Ala., in the case before the court, requires that certain states, counties and townships with a history of racial discrimination get approval (or “pre-clearance”) from the Department of Justice before making changes to their voting laws. But Chief Justice Roberts said that Mississippi, which is covered by Section 5, has the best ratio of African-American to white turnout, while Massachusetts, which is not covered, has the worst.

Chief Justice Roberts’s statistics appear to come from data compiled in 2004 by the Census Bureau, which polls Americans about their voting behavior as part of its Current Population Survey. In 2004, according to the Census Bureau’s survey, the turnout rate among white voting-aged citizens was 60.2 percent in Mississippi, while the turnout rate among African-Americans was higher, 66.8 percent. In Massachusetts, conversely, the Census Bureau reported the white turnout rate at 72.0 percent but the black turnout rate at just 46.5 percent.

As much as it pleases me to see statistical data introduced in the Supreme Court, the act of citing statistical factoids is not the same thing as drawing sound inferences from them. If I were the lawyer defending the Voting Rights Act, I would have responded with two queries to Chief Justice Roberts. First, are Mississippi and Massachusetts representative of a broader trend: do states covered by Section 5 in fact have higher rates of black turnout on a consistent basis? And second, what if anything does this demonstrate about the efficacy of the Voting Rights Act?

One reason to be suspicious of the representativeness of Mississippi and Massachusetts is the high margin of error associated with these calculations, as noted by Nina Totenberg of NPR.

Like other polls, the Current Population Survey is subject to sampling error, a result of collecting data among a random subsample of the population rather than everyone in the state. In states like Massachusetts that have low African-American populations, the margin of error can be especially high: it was plus-or-minus 9.6 percentage points in estimating the black turnout rate in 2004, according to the Census Bureau. Even in Mississippi, which has a larger black population, the margin of error was 5.2 percentage points.

As a general matter, I would prefer that everyone be more careful when citing statistical data, and be more explicit about describing the potential sources of error and uncertainty associated with the calculations. But the headline associated with Ms. Totenberg’s article at NPR makes a strong claim: it asserts that Chief Justice Roberts has “misconstrued” the data by ignoring the margin of error.

In fact, several things can be said in Chief Justice Roberts’s defense. Ms. Totenberg cites 2010 voting rates in her article, when the difference in black turnout between Mississippi and Massachusetts was within the margin of error. But Chief Justice Roberts appears to be referring to a lower-court brief that cited 2004 data instead, when the difference was larger and outside the margin of error.
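As a rough check on the 2004 comparison, the reported black turnout rates and margins of error for the two states can be combined into a margin of error for the gap between them. This is only a sketch: it assumes that the two state samples are independent and that both published margins are stated at the same confidence level, so that they can be combined in quadrature. The function name `difference_margin` is made up for illustration, and the Census Bureau's own procedure may differ.

```python
from math import sqrt


def difference_margin(moe_a: float, moe_b: float) -> float:
    """Margin of error for the difference of two independent estimates.

    Assumes both margins use the same confidence level, so they can be
    combined in quadrature (square root of the sum of squares).
    """
    return sqrt(moe_a ** 2 + moe_b ** 2)


# 2004 black turnout estimates and margins of error cited in the article,
# in percentage points.
ms_rate, ms_moe = 66.8, 5.2   # Mississippi
ma_rate, ma_moe = 46.5, 9.6   # Massachusetts

gap = ms_rate - ma_rate
gap_moe = difference_margin(ms_moe, ma_moe)

print(f"Gap in black turnout: {gap:.1f} points")
print(f"Margin of error on the gap: about {gap_moe:.1f} points")
print("Gap exceeds its margin of error:", gap > gap_moe)
```

Under those assumptions, the gap of roughly 20 points is about twice its combined margin of roughly 11 points, which is consistent with describing the 2004 difference as outside the margin of error.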

Furthermore, estimating the degree of uncertainty associated with a statistical estimate is not quite so straightforward as it might seem. There is no bright line at which a particular piece of statistical evidence goes from being meaningful to meaningless.

For example, in a poll of 1,000 people, a candidate who is ahead 52-48 has a 90 percent chance of holding the lead (assuming that there are no other sources of uncertainty apart from sampling error). A candidate who is up 53-47 has a 97 percent chance of holding the lead.

If one applies the conventional definition of the margin of error, which usually refers to a 95 percent confidence interval, then the second candidate’s lead would be described as being outside the margin of error while the first candidate’s would be within it. Nonetheless, the first candidate is still nine times more likely to lead his opponent than to trail him. Conversely, while we can be somewhat more confident about the second candidate’s lead, there is still some chance (3 percent) that he actually trails in the race and that the poll was an outlier. Statistical certainty exists along a continuum of probabilities and not in absolutes; I am therefore reluctant to endorse arguments that rely on semantic distinctions about how terms like “margin of error” or “statistical significance” are applied.
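The probabilities in the polling example can be approximated with a short calculation. The sketch below assumes a two-way race, a simple random sample and a normal approximation to the sampling distribution, with sampling error as the only source of uncertainty. The function name `prob_leading` is a hypothetical helper, not part of any polling library.

```python
from math import sqrt
from statistics import NormalDist


def prob_leading(share: float, n: int) -> float:
    """Approximate chance that a candidate polling at `share` truly leads.

    Assumes a two-way race and a simple random sample of size `n`, and
    treats the polled share as normally distributed around the true share.
    """
    standard_error = sqrt(share * (1 - share) / n)
    z = (share - 0.5) / standard_error  # distance from a tied race
    return NormalDist().cdf(z)


for share in (0.52, 0.53):
    p = prob_leading(share, 1000)
    odds = p / (1 - p)
    print(f"{share:.0%} in a poll of 1,000: P(lead) = {p:.2f}, "
          f"odds of leading = {odds:.0f} to 1")
```

The results come out at roughly 0.90 and 0.97, in line with the figures above. Nothing special happens to the probability at the point where a lead crosses the margin of error; it rises smoothly as the lead grows.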

Another problem is that sampling error refers to only one potential source of uncertainty in a poll. In surveys of voting behavior, for example, some voters give erroneous responses: lying about whether they voted, misremembering whether they did so or being uncertain about whether their ballot was ultimately counted. This measurement error is in addition to sampling error and will not be accounted for by the margin of error. Further errors can be introduced by the polling method: since some people are more likely to respond to surveys than others, the sample may be biased in some way rather than being truly random. Thus, the true degree of uncertainty in a polling result is usually larger than implied by the margin of error alone.

The debate might be more constructive if we return to the substantive questions that I posed earlier. First, are the voting rates in Massachusetts and Mississippi representative of a broader trend? If so, it seems wrong to suggest that Chief Justice Roberts misconstrued the data merely because he failed to mention the margin of error. But if Massachusetts and Mississippi are outliers, then the chief justice may be guilty as NPR contends. One might draw a parallel to last year’s election campaign, when some Web sites consistently highlighted polls that showed Mitt Romney performing well, ignoring the broader consensus of polls that had President Obama with the lead for most of the campaign in most swing states. Cherry-picking the evidence in this way is the greater statistical sin, in my view, since it involves making misleading rather than merely imprecise claims.

In fact, it would be dangerous to infer very much from Massachusetts and Mississippi. In 2004, for instance, while Mississippi was reported to have strong black turnout, black turnout was poor in Arizona and Virginia, which are also covered by Section 5.

In the chart below, I have aggregated the 2004 turnout data into two groups of states, based on whether or not they are covered by Section 5. (I ignore states like New York where some counties are subject to Section 5 but others are not.) In the states covered by Section 5, the black turnout rate was 59.2 percent in 2004, while it was 60.8 percent in the states that are not subject to it. The ratio of white-to-black turnout was 1.09 in the states covered by Section 5, but 1.12 in the states that are not covered by it. These differences are not large enough to be meaningful in either a statistical or a practical sense.

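[Chart: 2004 turnout rates for black and white voters, in states covered by Section 5 and in states not covered]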

So did Chief Justice Roberts misconstrue the data? If he meant to suggest that states covered by Section 5 consistently have better black turnout rates than those that aren’t covered by the statute, then his claim is especially dubious. However, the evidence does support the more modest claim that black turnout is no worse in states covered by Section 5. There don’t seem to be consistent differences in turnout rates based on whether states are covered by Section 5 or not.

The bigger potential flaw with Chief Justice Roberts’s argument is not with the statistics he cites but with the conclusion he draws from them.

Most of you will spot the logical fallacy in the following claim:

No aircraft departing from a United States airport has been hijacked since the Sept. 11 attacks, when stricter security standards were implemented. Therefore, the stricter security is unnecessary.

As much as I might want to be sympathetic to this claim (I fly a lot and am wary of the “security theater” at American airports), it ought not to be very convincing as a logical proposition. The lack of hijackings was in part a product of an environment in which airport security was quite strict, and says little about what would happen if these countermeasures were removed. The same data might just as easily be cited as evidence that the extra security had been effective:

No aircraft departing from a United States airport has been hijacked since the Sept. 11 attacks, when stricter security standards were implemented. Therefore, the stricter security is working.

Similarly, the fact that black turnout rates are now roughly as high in states covered by Section 5 as in those that are not might be taken as evidence that the Voting Rights Act has been effective. There were huge regional differences in black turnout rates in the early 1960s, before the Voting Rights Act was passed. (In the 1964 election, for example, nonwhite turnout was about 45 percent in the South, but close to 70 percent elsewhere in the country.) These differences have largely evaporated now.

How much of this is because of the Voting Rights Act, as opposed to other voter protections that have been adopted since that time, or other societal changes? And even if the Voting Rights Act has been important in facilitating the changes, how many of the gains might be lost if the Section 5 requirements were dropped now?

These are difficult questions that the Supreme Court faces. They are questions of causality – and as any good lawyer knows, establishing a chain of causality is often the most difficult chore in a case.

Statistical analysis can inform the answers if applied thoughtfully. But statistics can obscure the truth when they become divorced from the historical, legal and logical context of a case.